Jump to content

SQL Search Help


nylyon

Recommended Posts

Posted

Since 4.1 removed Sphinx, I need to get the SQL search working for my forum.  The primary item I need is the ability to search hyphenated words.  Here's what I have tried:

In /usr/share/mysql/charsets/ I edited latin1.xml (where character_set_server and character_set_database point to) and in the ctable entry changed 10 to 01 in the 3rd row 14th column (0x2D)

563e3e53f1dac_ScreenShot2015-11-07at1.07

I then restarted mysql and even apache, rebuilt the search index and still am unable to search for hyphenated words. Can anyone help me with what I may be doing wrong?

 

Posted

Without editing the character set, using my 2-character enabled MySql search:

D-4
Not found

D-23
Found

No quotes wrapped around the search terms needed.

The dash-enabled search stuff was asked for and tackled way way back in the alpha days a year ago. 

What are some examples of terms you are looking for?

Posted

Repeating my test from above. Made two new posts in my test install: C-160. 160.

Search for C-160 finds: Post with C-160, Post with 160.

Search for "C-160" finds: Just C-160.

Wrap the search term in quotes.

Posted

Hey, thank you for your help.  While it still doesn't function as I think it should, it gets me closer to where it's probably livable.  I had to change ft_min_word_len=3 for the searches to find anything.

I made 6 posts

one: C-160
two: D-160
three: C 160
four: D 160
five: 160
six: C160

Search for:

C-160: one, two, three, four, five 
"C-160": one, three <-- why did it find three?
"C 160": one, three <-- why did it find one?
-160: one, two, three, four, five
"-160": one, two, three, four, five
160: zero results <-- shouldn't that find them all?
"160": one, two, three, four, five <-- not sure why 160 gets zero and "160" finds them all?
c160: six

Sphinx saw the hyphen more as a character, something is still not seeing the hyphen as a character in SQL search I think.  

I can probably live with this, but would have hoped for a more precise search, losing Sphinx removes a more comprehensive search for me.

Posted

I'm not entirely sure, I'd have to poke the code to see how the hyphen is being accounted for. I suspect it is accounted for but also ignored (that's a horrible way of saying it but I just woke up :) )

ft_min_length? You may wish to convert your posts table to inno_db instead of MyIsam if not for anything else the stopwords list is much much smaller with innoDB..

https://dev.mysql.com/doc/refman/5.7/en/fulltext-fine-tuning.html

If it already is innoDB then you need the other var setting:

[mysqld]
innodb_ft_min_token_size=2
ft_min_word_len=2

Or change it to 3 if you like. The index size is bigger but not anywhere near as big as I thought it would be when set to two characters.

Let me enter your test entries and give it a whirl...

 

Posted

I'm diving a bit deeper into this now. Gonna wait to hear from you regarding whether you have inno or myisam set and/or configured right before I share what I just got results wise.

The  IPS search index looks pretty good though. Everything is there minus html elements. If IPS's goal was to drop Sphinx in order to make the default MySql search more robust they have an opportunity to do so. Might even be an opportunity for an app/plugin to bulk up the query routine on it's own.

Posted

For MyISAM you'd have to change it to 3 character no matter what as the default for that is 4 character (inno is three by default).

For my test install 160 or "160" finds everything. No idea why 160 is not working for you.

Additional FYI:

 

 

Posted

My production forum is running 4.0.13.1, I am biting the bullet and converting to SQL search.  Currently rebuilding the index.  It is as yours is as well, were searching for 160 is bringing up everything.  This is livable, but not optimal.

Posted

Make sure you have both ft_min_word_len and innodb_ft_min_token_size set to what you want for character searches.

The routine actually reads in the MyIsam ft_min_word_len to determine the min character search on the IPS side. If not found it defaults to 3.

So even if you have the search index as innodb, etc. etc... Like I said, rabbit hole...

Archived

This topic is now archived and is closed to further replies.

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...