November 7, 2015 in Technical Problems
Since 4.1 removed Sphinx, I need to get the SQL search working for my forum. The primary item I need is the ability to search hyphenated words. Here's what I have tried:
In /usr/share/mysql/charsets/ I edited latin1.xml (where character_set_server and character_set_database point to) and in the ctable entry changed 10 to 01 in the 3rd row 14th column (0x2D)
I then restarted mysql and even apache, rebuilt the search index and still am unable to search for hyphenated words. Can anyone help me with what I may be doing wrong?
Without editing the character set, using my 2-character enabled MySql search:
No quotes wrapped around the search terms needed.
The dash-enabled search stuff was asked for and tackled way way back in the alpha days a year ago.
What are some examples of terms you are looking for?
Searching for C-160. If I reduce the MySQL search to 3 characters it will find EVERY occurance of 160 not necessarily C-160.
Repeating my test from above. Made two new posts in my test install: C-160. 160.
Search for C-160 finds: Post with C-160, Post with 160.
Search for "C-160" finds: Just C-160.
Wrap the search term in quotes.
Hey, thank you for your help. While it still doesn't function as I think it should, it gets me closer to where it's probably livable. I had to change ft_min_word_len=3 for the searches to find anything.
I made 6 posts
three: C 160
four: D 160
C-160: one, two, three, four, five
"C-160": one, three <-- why did it find three?
"C 160": one, three <-- why did it find one?
-160: one, two, three, four, five
"-160": one, two, three, four, five
160: zero results <-- shouldn't that find them all?
"160": one, two, three, four, five <-- not sure why 160 gets zero and "160" finds them all?
Sphinx saw the hyphen more as a character, something is still not seeing the hyphen as a character in SQL search I think.
I can probably live with this, but would have hoped for a more precise search, losing Sphinx removes a more comprehensive search for me.
I'm not entirely sure, I'd have to poke the code to see how the hyphen is being accounted for. I suspect it is accounted for but also ignored (that's a horrible way of saying it but I just woke up )
ft_min_length? You may wish to convert your posts table to inno_db instead of MyIsam if not for anything else the stopwords list is much much smaller with innoDB..
If it already is innoDB then you need the other var setting:
Or change it to 3 if you like. The index size is bigger but not anywhere near as big as I thought it would be when set to two characters.
Let me enter your test entries and give it a whirl...
I'm diving a bit deeper into this now. Gonna wait to hear from you regarding whether you have inno or myisam set and/or configured right before I share what I just got results wise.
The IPS search index looks pretty good though. Everything is there minus html elements. If IPS's goal was to drop Sphinx in order to make the default MySql search more robust they have an opportunity to do so. Might even be an opportunity for an app/plugin to bulk up the query routine on it's own.
My test system is myisam the production has innodb for the post and topic tables.
These tests on your test install or production?
On the test, production is running Sphinx
For MyISAM you'd have to change it to 3 character no matter what as the default for that is 4 character (inno is three by default).
For my test install 160 or "160" finds everything. No idea why 160 is not working for you.
Hell of a rabbit hole you are sending me on. I'll leave it at that. :)
My production forum is running 188.8.131.52, I am biting the bullet and converting to SQL search. Currently rebuilding the index. It is as yours is as well, were searching for 160 is bringing up everything. This is livable, but not optimal.
Make sure you have both ft_min_word_len and innodb_ft_min_token_size set to what you want for character searches.
The routine actually reads in the MyIsam ft_min_word_len to determine the min character search on the IPS side. If not found it defaults to 3.
So even if you have the search index as innodb, etc. etc... Like I said, rabbit hole...
This topic is now archived and is closed to further replies.
Started Yesterday at 01:29 AM
Started October 4
Started Wednesday at 05:56 PM