
removing sphinx hurts...


Dmacleo


  • 2 weeks later...
  • Management

This is interesting, because we have deployed IPS4 on many, many enterprise sites - with millions of posts and have not had a single need for Sphinx. In fact, we haven't used Sphinx even before IPS4, literally in years. 

I want to stop short of saying you're doing it wrong and I will note that we still have plans for external search methods such as a cloudsearch platform - but this is more for feature integration. I think we need to better understand why you miss Sphinx because we have some pretty complex sites on our platform, it seems strange that all of yours would be even more so. Often, we assume we need something because it's what we've always used and what we know. 

Please help me understand your requirements. We understand the character limit -- that can be adjusted at the MySQL configuration level, if desired (though dropping to 2 is really not advised). Other than shaving a couple of milliseconds off, what is the use case for Sphinx now that MySQL 5.6 supports full-text indexes on InnoDB and does a fairly decent job in its own right?
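For readers unfamiliar with the character limit mentioned above: MySQL only indexes words at or above a minimum length, controlled by `innodb_ft_min_token_size` for InnoDB (default 3) and `ft_min_word_len` for MyISAM (default 4). Changing either requires a server restart and a rebuild of the full-text indexes. The sketch below is a toy model of that length filter, not MySQL's actual parser; the function name and sample text are made up for illustration.

```python
import re

def index_tokens(text, min_token_size=3):
    """Return the lowercase words a length-filtered index would keep.

    Toy model only: real full-text parsers also apply stopword lists
    and their own word-boundary rules.
    """
    words = re.findall(r"[A-Za-z0-9_]+", text.lower())
    return {w for w in words if len(w) >= min_token_size}

post = "Upgrading IPS4 to PHP 7 on an AC unit"

default_index = index_tokens(post)               # minimum length 3
short_index = index_tokens(post, min_token_size=2)

# A two-character term is only findable once the minimum is lowered,
# at the cost of indexing many more short, common words.
print("ac" in default_index, "ac" in short_index)
print(len(default_index), len(short_index))
```

The second print hints at why lowering the limit to 2 is discouraged: the index grows with short, low-value tokens.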


9 hours ago, Lindy said:

Please help me understand your requirements. We understand the character limit -- that can be adjusted at the MySQL configuration level, if desired (though dropping to 2 is really not advised). Other than shaving a couple of milliseconds off, what is the use case for Sphinx now that MySQL 5.6 supports full-text indexes on InnoDB and does a fairly decent job in its own right?

With 20 million posts, if you don't want to add super-restrictive search intervals or limit searching to, say, the last year of content, the built-in database search simply doesn't cut it for full-text searches in terms of performance. The "wrong searches" can consume a substantial amount of resources in a short amount of time.

I've had to cut down the amount of searchable content to 1 year (1.7 million posts) because of performance issues related to having the full index available.

It's a lot more than a few milliseconds when it comes to that amount of content.


It is also usually cheaper to put in Sphinx/Elasticsearch than to upgrade MySQL server instances, though it has been years since I ran MySQL-based search. When I moved to Sphinx, it resolved a few bizarre performance bottlenecks that would occur at times, and it allowed me to remove most of the time-based search restrictions (such as one search every 10 seconds).

Edited by ZeroHour
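The time-based search restriction mentioned above (one search per member every 10 seconds) amounts to a simple per-member flood control. A minimal sketch follows, assuming an in-memory store; the class and parameter names are hypothetical and not taken from IPS.

```python
import time

class SearchFloodControl:
    """Allow each member at most one search per `interval` seconds."""

    def __init__(self, interval=10.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self._last_search = {}      # member id -> time of last accepted search

    def allow(self, member_id):
        """Return True and record the search if the member may search now."""
        now = self.clock()
        last = self._last_search.get(member_id)
        if last is not None and now - last < self.interval:
            return False            # still inside the cool-down window
        self._last_search[member_id] = now
        return True

limiter = SearchFloodControl(interval=10.0)
print(limiter.allow(42))   # first search is accepted
print(limiter.allow(42))   # an immediate retry is rejected
```

Rejected attempts do not reset the timer here, so a member who keeps retrying is not locked out indefinitely.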

  • Management
5 hours ago, TSP said:

With 20 million posts, if you don't want to add super-restrictive search intervals or limit searching to, say, the last year of content, the built-in database search simply doesn't cut it for full-text searches in terms of performance. The "wrong searches" can consume a substantial amount of resources in a short amount of time.

I've had to cut down the amount of searchable content to 1 year (1.7 million posts) because of performance issues related to having the full index available.

It's a lot more than a few milliseconds when it comes to that amount of content.

Do you have 20 million posts, and have you used a recent version of MySQL? We have sites that do, and we have, and I can tell you it's a non-issue on any size of database we've thrown at it thus far.

So, saying you want Sphinx because you want to scale down and run lesser MySQL specs is quite a lot different than saying Sphinx is required for large sites or short search inquiries. I'm not dismissing the need/desire to have Sphinx over a powerful database server - but it's important to understand why you need Sphinx, or rather, why you think you do. 

If you haven't used IPS4 on MySQL 5.6 with InnoDB full-text indexes, I'd urge you to do so before jumping on the bring-back-Sphinx bandwagon. IPS4 introduced a search index, so gone are the days of doing intensive table scans across 15 tables. It's actually quite efficient, and if character limits are your concern, you can adjust those via your server config.

Alternate search is on the radar, mind you, but I just wanted to clarify - it's more likely to be a cloud-based search platform to bring forth cool features and not only to solve the issue of underpowered database servers. 


  • 5 weeks later...

Having said that, we upgraded from 3.4 to 4.1 in early February, and our final task (search reindex) is still running on our 10 million posts. While we're dealing with such large data sets from older IPS software, it would be great to have a tool that would bring to bear the full resources of the server to reindex, much like the UTF8 conversion. This task plodding away in the background, even in manual mode, is intensive enough to cause issues while users are posting, but not intensive enough to warrant taking the forum offline the entire time.

 

(We've tried running it manually in a browser window and via scheduled tasks. Neither speeds up the task appreciably enough to justify taking the forum offline.)
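The kind of tool asked for here, one that throws the whole server at the reindex, could be sketched as splitting the post ID range into batches and handing them to a pool of workers. This is only an illustration of the idea, not how the IPS4 rebuild task works; `id_batches`, `reindex_all`, and `fake_index_batch` are hypothetical names, and the callback stands in for the real "read posts, write index rows" step.

```python
from concurrent.futures import ThreadPoolExecutor

def id_batches(first_id, last_id, batch_size):
    """Yield inclusive (start, end) ID ranges covering first_id..last_id."""
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield (start, end)
        start = end + 1

def reindex_all(first_id, last_id, batch_size, index_batch, workers=4):
    """Run index_batch(start, end) over every range using a worker pool.

    Returns the total number of posts the callback reports as indexed.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda r: index_batch(*r),
                          id_batches(first_id, last_id, batch_size))
        return sum(counts)

def fake_index_batch(start, end):
    # Stand-in for reading posts start..end and writing them to the index.
    return end - start + 1

print(reindex_all(1, 10_000, 500, fake_index_batch, workers=8))
```

With the forum offline, `workers` could be raised until the database or disks become the bottleneck, which is the trade-off described above.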


What I really miss is stemming. When we ran a Sphinx server in 3.x we had stemming turned on, which greatly improved search result quality. We experimented with both a standard English morphology as well as Soundex and Metaphone. I'd really have preferred even better Sphinx integration because I would also like to weight results so that a title hit was better than a post content hit, etc. People are used to Google-quality results, and a fulltext search just doesn't give that. 
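To make the two ideas in this post concrete, here is a toy sketch of stemming (so that "searching" matches "searches") combined with field weighting (so that a title hit outranks a body hit). The suffix stripper is deliberately naive and is not the stemmer Sphinx uses; the weights and function names are assumptions for illustration.

```python
def naive_stem(word):
    """Strip a few common English suffixes. A toy, not a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stems(text):
    """Return the set of stems for the whitespace-separated words in text."""
    return {naive_stem(w) for w in text.split()}

def score(query, title, body, title_weight=3.0, body_weight=1.0):
    """Add a weight for each query stem found in the title and/or body."""
    title_stems, body_stems = stems(title), stems(body)
    total = 0.0
    for term in stems(query):
        if term in title_stems:
            total += title_weight
        if term in body_stems:
            total += body_weight
    return total

title_hit = score("searching", title="Slow searches", body="The forum is slow")
body_hit = score("searching", title="Forum is slow", body="I searched twice")
print(title_hit > body_hit)   # the title match ranks higher
```

Without the stemming step, neither document would match the query "searching" at all, which is the result-quality gap described above.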


@Lindy, for the alternate search feature you talk of, do you plan on making it like the login system where you can choose which handler to use? That approach would allow someone to write a Sphinx "plugin" if they wanted to use that instead of what IPS provides.

Using MySQL's fulltext indexes is certainly the easiest approach to take and I can understand why IPS made that choice since it works best for the majority of sites. But it would be nice for those of us who manage larger boards to have the flexibility of choosing other options.
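The handler approach suggested here could look roughly like the sketch below: a common interface, a registry of named handlers, and a lookup by configured name. This is a hypothetical design, not the IPS4 login or search API; every name in it is invented for illustration, and the in-memory handler merely stands in for a real backend.

```python
from abc import ABC, abstractmethod

class SearchHandler(ABC):
    """Interface every search backend would implement."""

    @abstractmethod
    def index(self, doc_id, text):
        """Add or replace a document in the index."""

    @abstractmethod
    def search(self, query):
        """Return a sorted list of matching document IDs."""

HANDLERS = {}  # handler name -> handler class

def register(name):
    """Class decorator that makes a handler selectable by name."""
    def decorator(cls):
        HANDLERS[name] = cls
        return cls
    return decorator

@register("builtin")
class InMemoryHandler(SearchHandler):
    """Trivial substring matcher standing in for the default backend."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, text):
        self._docs[doc_id] = text.lower()

    def search(self, query):
        q = query.lower()
        return sorted(i for i, t in self._docs.items() if q in t)

def get_handler(name):
    """Instantiate the handler chosen in (hypothetical) site settings."""
    return HANDLERS[name]()

handler = get_handler("builtin")
handler.index(1, "Sphinx search")
handler.index(2, "MySQL fulltext")
print(handler.search("sphinx"))
```

A third-party Sphinx or Elasticsearch plugin would then only need to subclass `SearchHandler` and register itself under a new name.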


  • Management

I can say we have no immediate plans for Sphinx support. There are far better solutions out there, and anything we do will likely be cloud based. I believe we initially suggested this could be done by third parties, but because of the complexity and tight integration of content discovery (which is "search", activity streaming, etc.) this would be very difficult if not impossible for a third-party author to do unless we broke things apart... there are no plans for that at this time.

Cloud based search features are on the horizon - I don't have an ETA, but it's something we really want to address. We recognize the need for more Google-esque quality results and features... we just feel Sphinx is a short-sighted "solution" that's not going to gain you much more than a few ms quicker results. Think bigger! ^_^


9 hours ago, Lindy said:

----

Any ETA on a search for Commerce? At the moment there is no way to search for products, and I am losing money because customers want to find what they want by typing in a word, not by checking each category. I have had to change the categories so customers can find the products more easily, but that will only work for so long. In the next 3 weeks I will be adding 30+ new products, and I can just see people giving up looking for the version they want.

I got told 6 months ago that you guys are working on it. 6 months on and not a single word about it.

PLEASE tell me a search is coming soon for it?


On 14/04/2016 at 9:39 PM, Lindy said:

I can say we have no immediate plans for Sphinx support. There are far better solutions out there, and anything we do will likely be cloud based. I believe we initially suggested this could be done by third parties, but because of the complexity and tight integration of content discovery (which is "search", activity streaming, etc.) this would be very difficult if not impossible for a third-party author to do unless we broke things apart... there are no plans for that at this time.

Cloud based search features are on the horizon - I don't have an ETA, but it's something we really want to address. We recognize the need for more Google-esque quality results and features... we just feel Sphinx is a short-sighted "solution" that's not going to gain you much more than a few ms quicker results. Think bigger! ^_^

But won't that cloud-based solution cost us money? Money that will not even go to Invision?

What I don't understand is that there are free solutions out there, yet the only option would be a cloud-based one that is not even cheap. And Invision will promote that?

This is almost the same as Invision advising us to use Amazon Dedicated Servers instead of its own Cloud Packages because they are better.


On 4/5/2016 at 4:06 PM, rllmukforum said:

Having said that, we upgraded from 3.4 to 4.1 in early February, and our final task (search reindex) is still running on our 10 million posts.

That seems ridiculous. I'd like to know if this is typical for large databases. I have not yet upgraded to 4.x, and I am running Sphinx. I also have two-character searches on my forum, which work great with Sphinx. I don't want to remove this if I can avoid it.

This thread is making me twitch. 


On 4/17/2016 at 9:42 AM, AtariAge said:

That seems ridiculous. I'd like to know if this is typical for large databases. I have not yet upgraded to 4.x, and I am running Sphinx. I also have two-character searches on my forum, which work great with Sphinx. I don't want to remove this if I can avoid it.

This thread is making me twitch. 

 

As we're moving hardware, we temporarily moved to cloud hosting with double the cores and evidently some sort of serious SAN setup. It crunched the last 25% of the search reindex in about a day. It's unfortunate, because our previous hardware used to run 3.x without breaking a sweat. For large data sets, if you don't have serious hardware, it does appear to be worth ponying up for some cloud compute during the transition.


  • Management
On April 15, 2016 at 1:51 AM, MarcusH said:

Any ETA on a search for Commerce? At the moment there is no way to search for products, and I am losing money because customers want to find what they want by typing in a word, not by checking each category. I have had to change the categories so customers can find the products more easily, but that will only work for so long. In the next 3 weeks I will be adding 30+ new products, and I can just see people giving up looking for the version they want.

I got told 6 months ago that you guys are working on it. 6 months on and not a single word about it.

PLEASE tell me a search is coming soon for it?

It is coming. I'll check on ETA. 

On April 17, 2016 at 7:50 AM, RevengeFNF said:

But won't that cloud-based solution cost us money? Money that will not even go to Invision?

What I don't understand is that there are free solutions out there, yet the only option would be a cloud-based one that is not even cheap. And Invision will promote that?

This is almost the same as Invision advising us to use Amazon Dedicated Servers instead of its own Cloud Packages because they are better.

Just because Sphinx is free doesn't make it a good option for IPS4. There are better options -- Elasticsearch, Solr, etc. If we were to do anything other than something cloud based, it would be one of those. The big advantage of Sphinx back in the day was delivering results a bit faster than a beefy MySQL server. The overall relevancy and quality of results aren't significantly better than stock MySQL.

 


15 hours ago, Lindy said:

Just because Sphinx is free doesn't make it a good option for IPS4. There are better options -- Elasticsearch, Solr, etc. If we were to do anything other than something cloud based, it would be one of those. The big advantage of Sphinx back in the day was delivering results a bit faster than a beefy MySQL server. The overall relevancy and quality of results aren't significantly better than stock MySQL.

 

I don't need it to be Sphinx, it can be Elasticsearch or another free solution. I just hope the paid cloud based search will not be the only solution.


  • 2 months later...