Weak search of IPB


Guest Stepashka

Posted

The search works fine for me... and many others I assume.

It might help if you elaborate on exactly what is the problem and possibly provide an example. Certainly if every single member of your forum is unhappy with the search function, then perhaps there is a deeper problem, possibly server related?

Posted

No! My server works perfectly.

First of all, the search creates an extremely heavy load! If you have more than 2 million posts, you probably know that issue already.
Second it
  • Management
Posted

You may not be aware that an inherent limitation of MySQL is that it enforces a minimum length (usually 3 or 4 characters) for search terms. So even if you edit the IPB setting to 1, MySQL will override this. There is nothing we can do about it.

All IPB does is say to MySQL "please search for this" and MySQL returns results - if that is server intensive or does not work there is nothing we can do about that really.

That said, we provide alternate search systems (like Sphinx) which are designed to overcome these limitations in MySQL which you may wish to look into.
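
To illustrate the minimum-length limitation described above, here is a minimal Python sketch. It is not IPB or MySQL code: the function name `split_by_min_length` and the default of 4 are assumptions chosen for illustration, and the real limit depends on how MySQL is configured on the server.

```python
def split_by_min_length(query, min_len=4):
    """Split a query into terms that would be searched and terms that would be skipped.

    min_len is a hypothetical stand-in for the server-side minimum word length.
    """
    kept, skipped = [], []
    for term in query.split():
        # Terms shorter than the minimum never reach the index lookup.
        (kept if len(term) >= min_len else skipped).append(term)
    return kept, skipped


kept, skipped = split_by_min_length("css and html guide")
print("searched:", kept)
print("ignored:", skipped)
```

With a minimum of 4, the three-letter terms in this query are silently ignored, which is why lowering the setting in IPB alone has no effect.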

Posted

Will Sphinx be integrated into IPB3?
  • Management
Posted

Support for Sphinx is included in IPB3. It will require that you have Sphinx configured on your server, of course, but if you do, IPB will be able to use it.

Posted

I find search works better when you disable the full text search. When it's enabled searches often don't reveal any results even though you know there are some results.

3DKiwi

  • Management
Posted

Disabling fulltext search is a very bad idea. So bad, in fact, that we are removing the option to disable it in IPB3. While non-fulltext searching may occasionally return more results than fulltext searching, it is also extremely intensive on the server.

Posted

We have not announced a final release date as we are still in the beta phase.



Sphinx seems to work very fast now. but still if the search value contains "-" or "+" or
Posted

I think I know the reason, at least partially, for Stepashka's concerns over the high server loads. IPS has told me many times, through support tickets, that IPB doesn't cause excessive server loads, which is completely false. I noticed a huge leap in server load spikes on the site, and my webhost provider told me that they were caused by the IPB forum software, specifically by the database.

I first noticed these huge spikes when my webhost brought them to my attention, right after I had upgraded to IPB 2.2. I thought it was peculiar because no matter what I did, even with IPS increasing the amount of memory configured for IPB, these server spikes kept occurring.

I'm also not convinced that this problem has been solved with IPB 3. For one thing, server loads aren't reported with IPB 3 despite having that option enabled. I think it may have something to do with something I read on someone else's blog: that IPB 2 just had a lot of messy CSS code (their words, not mine). Despite many attempts to fix this, I still get server spikes of over 20, which is excessive for IPB software. With very few mods installed, I've found that this is a major problem that IPB hasn't been able to overcome.

It's not the fault of my webhost, because the server my site is on has 2 CPUs (dual core) and plenty of memory ... it has to do with the scripts that run IPB.

Posted

In my experience, IPB is no more server heavy than any other comparable script. As well as a software company, we are also a hosting company, and something we offer is hosted IPB installations. We have servers with many, many installations on them, and they run smoothly.
If the server load is way too high in your or your host's opinion, I would encourage you to open a ticket for us to look into the cause. I can't imagine a technician dismissing you with "IPB doesn't cause high server loads" as you suggested, and we would be more than happy to look into your problems and discuss the matter further.
Posted

Well, the response I got was that it wasn't being caused by IPB, and my webhost said it was due to the scripts that IPB uses. I'll post a ticket, but I've inquired about that twice before and received the same response.

Posted

It's a lot to do with context and situation. If you have 1000 members at once trying to search on a forum with 5 million posts from a shared hosting account, then yes, you are going to have problems, and no amount of performance tweaking is going to solve that. I don't know your specific situation nor am I a technician, but there are natural limitations based on the size of your forum and the hardware you are running on, so it may be that you're overstretching what's possible in your environment. That's something a tech would be able to discuss with you :)

Posted

Try Google Custom Search for your site. It will allow people to search the entire site including the forums. You can limit results to exclude areas such as the lofi version, etc.



http://www.google.com/coop/cse/
But Google also searches through everything on the pages: not just the content of the posts themselves but also the member info, etc. IPB should have a good search function and not just say "use Google", because I don't like to search forums with Google. I would also like to have the ability to show the results as a topic overview, as before.


@Stepashka: Why did I do what?
Posted

IPB should have a good search function and not just say "use Google".



We do the very best we can with what we have. Unfortunately it's unrealistic to expect someone running on a shared hosting account to have a search that doesn't kill the server and yet returns results like Google or other real search engines do. We run on MySQL and have customers on a massive variety of hosting setups, so we're limited in what we can do. That's why we support things like Sphinx integration, or MSSQL, for those customers that do have the flexibility to manage their own hardware and need better performance than can be provided by default.
Posted

Well, I can tell you that I don't have quite that many. Only about 115,000 posts. Sometimes I see spikes above 10 reaching as high as 35 to 40.

Posted

We do the very best we can with what we have. Unfortunately it's unrealistic to expect someone running on a shared hosting account to have a search that doesn't kill the server and yet returns results like Google or other real search engines do. We run on MySQL and have customers on a massive variety of hosting setups, so we're limited in what we can do. That's why we support things like Sphinx integration, or MSSQL, for those customers that do have the flexibility to manage their own hardware and need better performance than can be provided by default.



Can you answer me, please?

Sphinx seems to work very fast now. But if the search value contains "-", "+", "/", etc., even just one such character, it does not search and shows an error. I think it would be better to add some code that ignores these characters in every search!



How can I do it on my IPB 2.3?
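
The kind of pre-filter asked about above could look roughly like the following Python sketch. This is not IPB or Sphinx code: the name `sanitize_query` and the set of characters treated as operators are assumptions based only on the examples given in the post ("-", "+", "/").

```python
import re

# Characters treated here as search operators. This set is an assumption
# taken from the examples in the post; a real deployment may need more.
OPERATOR_CHARS = re.compile(r"[-+/]")


def sanitize_query(query):
    """Replace operator characters with spaces and collapse extra whitespace."""
    cleaned = OPERATOR_CHARS.sub(" ", query)
    return " ".join(cleaned.split())


print(sanitize_query("html + css-guide / tips"))
```

A query made up only of such characters becomes an empty string, so the caller would still need to handle that case instead of sending an empty search.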
Posted

Well, I can tell you that I don't have quite that many. Only about 115,000 posts. Sometimes I see spikes above 10 reaching as high as 35 to 40.



I will speak from experience (I run forums with 5GB post tables): the main hit when doing a search is disk I/O. Increasing the CPU power on my server had limited success in reducing load; however, when I went to a nice RAID configuration with 15k RPM SCSI drives, the loads dropped tons. If you think about it, this makes sense for large post tables (they have to be scanned, even with indexes), and faster disks lower wait time and table locking.

Obviously this is not an option for many people cost-wise (we are talking several hundred $$ a month), so if you are using a shared host, inquire about their HD configurations. I have seen hosts put several hundred people on a box with a single hard drive (a 7200 RPM drive, which is slow for webhosting). Look for a shared host that has a RAID configuration with striping, beyond just mirroring for redundancy. Look also for hosts that use SAS (SCSI) drives of the 15k RPM 2.5" flavor. They are really not that much more expensive, so some reputable hosts should use them. Avoiding the "cheapy" houses that use low-end components will save you headaches down the road anyhow.

Also, for Google CSE you can prevent indexing (and returned results) of member pages and many other items. You do this by placing a filter in the Google CSE setup to exclude, for example, "http://forums.invisionpower.com/index.php?showuser=*". It works better than you might think if you give it a shot. I do not claim that it is more versatile for finding specific information in the forums; however, if you offer it to your users as the "main search", they will use it more often, which lowers your loads. Offer the built-in search as an "advanced forums search".
Posted

Could you also tell me why I cannot find the phrase "4.70" on my forum? There are certainly a few topics with that name, and it's common in topics: it's the model name of a tractor produced by the DF company.
We are using Sphinx with a 3-character minimum string length.

Posted

Isn't that just an extremely bad idea?



So that means that if I make a thread called "html and css-guide" and a user searches for "html guide", he will not find any results; he will need to search for 'html' or 'guide', which can also cause him to find a lot of irrelevant matches.




And get back the option to show results as topics please...


While we can improve upon the search itself, you have to understand what you are working with. There are inherent limitations in MySQL that we simply cannot overcome. How can we determine what words to "strip"? How are we to know that we should strip "and" and "css-" from your search? Should we create some master database of words to strip before searching? There are very real and difficult challenges to searching using MySQL and it's not as simple as "make it better".

As for results as topics, we are discussing what to do about search. There are still some things that need to be cleaned up.


Try Google Custom Search for your site. It will allow people to search the entire site including the forums. You can limit results to exclude areas such as the lofi version, etc.



http://www.google.com/coop/cse/


While Google Search might be nice as an option, it cannot be the only option in some cases. For instance, naturally Google cannot index your private forums, so you'd be unable to find topics from those private forums in search results. However, for many larger/corporate sites, that may be a tradeoff worth taking to offload search load to Google.

The good thing is with the system in place we can add new search methods easily, and quite likely will continue to do so. Lucene is another popular searching system that works great from what I hear, and we may look into adding support for it in the future.

Well, I can tell you that I don't have quite that many. Only about 115,000 posts. Sometimes I see spikes above 10 reaching as high as 35 to 40.



Please understand that numbers themselves are only an indicator. Do you know what was happening at the time? How many people were online? What were they doing? Were there any locked queries? What kind of server is this - shared? If so, what was happening on all the other sites on the server at the time? What time was it - perhaps backups or a cron was running that might have locked a table IPB uses?

There are simply too many variables for us to just say "oh, well, this is your problem right here" without taking a look. If you have issues, submit a ticket. If you aren't happy with the response provided, push the issue and escalate to management if the need arises.
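
As a rough way to put the load figures mentioned in this thread into context, the following Python sketch divides a load average by the number of CPU cores. The helper name `load_per_core` is hypothetical, and the core count of 4 is an assumption (one reading of "2 CPUs (dual core)"); a load average only roughly counts the processes running or waiting, so the result is an indicator, not a diagnosis.

```python
def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


# Spike values quoted in the thread; the 4 cores are an assumption.
for load in (20, 40):
    print(load, "->", load_per_core(load, 4))
```

A value well above 1 per core suggests that work is queuing up, but it does not say whether the cause is CPU, disk, or locked tables.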


Could you also tell me why I cannot find the phrase "4.70" on my forum? There are certainly a few topics with that name, and it's common in topics: it's the model name of a tractor produced by the DF company.


We are using Sphinx with a 3-character minimum string length.



Because Sphinx considers "." a word boundary, it searches for 4 and 70 separately, both of which are below the 3-character limit. You need to configure Sphinx specially if this is a concern for you.

http://www.sphinxsearch.com/docs/
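
The effect described above can be reproduced with a small Python sketch. This only simulates the behaviour and is not Sphinx's actual tokenizer: the `tokenize` function and its separator rule are assumptions for illustration. In Sphinx itself, index settings such as `charset_table` and `min_word_len` are the place to look in the documentation linked above.

```python
import re


def tokenize(query, min_len=3):
    """Split on non-alphanumeric characters, then drop tokens shorter than min_len."""
    tokens = [t for t in re.split(r"[^0-9A-Za-z]+", query) if t]
    return [t for t in tokens if len(t) >= min_len]


print(tokenize("4.70"))          # "4" and "70" are both too short
print(tokenize("tractor 4.70"))  # only "tractor" survives
```

Under these assumptions, "4.70" leaves nothing to search for, which matches the empty results reported in the question.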

Archived

This topic is now archived and is closed to further replies.
