Jump to content

Featured Replies

Yes, petalbot isn't the best and as you can see we're already blocking it. But, I can show you an example of bingbot doing virtually the same, cycling through the pages  profile/[user]/content/page for dozens of profiles at a time overnight last night .  Maybe not as hardcore rate wise as petal bot, but it's still sizeable when you take into the account that the SQL being used is cumbersome. 

The point is that the search bots do crawl at the rate I've described (even if you ignore the likes of petalbot), and do cycle through the pages in the profile. Eg - I'm not making this stuff up!

 

  • Management
  On 5/29/2024 at 12:32 PM, Dll said:

Yes, petalbot isn't the best and as you can see we're already blocking it. But, I can show you an example of bingbot doing virtually the same, cycling through the pages  profile/[user]/content/page for dozens of profiles at a time overnight last night .  Maybe not as hardcore rate wise as petal bot, but it's still sizeable when you take into the account that the SQL being used is cumbersome. 

The point is that the search bots do crawl at the rate I've described (even if you ignore the likes of petalbot), and do cycle through the pages in the profile. Eg - I'm not making this stuff up!

 

I know, and you have valid points but unruly bots hammering the site is really a server/network issue. CDN caching for guests and using a WAF will help fix that.

In terms of the original post though, I don't think that is the cause. I have moved it to a ticket so I hope to learn some more from it.

  On 5/29/2024 at 1:38 PM, Matt said:

I know, and you have valid points but unruly bots hammering the site is really a server/network issue. CDN caching for guests and using a WAF will help fix that.

 

I agree, to an extent - I'm not sure there's a cdn cache policy that can be especially helpful for profiles and multiple pages within them, as they're relatively low human traffic, with lots of unique pages which won't necessarily be visited particularly frequently beyond when bots are hitting them. But, absolutely, we're using our WAF to mitigate everything fine. I'm not sure that should be used instead of well-written, efficient SQL queries though, and I'm pretty sure there's scope to improve those in this instance. 

 

Edited by Dll

  • Management
  On 5/29/2024 at 2:22 PM, Dll said:

I'm not sure that should be used instead of well-written, efficient SQL queries though, and I'm pretty sure there's scope to improve those in this instance.

The search area is very complex with a lot going on. We have plans to improve it post 5.0.0.

  • Management
  On 5/29/2024 at 11:56 AM, Westfield Sports Car Club said:

Removing FORCE INDEX makes a massive difference!

I ran the full query manually and it just hung for over 5 mins, then without FORCE INDEX it ran in an instant - with only a few results though.

EXPLAIN output attached as CSV

forum_WSCC Forum_29-05-2024@12-34.csvUnavailable

Give this plugin a go (it's unsupported, etc, etc if anything goes wrong, just disable it).

Search_ no force index() 1.0.0.xmlUnavailable

Just to follow up - the patch appears to be doing its job perfectly. Thanks for the excellent support 👍

  • Management

Good to know. I think I'll adjust the code to not use that index if it's MySQL 8. That should keep it efficient for MySQL 5, but not punish MySQL 8.

  • Author
  On 5/29/2024 at 10:55 AM, Matt said:

Do you have topics with thousands of pages of posts? We did remove an old optimisation that helped with older versions of MySQL but caused issues with. MySQL 8.

We have some. 

I've provided more details in a reply I just made to the ticket. 

Recently Browsing 0

  • No registered users viewing this page.