Jump to content

Recommended Posts

Posted

Yes, petalbot isn't the best and as you can see we're already blocking it. But, I can show you an example of bingbot doing virtually the same, cycling through the pages  profile/[user]/content/page for dozens of profiles at a time overnight last night .  Maybe not as hardcore rate wise as petal bot, but it's still sizeable when you take into the account that the SQL being used is cumbersome. 

The point is that the search bots do crawl at the rate I've described (even if you ignore the likes of petalbot), and do cycle through the pages in the profile. Eg - I'm not making this stuff up!

 

  • Management
Posted
1 hour ago, Dll said:

Yes, petalbot isn't the best and as you can see we're already blocking it. But, I can show you an example of bingbot doing virtually the same, cycling through the pages  profile/[user]/content/page for dozens of profiles at a time overnight last night .  Maybe not as hardcore rate wise as petal bot, but it's still sizeable when you take into the account that the SQL being used is cumbersome. 

The point is that the search bots do crawl at the rate I've described (even if you ignore the likes of petalbot), and do cycle through the pages in the profile. Eg - I'm not making this stuff up!

 

I know, and you have valid points but unruly bots hammering the site is really a server/network issue. CDN caching for guests and using a WAF will help fix that.

In terms of the original post though, I don't think that is the cause. I have moved it to a ticket so I hope to learn some more from it.

Posted (edited)
44 minutes ago, Matt said:

I know, and you have valid points but unruly bots hammering the site is really a server/network issue. CDN caching for guests and using a WAF will help fix that.

 

I agree, to an extent - I'm not sure there's a cdn cache policy that can be especially helpful for profiles and multiple pages within them, as they're relatively low human traffic, with lots of unique pages which won't necessarily be visited particularly frequently beyond when bots are hitting them. But, absolutely, we're using our WAF to mitigate everything fine. I'm not sure that should be used instead of well-written, efficient SQL queries though, and I'm pretty sure there's scope to improve those in this instance. 

 

Edited by Dll
  • Management
Posted
7 minutes ago, Dll said:

I'm not sure that should be used instead of well-written, efficient SQL queries though, and I'm pretty sure there's scope to improve those in this instance.

The search area is very complex with a lot going on. We have plans to improve it post 5.0.0.

  • Management
Posted
2 hours ago, Westfield Sports Car Club said:

Removing FORCE INDEX makes a massive difference!

I ran the full query manually and it just hung for over 5 mins, then without FORCE INDEX it ran in an instant - with only a few results though.

EXPLAIN output attached as CSV

forum_WSCC Forum_29-05-2024@12-34.csv 498 B · 4 downloads

Give this plugin a go (it's unsupported, etc, etc if anything goes wrong, just disable it).

Search_ no force index() 1.0.0.xml

  • Management
Posted

Good to know. I think I'll adjust the code to not use that index if it's MySQL 8. That should keep it efficient for MySQL 5, but not punish MySQL 8.

Posted
On 5/29/2024 at 12:55 PM, Matt said:

Do you have topics with thousands of pages of posts? We did remove an old optimisation that helped with older versions of MySQL but caused issues with. MySQL 8.

We have some. 

I've provided more details in a reply I just made to the ticket. 

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...