loccom Posted February 5, 2019 I have started getting loads of notifications from Google about 429 error codes - thousands of errors all of a sudden, and mainly on tag pages. Any idea what is causing this and how to fix it? See attached.
loccom (Author) Posted February 5, 2019 Also noticed this page has <meta name="robots" content="noindex"> - strange that Google is even bothering with it.
Nathan Explosion Posted February 5, 2019 9 minutes ago, loccom said: Any idea what is causing this A 429 response is 'Too Many Requests' - in other words, the client machine has sent too many requests in a given period of time and is being rate limited.
loccom (Author) Posted February 5, 2019 Seems strange that this only happens on a tag page when the page instructs Google not to index it. I am looking into the rate limit for this; perhaps I need to slow the crawl down slightly?
Ryan Ashbrook Posted February 5, 2019 2 hours ago, loccom said: Seems strange that this only happens on a tag page when the page instructs Google not to index it. I am looking into the rate limit for this; perhaps I need to slow the crawl down slightly? Simply having a noindex on the page doesn't necessarily mean they won't crawl it - it just means they won't store it in their index. https://webmasters.stackexchange.com/a/100831 The 429 response is likely coming from the Search Flood control, which is hard coded to 30 seconds for spiders.
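Roughly speaking, a flood control like that boils down to a timestamp check. Here's a simplified PHP sketch for illustration only - the names and the session storage are my own assumptions, not the suite's actual code:

<?php
// Simplified illustration of a search flood control - not Invision Community's actual code.
session_start();

$window = 30; // seconds a spider must wait between searches
$last   = (int) ( $_SESSION['lastSearch'] ?? 0 );

if ( time() - $last < $window ) {
    http_response_code( 429 ); // Too Many Requests
    header( 'Retry-After: ' . ( $window - ( time() - $last ) ) );
    exit;
}

$_SESSION['lastSearch'] = time();
// ...run the search and render the tag page as normal...

A real implementation would key the timestamp off the IP address or user agent rather than a session, since most spiders don't carry session cookies, but the effect is the same: requests arriving faster than the window get a 429.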
loccom (Author) Posted February 5, 2019 @Ryan Ashbrook My guest group has a 20-second search flood control. You mention it's hard coded for spiders - does that mean search engines are treated differently to guests?
Ryan Ashbrook Posted February 5, 2019 1 minute ago, loccom said: @Ryan Ashbrook My guest group has a 20-second search flood control. You mention it's hard coded for spiders - does that mean search engines are treated differently to guests? Technically, yes - if you truly want to override this, though, you can do so using constants.php with the following: define( 'BOT_SEARCH_FLOOD_SECONDS', 30 ); Changing the thirty to whatever you wish. I should note, though, that some bots may crawl at an alarming rate (we've had to block some, like Ahrefs, entirely on Community in the Cloud), so you may see an uptick in server resource usage if you decrease this.
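For completeness, constants.php lives in the root directory of your installation. A minimal file would look like this - the value here is just an example; lower numbers loosen the limit but increase crawl load:

<?php

// constants.php in the root of your Invision Community installation.
// Example value only - allow spiders one search every 10 seconds instead of 30.
define( 'BOT_SEARCH_FLOOD_SECONDS', 10 );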
loccom (Author) Posted February 5, 2019 @Ryan Ashbrook Do you know when this was introduced? When they visit the tag section, is that based on a search - does it use the search function? Would it be wiser to not allow them into the tag section at all? The sitemap does all that work, after all. Thanks.
Ryan Ashbrook Posted February 5, 2019 Yes, it is search based and uses the search system to try and be as efficient as possible. Blocking them from seeing the tag pages is ultimately up to you, honestly. It's one of those things that you should adjust based on your site's needs.
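If you do decide to block them, robots.txt is the usual route. Assuming your tag pages live under /tags/ (check your own URL structure first - this path is just an example), something like:

User-agent: *
Disallow: /tags/

Note the difference from the noindex meta tag mentioned above: robots.txt stops crawling outright, while noindex only stops pages from being stored in the index - which is why Google was still hitting those pages in the first place.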
loccom (Author) Posted February 5, 2019 @Ryan Ashbrook Sorry for all the questions. If I turn off tags for the guest group, that would stop the tags appearing for guests and bots. Do tag pages get listed within the sitemap? Trying not to end up with a mountain of 403s. Edit: I tried this and the tags still appear.