sadams101 Posted August 17, 2018 I am seeing a huge increase in errors from my site's tag pages. The errors are 429s, and in Webmaster Tools it says: Quote Googlebot couldn't access the contents of this URL because the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request. I believe these are related to the flood control setting in the admin area, so Google's bot is being blocked from running the tag searches. My question is this--the links listed all actually work; I tested them. Since such a spike in errors could have a negative impact on Google search, what is the best way to handle this? A "nofollow" on the tags? Any idea how to do that?
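To be clear, what I have in mind is adding nofollow to the rel attribute of each tag link, something roughly like this (the URL and markup here are just illustrative, not the real template):

<a href="https://www.example.com/tags/example-tag/" class='ipsTag' rel="tag nofollow">example tag</a>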
sadams101 (Author) Posted August 17, 2018 PS - Another option is robots.txt with Disallow: /tags/, but do we really want to keep these out of the index? It seems like this would be good search engine content. A better option might be rel="nofollow" or rel="nofollow noindex", but that would likely have the same effect--the links would be excluded from the index.
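To be concrete, the robots.txt rule I'm considering would be something like this (assuming all of the tag URLs live under /tags/):

User-agent: *
Disallow: /tags/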
sadams101 (Author) Posted August 22, 2018 I thought of another possible approach, but I doubt this is allowed in robots.txt. Could you put a crawl-delay on a specific directory? For example:

Allow: /tags/
Crawl-delay: 60

Ever hear of this?
bfarber Posted August 23, 2018 No, those options won't work. You can either prevent bots from crawling the content at all (via robots.txt), slow down their crawling globally (you do this in Webmaster Tools, not via a crawl-delay directive), accept that Google will get some 429 responses, which is okay (it doesn't hurt SEO, it just limits how quickly they can crawl those pages), or remove the search flood control.
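For context, a 429 from the search flood control is just a standard "Too Many Requests" HTTP response, along these lines (the exact headers, such as whether a Retry-After is sent, will vary by setup):

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: text/html;charset=UTF-8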
Apfelstrudel Posted August 24, 2018 At the moment, tag pages are not indexed by default. The IPB default setting is "noindex". Just a hint.
sadams101 (Author) Posted August 24, 2018 I am using the latest version of IPB, and in the Pages app my tag links do not include noindex:

<a href="https://www.url.com" class='ipsTag' title="Find other content tagged with 'tag text'" rel="tag">

I also don't see this as a setting option in the ACP.
sadams101 (Author) Posted August 24, 2018 So I found the tag template:

<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag">

and changed it to:

<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag noindex">

I guess this would be the best solution to avoid having spiders potentially running many searches per minute/second, and to stop the Google errors.
sadams101 (Author) Posted August 26, 2018 I wanted to make a correction here--"noindex" is a directive for the robots meta tag, not a valid value for a link's rel attribute, so I ended up with a simple Disallow: /tags/ in my robots.txt. In a perfect world I could simply turn off the flood control, but then Google and other spiders would likely be running these searches constantly, which would affect performance.
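In other words, noindex belongs in a page-level meta tag (or an X-Robots-Tag response header), not on individual links, e.g. something like:

<meta name="robots" content="noindex, follow">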
Apfelstrudel Posted August 27, 2018 As I mentioned above, all tag pages are already marked with the "noindex" robots meta tag. Just open a tag page and check out the source code. But this meta tag has nothing to do with how often Google visits those pages.
sadams101 (Author) Posted August 30, 2018 I did look for the "noindex" attribute, and on my page I do not see it. In any case, noindex may still, at least to Google, mean it's ok to crawl, just not ok to index. If it really is noindex, then people should still add the block in robots.txt to stop Google from crawling it.