
Big increase in 429 Google errors for tag searches


sadams101

Recommended Posts

Posted

I am seeing a huge increase in errors from my site's tag pages. The errors are 429s, and in Webmaster Tools it says:

Quote

Googlebot couldn't access the contents of this URL because the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request.

I believe these are related to the flood control setting in the admin area, so Google's bot is being blocked from running the searches by the flood control setting.

My question is this: the links listed all actually work; I tested them. Since such a spike in errors could have a negative impact on Google search, what is the best way to handle this? A "nofollow" on the tags? Any idea how to do that?

Posted

PS - Another option is robots.txt with
Disallow: /tags/
but do we really want these kept out of the index? It seems like this would be good search engine content.
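For reference, the robots.txt version would look something like this (just a sketch, assuming all the tag search URLs resolve under /tags/ and the rule should apply to every crawler):

User-agent: *
Disallow: /tags/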

A better option might be rel="nofollow" or rel="nofollow noindex", but that would likely have the same effect: the links would be excluded from the index.

Posted

I thought of another possible approach, but I doubt this is allowed in the robots.txt. Could you put a crawl-delay on a specific directory, for example:

Allow: /tags/
Crawl-delay: 60

Ever hear of this?

Posted

No, those options won't work. You can either prevent bots from crawling the content at all (via robots.txt), slow down their crawling globally (you do this in Webmaster Tools, not via a Crawl-delay directive), accept that Google will get some 429 responses, which is fine (it doesn't hurt SEO, it just limits how quickly they can crawl those pages), or remove the search flood control.
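For what it's worth, for the crawlers that do honour Crawl-delay (Google is not one of them), it is written per user agent rather than per path, roughly like this:

User-agent: *
Crawl-delay: 60

so there is no supported way to scope it to a single directory such as /tags/.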

Posted

I am using the latest version of IPB, and in the Pages app my tags do not include noindex:

<a href="https://www.url.com" class='ipsTag' title="Find other content tagged with 'tag text'" rel="tag">

I also don't see this as a setting option in the ACP.

Posted

So I found the tag template:

	<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag">

and changed it to:

	<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag noindex">

 

I guess this would be the best solution to keep spiders from potentially running many searches per minute or second, and to stop the Google errors.
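A variant using the standard nofollow rel value mentioned earlier in the thread would only differ in the rel attribute, something like this (a sketch of the rel change only, not tested against the IPB templates):

	<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag nofollow">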

Posted

I wanted to make a correction here: "noindex" is only a meta tag attribute, not a link attribute, so I ended up with a simple Disallow: /tags/ in my robots.txt.

In a perfect world I could simply turn off the flood control, but then Google and other spiders would likely be running these searches constantly, which would affect performance.

Posted

As I mentioned above, all tag pages are already marked with the "noindex" meta tag. Just open a tag page and check the source code.

But this meta tag has nothing to do with how often Google visits those pages. 
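If it is there, it should appear in the page's <head> as a standard robots meta tag, something along these lines (the exact markup IPB generates may differ):

<meta name="robots" content="noindex">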

Posted

I did look for the "noindex" as an attribute, and in my page I do not see it. In any case, noindex may still, at least to Google, mean it is OK to crawl, just not OK to index. If it really is noindex, then people should still have the block in robots.txt to stop Google from crawling it.

Archived

This topic is now archived and is closed to further replies.
