Posted August 17, 2018 I am seeing a huge increase in crawl errors from my site's tag pages. The errors are 429s, and Webmaster Tools says:
Quote: "Googlebot couldn't access the contents of this URL because the server had an internal error when trying to process the request. These errors tend to be with the server itself, not with the request."
I believe these are related to the search flood control setting in the admin, so Google's bot is being blocked from the tag searches by the flood control setting. My question is this: the links listed all actually work, I tested them. Since such a spike in errors could have a negative impact on Google search, what is the best way to handle this? A "nofollow" on the tags? Any idea how to do that?
August 17, 2018 Author PS: Another option is robots.txt with Disallow: /tags/, but do we really not want these pages indexed? It seems like this would be good search engine content. A better option might be rel="nofollow" or rel="nofollow noindex", but that would likely cause the same thing: the links would be excluded from the index.
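For illustration only, a tag link with nofollow added might look something like the following; the URL, class, and link text here are placeholders rather than the real IPS markup:
<a href="https://example.com/tags/example-tag/" class="ipsTag" title="Find other content tagged with 'example-tag'" rel="tag nofollow">example-tag</a>
Note that rel="nofollow" only discourages crawlers from following that particular link; it does not by itself keep the target URL out of the index if Google discovers it some other way.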
August 22, 2018 Author I thought of another possible approach, but I doubt this is allowed in robots.txt. Could you put a crawl-delay on a specific directory? For example:
Allow: /tags/
Crawl-delay: 60
Ever heard of this?
August 23, 2018 No, those options won't work. You can either prevent bots from crawling the content at all (via robots.txt), slow down their crawling globally (you do this in Webmaster Tools, not via a crawl-delay directive), accept that Google will get some 429 responses, which is fine (it doesn't hurt SEO, it just limits how quickly those pages get crawled), or remove the search flood control.
August 24, 2018 At the moment, tag pages are not indexed by default. The IPB default setting is "noindex". Just a hint.
August 24, 2018 Author I am using the latest version of IPB, and in the Pages app my tag links do not include noindex:
<a href="https://www.url.com" class='ipsTag' title="Find other content tagged with 'tag text'" rel="tag">
I also don't see this as a setting option in the ACP.
August 24, 2018 Author So I found the tag template:
<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag">
and changed it to:
<a href="{url="app=core&module=search&controller=search&tags={$urlEncodedTag}" seoTemplate="tags"}" class='ipsTag' title="{lang="find_tagged_content" sprintf="$tag"}" rel="tag noindex">
I guess this would be the best solution to avoid having spiders potentially running many searches per minute or second, and to stop the Google errors.
August 26, 2018 Author I wanted to make a correction here: "noindex" is only a meta tag value, not a link attribute, so I ended up with a simple Disallow: /tags/ in my robots.txt. In a perfect world I could simply turn off the flood control, but then Google and other spiders would likely be running these searches constantly, which would affect performance.
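For reference, a minimal robots.txt along these lines might look like the following, assuming the tag search URLs all live under /tags/ as described in this thread:
User-agent: *
Disallow: /tags/
Keep in mind that blocking /tags/ stops compliant crawlers from requesting those URLs at all, which also means Google will never see a noindex meta tag on those pages.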
August 27, 2018 As I mentioned above, all tag pages are already marked with a "noindex" robots meta tag. Just open a tag page and check out the source code. But this meta tag has nothing to do with how often Google visits those pages.
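For context, a noindex directive is normally emitted as a robots meta tag in the page <head>; a minimal example (the exact IPS output may differ) would be:
<meta name="robots" content="noindex">
Google only honours this directive when it is allowed to crawl the page and read its HTML.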
August 30, 2018 Author I did look for the "noindex" attribute, and on my page I do not see it. In any case, noindex may still, at least to Google, mean OK to crawl, just not OK to index. If it really is noindex, then people should have the block in robots.txt to stop Google from crawling it.