marklcfc Posted June 28, 2019

Do we still use these? If so, is there a recommended setup? My host has recommended it as I'm getting hit by a lot of these, coupled with 503 Service Unavailable errors:

Googlebot/2.1; +http://www.google.com/bot.html
YandexImages/3.0; +http://yandex.com/bots
AhrefsBot/6.1; +http://ahrefs.com/robot/
MJ12bot/v1.4.8; http://mj12bot.com/
bingbot/2.0; +http://www.bing.com/bingbot.htm
MojeekBot/0.6; +https://www.mojeek.com/bot.html
BrandVerity/1.0 (http://www.brandverity.com/why-is-brandverity-visiting-me)
SemrushBot/3~bl; +http://www.semrush.com/bot.html
bfarber Posted June 28, 2019

Those are spider identification strings, but on their own they wouldn't do anything if you added them to robots.txt. What exactly is your host recommending? For instance, if you need to throttle googlebot because they are hitting your community too much, you would need to do that in your Webmaster Tools account.
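For context, a minimal sketch (not from the thread): a robots.txt rule only means something when a User-agent line is paired with directives such as Disallow or Crawl-delay, and compliance is entirely voluntary on the bot's side. The bot names below are taken from the list above and the values are illustrative only; Googlebot in particular ignores Crawl-delay, which is why Google's crawl rate has to be adjusted in Webmaster Tools / Search Console as noted.

    # Illustrative robots.txt fragment - example values, not a recommendation
    User-agent: MJ12bot
    Disallow: /

    User-agent: AhrefsBot
    Crawl-delay: 10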
marklcfc (Author) Posted June 28, 2019

15 minutes ago, bfarber said: Those are spider identification strings, but on their own they wouldn't do anything if you added them to robots.txt. What exactly is your host recommending? For instance, if you need to throttle googlebot because they are hitting your community too much, you would need to do that in your Webmaster Tools account.

I've been trying to find out why I keep getting 503 Service Unavailable errors; it's been happening for the past month. They seem to suggest it happens when the site is busy, but it wasn't, and I'd expect my site to be much busier than it was during the periods these errors came up. They are suggesting it was a lot of hits from BrandVerity.
bfarber Posted June 28, 2019

That's entirely possible - sometimes rogue bots can consume a LOT of server resources. Generally speaking, if that's the case, you don't block those bots with robots.txt, because you're relying on the bot to actually honor robots.txt, which isn't a guarantee. I would suggest instead that you block the bot at the firewall level (or, if that's not possible, use .htaccess to block any IP addresses associated with that bot). Now - your host included googlebot and many others in the list, which I would definitely NOT recommend blocking unless you want to basically delist your site from all search engines. Be careful, in other words. Some bots don't really matter, but some (like googlebot) definitely do.
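As an illustration of the .htaccess route, here is a minimal sketch that matches the bot's user agent string rather than its IP addresses (an alternative to the IP-based block mentioned above, and one that still relies on the bot identifying itself honestly). It assumes Apache with mod_rewrite enabled, and BrandVerity is used only because it was named earlier in the thread:

    # Return 403 Forbidden to requests whose User-Agent contains "BrandVerity"
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} BrandVerity [NC]
    RewriteRule .* - [F,L]

Blocking the bot's IP ranges at the firewall remains the more reliable option, since a misbehaving crawler can simply change its user agent string.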
marklcfc (Author) Posted June 28, 2019

Yeah, I wouldn't have blocked them anyway. I wasn't comfortable blocking any of them, since I've not had any issues like this for years until the past month, for some reason.
Archived
This topic is now archived and is closed to further replies.