Posted December 25, 2012 Hello, what can be done to block spam bots besides uploading robots.txt?
December 25, 2012 robots.txt doesn't stop spam bots... It is for search engine crawlers... A combination of a good Question and Answer Challenge, the IPS Spam Service and 3rd party hooks, such as http://community.invisionpower.com/files/file/5143-stop-spammer-registration/, are the first steps to take...
December 25, 2012 Yeah, ditto to Aiwa. StopForumSpam is a great way. I've used SFS even on sites that are not forums and it works great. One thing I normally do if the board is busy is require new users to get at least their first post moderated before they can post freely. I find this tends to keep the odd human spammer away, as they can't be bothered, and the other measures stop the non-human ones. :smile: It's always best to use 3rd party tools. Although there are lists of IPs that are known offenders, these take up space in iptables, which slows down web performance. It's better to just check these users on registration. robots.txt is basically useless, as most of these robots don't obey that file at all. It's best to have a proper firewall in place to weed out bad user agents. I like having multiple lines of defense: even if the registration would fail anyway, letting the bot get that far is just a waste of server resources. A lot of these bots send either spoofed headers or errors in the user-agent line, so filtering on those is an easy way to null those connections completely.
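To illustrate the point about errors in the user-agent line, here is a minimal Python sketch of that kind of check. The function name `is_suspicious_user_agent` and the token list are hypothetical choices for illustration, not part of IP.Board, StopForumSpam, or any firewall product, and a real deployment would more likely apply such rules at the web server or firewall level.

```python
# Hypothetical heuristics for illustration only; tune the list for your own traffic.
SCRIPT_CLIENT_TOKENS = ("curl", "wget", "python-requests", "libwww-perl")


def is_suspicious_user_agent(user_agent):
    """Return True if the User-Agent value looks missing, malformed, or scripted."""
    # A missing or blank header is unusual for a normal browser.
    if user_agent is None or not user_agent.strip():
        return True

    lowered = user_agent.strip().lower()

    # Badly written bots sometimes repeat the header name inside the value,
    # e.g. "User-Agent: Mozilla/5.0".
    if lowered.startswith("user-agent:"):
        return True

    # Default identifiers of common scripting clients (assumed list).
    return any(token in lowered for token in SCRIPT_CLIENT_TOKENS)


if __name__ == "__main__":
    samples = [
        "",
        "User-Agent: Mozilla/5.0",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    ]
    for ua in samples:
        print(repr(ua), is_suspicious_user_agent(ua))
```

A bot that spoofs a clean browser string will pass a check like this, which is why it only makes sense as one layer alongside the registration checks mentioned above.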
December 25, 2012 robots.txt is not useless... Legit crawlers DO obey it... And it can be used to SLOW the crawlers down so they don't eat resources on your board... Yes, there are crawlers that don't obey it, Baidu for example. But they are the exception and DO need to be handled via .htaccess if you don't want them crawling your board like mad...
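As a rough illustration of what "obeying" robots.txt means, the sketch below uses Python's standard `urllib.robotparser` to read a small robots.txt the way a well-behaved crawler would. The rules, the `/admin/` path, the `ExampleBot` name and the example.com URLs are made up for the example. `Crawl-delay` is a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: slow every crawler down and keep it out of /admin/.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    # A compliant crawler asks before each request and waits between requests.
    print(rules.can_fetch("ExampleBot", "https://example.com/topic/1"))
    print(rules.can_fetch("ExampleBot", "https://example.com/admin/index.php"))
    print(rules.crawl_delay("ExampleBot"))
```

Nothing forces a crawler to run a check like this, which is why crawlers that ignore the file have to be blocked on the server side instead.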
December 25, 2012 There is a key distinction here. Spam bots = bots that register and post. Crawlers = search engines that crawl your site as a guest to get your content. robots.txt is for CRAWLERS...
December 25, 2012 Author Hello, actually I was talking about crawlers. I have uploaded robots.txt but it is not blocking bots. Is there a way to make the forum invisible to search engine bots? If there are more steps that can be taken to stop search engine bots, please mention them.
December 25, 2012 Don't allow guests to view your content. Just so you're aware, if you block bots entirely you won't show up in search engine results.
December 25, 2012 Remember, the default one is also commented out. It's basically a template and needs some edits to actually work.
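To see why a commented-out file blocks nothing, the sketch below parses two robots.txt variants with Python's standard `urllib.robotparser`. Both file bodies are hypothetical stand-ins, not the actual default file shipped with the board, and `ExampleBot` and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical template: every directive is behind a '#', so it is only a comment.
COMMENTED_OUT = """\
# User-agent: *
# Disallow: /
"""

# The same directives with the '#' removed: all compliant crawlers are blocked.
ACTIVE = """\
User-agent: *
Disallow: /
"""


def allows(robots_text, url, agent="ExampleBot"):
    """Return True if a compliant crawler would be allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/topic/1"
    print("commented out:", allows(COMMENTED_OUT, url))
    print("active:", allows(ACTIVE, url))
```

With the comment markers in place the parser finds no rules at all, so every URL is treated as allowed.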
December 28, 2012 Quote: "robots.txt is not useless... Legit crawlers DO obey it... And it can be used to SLOW the crawlers down so they don't eat resources on your board... Yes, there are crawlers that don't obey it... Baidu for example. But they are the exception and DO need to be handled via .htaccess if you don't want them crawling your board like mad..." I didn't mean in general. I meant that for robots that don't obey it, a robots.txt is obviously useless. But I see now that the OP is asking about crawlers anyway, so that's kind of irrelevant now. :smile: