Unknown robot crawl


Posted

Hi, can anyone help with how to block an "Unknown robot (identified by 'crawl')"?
Last month it used 7 GB, and so far today it has taken 264.25 MB.


I have been googling for a week and tried robots.txt (banned everything except Google), and it keeps coming back.


I can't identify what it is or its IP address. I tried downloading my raw access logs, but I can't open them.


I also have some strange sites listed in my "Links from an external page" listing, such as:


http://www.drugs-24h.com/drugs-women%27s_health.htm and

http://www.cheap-24h.com/Vogue-cigarettes




It's doing my head in not being able to solve it.

Posted

Do you have AWStats? If so, you can see the top bandwidth users by IP address, and you'll then be able to block those. Blocking is best done at server level, e.g. with cPanel's IP Deny Manager.

Bad search engine spiders ignore robots.txt, so banning them there won't stop them.

3DKiwi
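
If the raw access logs won't open, that is often because they are downloaded gzip-compressed (a `.gz` file). The following is a minimal sketch, not a definitive tool, of how the per-IP bandwidth lookup suggested above could be done directly from such a log. It assumes the server writes the common Apache "combined" log format and that the robot's user agent contains the text "crawl" (which is what the AWStats label suggests). The function names `open_log` and `bytes_by_ip` are made up for illustration.

```python
import gzip
import re
import sys
from collections import Counter

# Assumed layout: Apache/NCSA "combined" log format.
# Lines that do not match (other formats, escaped quotes) are skipped.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
    r' "[^"]*" "(?P<agent>[^"]*)"'
)


def open_log(path):
    """Open a plain-text or gzip-compressed log file for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def bytes_by_ip(lines, needle="crawl"):
    """Sum response bytes per client IP for user agents containing `needle`."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        if needle.lower() not in match.group("agent").lower():
            continue
        size = match.group("size")
        # A size of "-" means no response body was sent.
        totals[match.group("ip")] += 0 if size == "-" else int(size)
    return totals


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python crawl_bandwidth.py access_log.gz
    with open_log(sys.argv[1]) as handle:
        for ip, total in bytes_by_ip(handle).most_common(10):
            print(f"{ip}\t{total / 1_048_576:.2f} MB")
```

Run against the downloaded log, this prints the ten IP addresses that pulled the most data with a "crawl" user agent. Those are the candidates to enter in the IP Deny Manager.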

Archived

This topic is now archived and is closed to further replies.
