
a lot of bot-visits since upgrade to 2.3.1


Guest Mesmer


Posted

Actually, I *THINK* (and Brandon will have to back me up on this) that spiders, if they are entered in your bot list, would only be counted once as a visitor to the site, even if they are coming from multiple IPs.

Problem is, the Yahoo spider has changed where it appears to come from (it used to be yahoo and now it's slurp), and Yahoo isn't a default spider in the spider list. slurp@inktomi=Hot Bot is in there, but does that get caught as a Yahoo spider? I don't think it does, but I could be wrong.

So, even if you had 100 Yahoo IPs crawling your site, they'd all be the same spider, and count as one user..

Of course, spiders tend to melt my brain. It's like the online list showing 15 entries on one page, then 7 on another, 2 on another and 1 on another, then back to 15 on the next page. I had to get Brandon to 'splain it, and I put his answer in a file that I copy and paste whenever someone asks:

This is a limitation/pseudo-bug of the online list in its present form. It's a little technical, but you may be able to follow. Basically, the page counts are derived from the number of active sessions. So if there are 100 active sessions with 25 sessions per page, then there will be 4 pages. Then, IP.Board will pull the 25 sessions that should display on the page you are viewing (i.e. on the first page, IP.Board will pull the most recent 25 sessions from the database).



Now here is where the problem kicks in: if the sessions are from the same IP address, IP.Board will not display them more than once. For example, Yahoo Slurp is/was on your board with 20 sessions from the same IP address (it is not abnormal for this to happen with search engine spiders). However, because they are all from the same IP address, when rendering the page IP.Board will only show this "user" once, so it is quite possible to have fewer than 25 sessions displayed on the page, even though there should be 25 results per page. In fact, it is even possible to have 1 session listed on a page that should have 25 results.
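
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the explanation, not IP.Board's actual code: the function name, the session tuples and the IP addresses are all made up, and the session list is assumed to be sorted most recent first already.

```python
import math

def online_list_page(sessions, page, per_page=25):
    """Return (total_pages, visible_rows) for one page of the online list.

    Hypothetical helper; `sessions` is a list of (ip, name) tuples.
    """
    # The page count comes from the raw number of active sessions.
    total_pages = math.ceil(len(sessions) / per_page)
    # Pull the slice of sessions that belongs on the requested page.
    start = (page - 1) * per_page
    chunk = sessions[start:start + per_page]
    # Sessions that share an IP address are displayed only once.
    seen, rows = set(), []
    for ip, name in chunk:
        if ip not in seen:
            seen.add(ip)
            rows.append((ip, name))
    return total_pages, rows

# 100 sessions; the 20 most recent all come from one (made-up) spider IP.
sessions = [("74.6.0.1", "Yahoo Slurp")] * 20 + [
    (f"10.0.0.{i}", f"guest{i}") for i in range(80)
]
pages, rows = online_list_page(sessions, page=1)
print(pages, len(rows))  # 4 pages, but only 6 rows on page 1
```

The page count is computed before the duplicates are collapsed, which is why a page can come up short.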



Just looking at that makes my head hurt.
Posted

Other robots.txt checkers say Crawl-delay is invalid syntax :huh:


Then your robots.txt checker needs to be fixed. It is valid for both Slurp and MSN crawlers (a few others too, I believe). If you had

User-agent: *
Crawl-delay: 30


that would be invalid. Using a Slurp-specific extension to robots.txt is valid.
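
If you want to see which crawlers a Crawl-delay line actually applies to, Python's standard urllib.robotparser can read it back. A minimal sketch, assuming a robots.txt with a Slurp-only group; the helper name and the sample file are made up for illustration. Note that Python's parser is lenient, so it says nothing about what a stricter checker would flag.

```python
from urllib.robotparser import RobotFileParser

def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)

# Sample file: the delay is declared for Slurp only.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 30
"""

print(crawl_delay_for(ROBOTS_TXT, "Slurp"))      # 30
print(crawl_delay_for(ROBOTS_TXT, "Googlebot"))  # None
```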

Posted

Actually, I *THINK* (and Brandon will have to back me up on this) that spiders, if they are entered in your bot list, would only be counted once as a visitor to the site, even if they are coming from multiple IPs.



Problem is, the Yahoo spider has changed where it appears to come from (it used to be yahoo and now it's slurp), and Yahoo isn't a default spider in the spider list. slurp@inktomi=Hot Bot is in there, but does that get caught as a Yahoo spider? I don't think it does, but I could be wrong.



So, even if you had 100 Yahoo IPs crawling your site, they'd all be the same spider, and count as one user..



Of course, spiders tend to melt my brain. It's like the online list showing 15 entries on one page, then 7 on another, 2 on another and 1 on another, then back to 15 on the next page. I had to get Brandon to 'splain it, and I put his answer in a file that I copy and paste whenever someone asks.


Just looking at that makes my head hurt.



This is very annoying. Only a few rows show on each page: 2 rows on page 1, 1 row on page 2, etc. This should be fixed by now...

bfarber... fix this for next version :)
  • 3 weeks later...
Posted

.. slurp@inktomi=Hot Bot is in there, but does that get caught as a Yahoo spider? I don't think it does, but I could be wrong.


You're right. We have hundreds from the 74.6.*.* block, but they are all just Guest and we see no Slurp.

I put 74.6.*.* in the ban filter, but sort of doubt that will work. The trouble is, the thing is bringing our server to its knees.

Edit: Ha! That seems to have worked. Online visitors dropped from 800 to 500. I'll let it back in after a while, since we like being on search pages. It's just that too much is too much.
Posted

Turns out I had to change slurp@inktomi=Yahoo Bot to slurp=Yahoo Bot. Now it is correctly identified and behaves politely.
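
Why the shorter key works can be sketched like this. It assumes (this is a guess at the mechanism, not confirmed IP.Board behaviour) that each bot list entry is a key=Name pair and that the key is matched as a case-insensitive substring of the visitor's user-agent string. The function names are made up, and the user-agent below is just an example of the kind of string Slurp sends.

```python
def parse_bot_list(text):
    """Parse 'key=Name' lines into (key, name) pairs (hypothetical format)."""
    bots = []
    for line in text.splitlines():
        if "=" in line:
            key, name = line.split("=", 1)
            bots.append((key.strip().lower(), name.strip()))
    return bots

def identify(user_agent, bots):
    """Return the name of the first bot whose key occurs in the user agent."""
    ua = user_agent.lower()
    for key, name in bots:
        if key in ua:  # assumed rule: plain substring match
            return name
    return None

# Example Slurp user-agent string; note there is no "inktomi" in it.
SLURP_UA = ("Mozilla/5.0 (compatible; Yahoo! Slurp; "
            "http://help.yahoo.com/help/us/ysearch/slurp)")

print(identify(SLURP_UA, parse_bot_list("slurp@inktomi=Hot Bot")))  # None
print(identify(SLURP_UA, parse_bot_list("slurp=Yahoo Bot")))        # Yahoo Bot
```

Under that assumption, the old slurp@inktomi key never matches, so every Slurp session falls through as a Guest.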

Archived

This topic is now archived and is closed to further replies.
