Tsvi Posted October 1, 2012

Hello, I've recently been working on improving my board's SEO ranking in Google, and I've run into a small issue (or at least I think it's an issue; I'd like to know your opinion). When a normal user, or Google's crawler, requests the profile page of any banned member on my board, the response is an HTTP 403 error. I manage my board in Google Webmaster Tools, and the number of HTTP 403 errors reported there is troubling me.

So I want to ask: what exactly is the idea behind that? Why not let Google 'see' those banned members? Google says it doesn't remove pages that return a 403 error from its search index, in the hope that those pages will become accessible again shortly, and I don't think that suits our situation with banned members. You can read what I found here (second comment): https://groups.google.com/forum/#!msg/google_web_search_help-troubleshooting/qi0FQC5K728/R2AR1J1JcOMJ

I looked through the code for where this can be changed and eventually found this line:

$this->registry->output->showError( 'profiles_not_active', '10246.1', null, null, 403 );

But I don't know whether I should change it, so I'm waiting for an answer ;) Thanks in advance!
Mat Barrie Posted October 1, 2012

The problem is that 403 is the most appropriate response for those pages. Banned member profiles are not accessible by regular users, but they are accessible to administrators - so a 404 is not appropriate as the content still exists. 403 represents that the content exists, but the viewer does not have access to it. This sort of thing was actually done way back because of SEO. I think you may be pushing rocks uphill wanting to get that changed.
Tsvi (Author) Posted October 1, 2012

Quoting Mat Barrie: "The problem is that 403 is the most appropriate response for those pages. Banned member profiles are not accessible by regular users, but they are accessible to administrators - so a 404 is not appropriate as the content still exists. 403 represents that the content exists, but the viewer does not have access to it. This sort of thing was actually done way back because of SEO. I think you may be pushing rocks uphill wanting to get that changed."

Well, that's why I opened this topic: to find out why it works this way and not some other way. ;) To me it doesn't make sense that it returns 403, and I wanted to hear what others think about it. And frankly, what's so bad about letting Google crawl banned members' profiles, or letting guests view those profiles? Where's the SEO in that? ;O

Anyway, in Google Webmaster Tools, where I manage my board, I always get a long list of banned members' URLs reported as 403 errors. What should I do about that? Rumor has it that Google also takes your Crawl Errors into account (no, I'm not an SEO genius). If that's true, it's bad for me, and if it's bad for me, it's bad for every other administrator who runs IP.Board. That's why I'm doing this little research.
BigStamp Posted October 1, 2012

Quoting Tsvi: "And frankly, what's so bad to let Google crawl in banned member's profile / let guests enter those profiles? Where's the SEO in that? ;O"

Because banned members are normally spammers who will fill their profiles with links, especially if they know that they will still benefit from the search engines even after being banned.
Tsvi (Author) Posted October 1, 2012

Quoting BigStamp: "Because banned members are normally spammers that will fill their profile with links, especially if they know that even if they are banned they will still get benefit from the search engines."

Waahh, good point. I hadn't thought about that until now. So maybe it would be possible to redirect to the index when someone who isn't a moderator tries to access a banned member's profile. Thanks ;)
Mark Posted October 2, 2012

That would be even worse for SEO as you're telling the search engine that the content exists elsewhere. 403 is most appropriate when you access a page you don't have access to. Although, a better solution would be for us not to link them (like we do for guests).

Of course, if you wanted, you could adjust your settings in the Admin CP to make banned profiles visible to search engines - tech support will be able to help with that if you're not sure how :)
Tsvi (Author) Posted October 2, 2012

Quoting Mark: "That would be even worse for SEO as you're telling the search engine that the content exists elsewhere. 403 is most appropriate when you access a page you don't have access to. Although, a better solution would be for us not to link them (like we do for guests). Of course, if you wanted, you could adjust your settings in the Admin CP to make banned profiles visible to search engines - tech support will be able to help with that if you're not sure how :)"

Hi Mark, I think I have a solution for this problem. The main problem isn't about not linking to those banned members; if I'm not mistaken, there are no links to banned members at all. The main problem is that those profiles were indexed by Google in the past, and traces of those links remain in Google. Every time Google recrawls its known links, it gets this 403 error.

What we really need is to make those links disappear. How? Just change the profile page so that it not only returns HTTP 403 but also includes a meta tag:

<meta name="robots" content="noindex">

That would be the final solution to this problem. You'd have to do the same when a topic or forum isn't found, because I guess you return 403 when a topic is hidden and when a forum isn't accessible to guests or members. I hope you really implement this. :)
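To make the proposal above concrete, here is a minimal sketch in Python (not IP.Board's PHP, and not its actual error handler) of an error response that combines a 403 status with the robots meta tag. The function name `render_forbidden_profile` and the page markup are hypothetical, and whether a search engine honors `noindex` on a 403 response is the poster's assumption, not something this thread confirms.

```python
import html
from http import HTTPStatus

# The tag proposed in the post above.
NOINDEX_META = '<meta name="robots" content="noindex">'


def render_forbidden_profile(message: str):
    """Return (status, headers, body) for a banned-profile error page.

    Hypothetical helper: it only illustrates sending 403 together
    with a robots noindex meta tag in the page head.
    """
    body = (
        "<!DOCTYPE html>\n"
        "<html><head>"
        f"{NOINDEX_META}"
        "<title>Error</title></head>"
        f"<body><p>{html.escape(message)}</p></body></html>"
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return HTTPStatus.FORBIDDEN.value, headers, body


status, headers, body = render_forbidden_profile("This profile is not active.")
print(status)
```

The status code and the meta tag are independent signals here: the first is sent in the response line, the second only exists inside the HTML body.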
Feld0 Posted October 3, 2012

Looking through the list of HTTP status codes, 410 Gone looks to me like the semantically correct response to serve for banned member profiles:

10.4.11 410 Gone

The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.

The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.

The software has already been adapted to not render any links to banned members' profiles, to prevent them from getting indexed and seen. Code 410 will serve to confirm, "Yes, there's nothing to see here anymore; get rid of the link," if a bot does come crawling onto the URL one way or another.
Dmacleo Posted October 4, 2012

If crawlers had been kept from accessing any member profiles (using robots.txt) from the beginning, this wouldn't even have been an issue, would it?
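As a rough illustration of the robots.txt approach mentioned above, the following Python sketch checks a hypothetical rule set with the standard library's `urllib.robotparser`. The `/user/` prefix and the example.com URLs are assumptions made for the example; the real profile URL pattern depends on the board's configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The /user/ prefix is an assumed profile
# URL pattern, not necessarily the one a given board uses.
ROBOTS_TXT = """\
User-agent: *
Disallow: /user/
"""


def crawler_may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Profile pages are disallowed; other pages stay crawlable.
print(crawler_may_fetch("Googlebot", "https://example.com/user/42-spammer/"))
print(crawler_may_fetch("Googlebot", "https://example.com/topic/1-hello/"))
```

A rule like this only asks well-behaved crawlers not to request the pages; it does not change what status code the board returns for them.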
Mat Barrie Posted October 4, 2012

Quoting Feld0: "Looking through the list of HTTP status codes, 410 Gone looks to me like the semantically correct response to serve for banned member profiles: The software has already been adapted to not render any links to banned members' profiles, to prevent them from getting indexed and seen. Code 410 will serve to confirm, "Yes, there's nothing to see here anymore; get rid of the link," if a bot does come crawling onto the URL one way or another."

That would be wrong. 410 is essentially the same as 404 - it means "this content no longer exists". However, the content actually does exist. The only exception is that the RFC actually permits sending 404 instead of 403 if the application does not want to make the reason for the refusal to serve the content known to the client. So although 410 is not correct, 404 would still comply with the RFC. However, I understand Google reacts rather poorly to large amounts of 404 errors when crawling a site.
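The status-code reasoning in this reply can be summarized in a small sketch. This is Python rather than the board's PHP, and the `Viewer` roles, the `hide_reason` flag, and the function name are hypothetical names introduced only to illustrate the decision, not IP.Board's actual logic.

```python
from enum import Enum
from http import HTTPStatus


class Viewer(Enum):
    """Hypothetical viewer roles, for illustration only."""
    GUEST = "guest"
    MEMBER = "member"
    ADMIN = "admin"


def banned_profile_status(viewer: Viewer, hide_reason: bool = False) -> HTTPStatus:
    """Pick a status code for a request to a banned member's profile.

    The profile still exists, so 410 Gone is never returned.
    """
    if viewer is Viewer.ADMIN:
        # Administrators can still see the content.
        return HTTPStatus.OK
    if hide_reason:
        # Permitted alternative: answer 404 when the application does
        # not want to reveal why the request was refused.
        return HTTPStatus.NOT_FOUND
    # Default: the content exists but this viewer may not access it.
    return HTTPStatus.FORBIDDEN


print(banned_profile_status(Viewer.GUEST).value)
```

Crawlers would be treated like guests under this sketch, so they would see 403 unless the hypothetical `hide_reason` option were switched on.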
Archived
This topic is now archived and is closed to further replies.