
SEO: Blocked due to access forbidden (403) vs. robots.txt


Solved by Marc


Posted (edited)

As you may have noticed from my recent topics, I am trying to improve our website's SEO.

One thing that caught my eye in our website's Google Search Console was the Coverage > Blocked due to access forbidden (403) report, with 38.7k entries.

I noticed that most URLs have repeated patterns such as ?advanced_search_submitted=, ?do=reportComment, and ?do=markRead.

My question is very simple: for our SEO strategy, should we add those patterns to our robots.txt, or would this not matter at all?

My logic here: if we disallow these patterns in robots.txt, Google won't try to crawl these pages, saving crawl budget and bandwidth.

Cheers!

Edited by Gabriel Torres
  • 2 weeks later...
Posted (edited)

@Marc Stridgen Just a follow-up regarding this specific report from Google (Coverage > Excluded > Blocked due to access forbidden (403)) and the changes I implemented in my install, which might be worth adding in future versions.

I added the following to our robots.txt:

Disallow: /theme/
Disallow: /*/?do=markRead
Disallow: /*/?do=reportComment
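As a sanity check on these rules, note that Google treats `*` in a robots.txt path as a wildcard for any sequence of characters, and each rule is a prefix match. The small matcher below is my own sketch of those semantics (it ignores `$` end-anchors and `Allow` precedence), used here just to confirm the three rules catch the offending URL patterns:

```python
import re

# Disallow rules from the robots.txt above.
DISALLOW = ["/theme/", "/*/?do=markRead", "/*/?do=reportComment"]

def rule_to_regex(rule: str) -> re.Pattern:
    """Compile a robots.txt path rule: escape everything literally except
    '*', which Google treats as 'any sequence of characters'. The pattern
    is anchored at the start only, since rules are prefix matches."""
    parts = [re.escape(part) for part in rule.split("*")]
    return re.compile("^" + ".*".join(parts))

def is_blocked(path: str) -> bool:
    """Return True if any Disallow rule matches the given URL path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)

print(is_blocked("/theme/?csrfKey=abc123"))         # True
print(is_blocked("/topic/42-example/?do=markRead"))  # True
print(is_blocked("/topic/42-example/"))              # False
```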

Note that I didn't mention the /theme/ issue in my original post, but after analyzing the report, I found several /theme/?csrfKey=xxxxx links listed.

I searched all templates and added rel='nofollow' to all links related to markRead and reportComment, including templates from Pages. There were several templates I had to edit.
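To give a rough idea of that template sweep, a small script can flag template files that still contain markRead or reportComment links without rel='nofollow'. This is my own sketch, not IPS tooling; the `.phtml` extension and the regex are assumptions, and a real pass should still be verified by hand:

```python
import re
from pathlib import Path

# Match <a> tags whose href contains ?do=markRead or ?do=reportComment.
LINK_RE = re.compile(
    r"<a\b[^>]*\?do=(?:markRead|reportComment)[^>]*>", re.IGNORECASE
)

def links_missing_nofollow(html: str) -> list[str]:
    """Return the matching <a> tags that lack a 'nofollow' token."""
    return [tag for tag in LINK_RE.findall(html) if "nofollow" not in tag.lower()]

def scan_templates(root: str) -> dict[str, list[str]]:
    """Map template file path -> offending link tags found in it.
    The .phtml glob is an assumption about how templates are stored."""
    report = {}
    for path in Path(root).rglob("*.phtml"):
        hits = links_missing_nofollow(path.read_text(errors="ignore"))
        if hits:
            report[str(path)] = hits
    return report
```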

The ?advanced_search_submitted= pattern from my original post can be disregarded. It was caused by a link structure we had in our old, custom layout.

With these changes, the number of "affected pages" has dropped from 38.7k to 26.1k so far, and we hope to see it drop further over the next few weeks.

Cheers

Edited by Gabriel Torres