As you may have noticed from my recent topics, I am trying to improve our website's SEO.

One thing that caught my eye in our website's Google Search Console was the Coverage > Blocked due to access forbidden (403) report, with 38.7k entries.

I noticed that most URLs have repeated patterns such as ?advanced_search_submitted=, ?do=reportComment, and ?do=markRead.
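For anyone who wants to find the repeated patterns in their own report, here is a minimal sketch that tallies query-string patterns from a list of URLs. It assumes the affected URLs have been exported from Search Console into a plain list; the `example.com` URLs and the `tally_query_patterns` name are made up for illustration and are not part of the original report.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def tally_query_patterns(urls):
    """Count how often each query pattern appears in a list of URLs.

    For the 'do' parameter the value is kept (do=markRead, do=reportComment, ...)
    because the value is what identifies the action. For every other parameter
    only the key is kept, since values such as csrfKey differ on every URL.
    """
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        for key, value in parse_qsl(query, keep_blank_values=True):
            label = f"{key}={value}" if key == "do" else f"{key}="
            counts[label] += 1
    return counts


# Hypothetical sample; replace with the URLs exported from the report.
SAMPLE_URLS = [
    "https://example.com/forum/3-news/?do=markRead&csrfKey=xxxxx",
    "https://example.com/topic/7-hello/?do=reportComment&comment=12",
    "https://example.com/topic/8-world/?do=reportComment&comment=40",
    "https://example.com/search/?advanced_search_submitted=1&q=ssd",
]

if __name__ == "__main__":
    for label, count in tally_query_patterns(SAMPLE_URLS).most_common():
        print(label, count)
```

Sorting by frequency makes it easier to see which parameters account for most of the entries before deciding what to block.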

My question is very simple: as an SEO strategy, should we add those patterns to our robots.txt, or would it not matter at all?

My logic here: if we place these in robots.txt, Google won't try to crawl these pages, saving crawling time/bandwidth.

Cheers!

Edited by Gabriel Torres

Solved by Marc

Have you updated to the latest version? Some enhancements around this were added.

Edited by AlexWebsites

3 hours ago, Gabriel Torres said:

@AlexWebsites Yes, and the latest version doesn't deal with the patterns I am talking about, which is why I opened this topic. 😉

Ah yes, I see those strings are not included in the dynamic robots.txt. Good callout, I'll have to check on my end as well.

Thank you for bringing this issue to our attention! I can confirm this should be reviewed further, and I have logged an internal bug report so that our development team can investigate and address it as necessary in a future maintenance release.

This is what the parameters in my URL look like.

[Attached screenshot: plus-123.png]

  • Author

@Marc Stridgen Since we run a very large community, let me know if you need any report or access to our Google Search Console. I'd gladly give you full access.

  • 2 weeks later...
  • Author

@Marc Stridgen Just a follow-up on this specific Google report (Coverage > Excluded > Blocked due to access forbidden (403)) and the changes I implemented on my install, which might be worth adding in future versions.

I added the following to our robots.txt:

Disallow: /theme/
Disallow: /*/?do=markRead
Disallow: /*/?do=reportComment

Note that I didn't mention the /theme/ issue in my original post, but after analyzing the report, I found several /theme/?csrfKey=xxxxx links listed.
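To check which URLs rules like the ones above would cover, here is a rough sketch of a matcher. It assumes Google-style matching, where a rule is a prefix match against the path plus query string, `*` matches any run of characters, and a trailing `$` anchors the end. It ignores `Allow` rules and rule precedence, and the sample URLs are hypothetical. Python's built-in `urllib.robotparser` is not used here because it does not interpret `*` wildcards the same way.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule into a regex (Google-style wildcards)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the rule literally, then turn the escaped '*' back into "anything".
    body = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query, rules):
    """Return True if any Disallow rule matches the path plus query string."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in rules)


# The three rules from the post above.
RULES = ["/theme/", "/*/?do=markRead", "/*/?do=reportComment"]

# Hypothetical URLs (path and query only, as robots.txt rules see them).
SAMPLES = [
    "/theme/?csrfKey=xxxxx",
    "/forum/12-general/?do=markRead",
    "/topic/99-example/?do=reportComment&comment=5",
    "/topic/99-example/?page=2&do=reportComment",
    "/topic/99-example/",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, is_disallowed(sample, RULES))
```

Under these assumptions, the fourth sample is not matched: the `/*/?do=` form only covers URLs where `do` is the first query parameter directly after the trailing slash.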

I searched all templates and added rel='nofollow' to all links related to markRead and reportComment. This includes templates from Pages as well. There were several templates that I had to edit.
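One way to double-check the template edits is to scan the rendered HTML of a page for action links that still lack `rel="nofollow"`. The sketch below uses only the standard library; the `NofollowAudit` class, the `links_missing_nofollow` helper, and the sample markup are assumptions for illustration and do not reflect the real template output.

```python
from html.parser import HTMLParser

# Query fragments that identify the action links discussed above.
ACTIONS = ("do=markRead", "do=reportComment")


class NofollowAudit(HTMLParser):
    """Collect hrefs of action links whose rel attribute lacks 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        rel_tokens = (attr.get("rel") or "").lower().split()
        if any(action in href for action in ACTIONS) and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html):
    """Return the hrefs of matching links that do not carry rel=nofollow."""
    parser = NofollowAudit()
    parser.feed(html)
    parser.close()
    return parser.missing


# Hypothetical rendered markup; in practice, feed in the HTML of a live page.
SAMPLE = """
<a href="/forum/1-news/?do=markRead&amp;csrfKey=xxxxx">Mark as read</a>
<a href="/topic/5-hello/?do=reportComment&amp;comment=9" rel='nofollow'>Report</a>
<a href="/topic/5-hello/">Topic</a>
"""

if __name__ == "__main__":
    for href in links_missing_nofollow(SAMPLE):
        print("missing nofollow:", href)
```

Checking rendered pages rather than template source sidesteps template syntax, at the cost of having to fetch one page per template.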

The ?advanced_search_submitted= pattern from my original post should be disregarded. It was caused by a link structure we had in our old, custom layout.

With these changes, the number of "affected pages" has dropped from 38.7k to 26.1k so far, and we hope to see it drop further over the next few weeks.

Cheers

Edited by Gabriel Torres

Also ?do=showReactionsComment, because this parameter shows up in my "Affected pages" list as well.

Edited by SeNioR-

Oh wait, these pages already have a "noindex" tag, so no robots.txt entry is needed.

Edited by SeNioR-

Thank you for your feedback. The topic has been added to our bug report, so it will be reviewed when this is addressed.

  • Author

New addition to robots.txt for the same issue:

Disallow: /*?*do=clearFilters
Disallow: /*/?do=report

The following Pages templates must be updated to include rel='nofollow':

Listing > filterMessage (for the do=clearFilters)

Display > record (for the do=report)
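The two new rules use different shapes, and a quick sketch can show how their coverage differs. As before, this assumes Google-style matching (prefix match, `*` as wildcard, optional trailing `$`); the `matches` helper and the sample URLs are hypothetical.

```python
import re


def matches(rule, target):
    """Google-style robots.txt match: prefix match, '*' wildcard, optional '$'."""
    anchored = rule.endswith("$")
    core = rule[:-1] if anchored else rule
    pattern = "^" + re.escape(core).replace(r"\*", ".*") + ("$" if anchored else "")
    return re.match(pattern, target) is not None


NARROW = "/*/?do=report"        # 'do' must directly follow "/?"
BROAD = "/*?*do=clearFilters"   # 'do' may appear anywhere in the query string

# Hypothetical path-plus-query samples.
CHECKS = [
    (NARROW, "/records/5-item/?do=report"),
    (NARROW, "/topic/9-x/?do=reportComment&comment=3"),
    (NARROW, "/records/5-item/?tab=info&do=report"),
    (BROAD, "/records/?do=clearFilters"),
    (BROAD, "/records/?sortby=title&do=clearFilters"),
    (BROAD, "/records/page/2/"),
]

if __name__ == "__main__":
    for rule, target in CHECKS:
        print(rule, target, matches(rule, target))
```

Under these assumptions, `/*/?do=report` also covers `?do=reportComment` URLs because rules are prefix matches, so the earlier reportComment rule becomes redundant. The `/*?*do=` shape is the more forgiving of the two, since it still matches when other parameters come before `do`.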
