Gabriel Torres Posted December 18, 2021 (edited) As you may have noticed from my recent topics, I am trying to improve our website's SEO. One thing that caught my eye in our website's Google Search Console was the Coverage > Blocked due to access forbidden (403) report, with 38.7k entries. I noticed that most URLs follow repeated patterns such as ?advanced_search_submitted=, ?do=reportComment, and ?do=markRead. My question is very simple: as an SEO strategy, should we add those patterns to our robots.txt, or would this not matter at all? My logic here: if we place these in robots.txt, Google won't try to crawl these pages, saving crawl time and bandwidth. Cheers!
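For anyone who wants to find the repeated patterns in their own report, here is a minimal sketch, assuming the affected URLs have been exported from Search Console into a plain list. The function name `tally_patterns` and the example.com URLs are hypothetical and not taken from the thread. It counts query parameters, keeping the value only for `do`, since that value names the action.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def tally_patterns(urls):
    """Count query-string patterns across a list of URLs.

    For the 'do' parameter the value is kept (e.g. 'do=markRead');
    for every other parameter only the name is kept (e.g. 'comment=').
    """
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        for key, value in parse_qsl(query, keep_blank_values=True):
            label = f"{key}={value}" if key == "do" else f"{key}="
            counts[label] += 1
    return counts


# Hypothetical sample of exported URLs, for illustration only.
sample_urls = [
    "https://example.com/topic/1-a/?do=reportComment&comment=5",
    "https://example.com/forum/2-b/?do=markRead",
    "https://example.com/search/?advanced_search_submitted=1",
    "https://example.com/topic/3-c/?do=reportComment&comment=9",
]

for pattern, count in tally_patterns(sample_urls).most_common():
    print(f"{count:>5}  {pattern}")
```

Sorting by frequency makes it easier to decide which patterns are worth a robots.txt rule and which are one-off links.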
AlexWebsites Posted December 18, 2021 (edited) Have you updated to the latest version? Some enhancements around this were added.
Gabriel Torres Author Posted December 19, 2021 @AlexWebsites Yes, and the latest version doesn't deal with the patterns I am talking about, hence me opening this topic. 😉
AlexWebsites Posted December 19, 2021 Ah yes, I see those strings are not added to the dynamic robots.txt. Good callout; I'll have to check on my end as well.
Marc Posted December 20, 2021 Thank you for bringing this issue to our attention! I can confirm this should be reviewed further, and I have logged an internal bug report for our development team to investigate and address as necessary in a future maintenance release.
SeNioR- Posted December 20, 2021 This is what the parameters in my URL look like.
Gabriel Torres Author Posted December 20, 2021 @Marc Stridgen Since we run a very large community, let me know if you need any report or access to our Google Search Console. I'd gladly give you full access.
Gabriel Torres Author Posted January 1, 2022 (edited) @Marc Stridgen Just a follow-up regarding this specific report from Google (Coverage > Excluded > Blocked due to access forbidden (403)) and the changes I implemented in my install that might be added in future versions.

I added the following to our robots.txt:

Disallow: /theme/
Disallow: /*/?do=markRead
Disallow: /*/?do=reportComment

Note that I didn't mention the /theme/ issue in my original post, but after analyzing the report, I found several /theme/?csrfKey=xxxxx links listed.

I searched all templates and added rel='nofollow' to all links related to markRead and reportComment. This includes templates from Pages as well. There were several templates that I had to edit.

The ?advanced_search_submitted= pattern from my original post must be disregarded. It was caused by a link structure we had in our old, custom layout.

With these changes, the number of "affected pages" has dropped from 38.7k to 26.1k so far, and we hope to see this number drop further in the next few weeks. Cheers
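To check which URLs rules like these would cover, here is a minimal sketch of Google-style robots.txt matching, in which `*` matches any run of characters, a trailing `$` anchors the end of the URL, and a rule otherwise matches as a prefix of the path plus query string. The helper names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical. The sketch ignores Allow rules and user-agent groups, so treat it as an approximation and not as a full parser.

```python
import re
from typing import Sequence


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate one Disallow path with '*' and optional trailing '$' to a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal chunk, then join the chunks with '.*' for the wildcard.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, rules: Sequence[str]) -> bool:
    """Return True if any rule matches from the start of the path."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in rules)


# The three rules quoted in the post above.
RULES = ["/theme/", "/*/?do=markRead", "/*/?do=reportComment"]

# Hypothetical paths, for illustration only.
for path in [
    "/theme/?csrfKey=xxxxx",
    "/forums/forum/12-news/?do=markRead",
    "/topic/123-title/?do=reportComment",
    "/topic/123-title/",
    "/topic/123-title/?page=2&do=markRead",
]:
    print(path, is_disallowed(path, RULES))
```

Under these matching rules, the last sample path is not covered, because `/*/?do=markRead` requires `do` to be the first query parameter. A pattern of the form `/*?*do=markRead` would match regardless of parameter order.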
SeNioR- Posted January 2, 2022 (edited) Also ?do=showReactionsComment, because this parameter also shows up for me under "Affected pages".
SeNioR- Posted January 2, 2022 (edited) Oh wait, these pages already have a "noindex" tag, so no entry in robots.txt is needed.
Marc Posted January 4, 2022 Thank you for your feedback. This topic has been added to our bug report, so it will be reviewed when the issue is addressed.
Gabriel Torres Author Posted January 9, 2022 New additions to robots.txt for the same issue:

Disallow: /*?*do=clearFilters
Disallow: /*/?do=report

The following Pages templates must be updated to include rel='nofollow':

Listing > filterMessage (for do=clearFilters)
Display > record (for do=report)
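To confirm that the template edits took effect, one option is to scan rendered page HTML for action links that still lack rel='nofollow'. The following is a minimal sketch using only the Python standard library. The class `NofollowAudit`, the function `links_missing_nofollow`, and the sample markup are hypothetical, and the action list simply collects the `do=` values mentioned in this topic.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs

# 'do' values discussed in this topic; extend as needed.
ACTIONS = {"markRead", "reportComment", "clearFilters", "report"}


class NofollowAudit(HTMLParser):
    """Collect hrefs of action links that do not carry rel='nofollow'."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        rel_tokens = (attr.get("rel") or "").lower().split()
        do_values = parse_qs(urlsplit(href).query).get("do", [])
        if any(v in ACTIONS for v in do_values) and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html):
    parser = NofollowAudit()
    parser.feed(html)
    parser.close()
    return parser.missing


# Hypothetical rendered markup, for illustration only.
sample_html = (
    '<a href="/topic/1-a/?do=reportComment">Report</a>'
    '<a rel="nofollow" href="/forum/2-b/?do=markRead">Mark read</a>'
    '<a href="/topic/1-a/">Topic</a>'
)
print(links_missing_nofollow(sample_html))
```

Running this against a few saved pages per template type (forum index, topic view, Pages listing and record views) should show whether any template was missed.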
Solution Marc Posted January 25, 2022 This issue has been resolved in 4.6.10 beta 1. Feel free to give it a try, or await the full release if you prefer.