Posted December 18, 2021 As you may have noticed from my recent topics, I am trying to improve our website's SEO. One thing that caught my eye in our website's Google Search Console was the Coverage > Blocked due to access forbidden (403) report, with 38.7k entries. I noticed that most URLs share repeated patterns such as ?advanced_search_submitted=, ?do=reportComment, and ?do=markRead. My question is very simple: for SEO strategy, should we add those patterns to our robots.txt, or wouldn't this matter at all? My logic here: if we place these in robots.txt, Google won't try to crawl these pages, saving crawl time and bandwidth. Cheers! Edited December 18, 2021 by Gabriel Torres
December 18, 2021 Have you updated to the latest version? Some enhancements around this were added. Edited December 18, 2021 by AlexWebsites
December 19, 2021 Author @AlexWebsites Yes, and the latest version doesn't deal with the patterns I am talking about, hence my opening this topic. 😉
December 19, 2021 3 hours ago, Gabriel Torres said: "@AlexWebsites Yes, and the latest version doesn't deal with the patterns I am talking about, hence my opening this topic. 😉" Ah yes, I see those strings are not added to the dynamic robots.txt. Good callout, I’ll have to check on my end as well.
December 20, 2021 Thank you for bringing this issue to our attention! I can confirm this should be reviewed further, and I have logged an internal bug report for our development team to investigate and address as necessary in a future maintenance release.
December 20, 2021 Author @Marc Stridgen Since we run a very large community, let me know if you need any reports or access to our Google Search Console. I'd gladly give you full access.
January 1, 2022 Author @Marc Stridgen Just a follow-up regarding this specific report from Google (Coverage > Excluded > Blocked due to access forbidden (403)) and the changes I implemented in my install, which might be added in future versions.
I added the following to our robots.txt:
Disallow: /theme/
Disallow: /*/?do=markRead
Disallow: /*/?do=reportComment
Note that I didn't mention the /theme/ issue in my original post, but after analyzing the report, I found several /theme/?csrfKey=xxxxx links listed.
I searched all templates and added rel='nofollow' to all links related to markRead and reportComment. This includes templates from Pages as well. There were several templates that I had to edit.
The ?advanced_search_submitted= pattern from my original post must be disregarded. It was caused by a link structure we had in our old, custom layout.
With these changes, the number of "affected pages" has dropped from 38.7k to 26.1k so far, and we hope to see this number drop further in the next few weeks. Cheers
Edited January 1, 2022 by Gabriel Torres
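Before deploying rules like the three Disallow lines above, it can help to check which URLs they would actually cover. The following is a minimal sketch, assuming the wildcard rules Google documents for robots.txt (`*` matches any run of characters, a trailing `$` anchors the end of the URL, everything else is a prefix match). The helper names and the sample paths are hypothetical and are not taken from the Search Console report.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed rules: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and anything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path, disallow_patterns):
    """Return True if any Disallow pattern matches the path (with query string)."""
    return any(robots_pattern_to_regex(p).search(path) for p in disallow_patterns)


# The three rules quoted in the post above.
DISALLOW = ["/theme/", "/*/?do=markRead", "/*/?do=reportComment"]

# Hypothetical sample paths, for illustration only.
samples = [
    "/theme/?csrfKey=xxxxx",
    "/forums/topic/123-example/?do=markRead",
    "/forums/topic/123-example/?do=reportComment&comment=456",
    "/forums/topic/123-example/",
]

for path in samples:
    print(path, "->", "blocked" if is_disallowed(path, DISALLOW) else "allowed")
```

Because these patterns are prefix matches, a rule such as /*/?do=reportComment also covers URLs that carry further parameters after it, as the third sample shows. The sketch ignores Allow rules and rule precedence, so it only answers whether a Disallow pattern matches at all.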
January 2, 2022 Also ?do=showReactionsComment, because this parameter also shows up for me under "Affected pages". Edited January 2, 2022 by SeNioR-
January 2, 2022 Oh wait, these pages already have a "noindex" tag, so no entry in robots.txt is needed. Edited January 2, 2022 by SeNioR-
January 4, 2022 Thank you for your feedback. This topic has been added to our bug report, so it will be reviewed when the issue is addressed.
January 9, 2022 Author New additions to robots.txt for the same issue:
Disallow: /*?*do=clearFilters
Disallow: /*/?do=report
The following Pages templates must be updated to include rel='nofollow':
Listing > filterMessage (for do=clearFilters)
Display > record (for do=report)
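Finding every template link that still lacks rel='nofollow' by hand is easy to get wrong, so a small audit of the rendered HTML may help. This is a rough sketch using only Python's standard html.parser module. The class and function names, the marker list, and the sample markup are all hypothetical. It inspects rendered pages, not the Invision Community templates themselves.

```python
from html.parser import HTMLParser

# Query-string fragments for the action links discussed above.
ACTION_MARKERS = ("do=clearFilters", "do=report")


class NofollowAudit(HTMLParser):
    """Collect hrefs of action links that do not carry rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if any(m in href for m in ACTION_MARKERS) and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html):
    """Return the hrefs of action links in `html` that lack rel=nofollow."""
    audit = NofollowAudit()
    audit.feed(html)
    return audit.missing


# Hypothetical rendered markup, for illustration only.
sample = """
<a href="/records/?do=clearFilters" rel="nofollow">Clear filters</a>
<a href="/records/item-1/?do=report">Report</a>
<a href="/records/item-1/">View</a>
"""

print(links_missing_nofollow(sample))
```

The marker check is a plain substring test, so do=report also flags do=reportComment links. That is convenient for this audit, but it would need tightening if the two actions had to be told apart.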
January 25, 2022 Solution This issue has been resolved in 4.6.10 beta 1. Feel free to give this a try, or await the full release if you prefer.