Luca Barzelogna Posted September 2, 2022 Hi, I have a big problem with my forum: Google hasn't crawled or indexed it for months. In Search Console the sitemap won't work. The sitemap address is correct, but Search Console won't analyze it.
teraßyte Posted September 2, 2022 Do you get any specific errors? Maybe you have a robots.txt rule that is stopping Google from indexing your site.
Luca Barzelogna (Author) Posted September 2, 2022 Hi, and thanks for the reply. The robots.txt is managed directly by the IPB forum software; I have not modified it.
teraßyte Posted September 2, 2022 Ok, so robots.txt is not the problem. What do you get if you try to submit the sitemap in Google Webmaster Tools? If there are issues it should say something there.
teraßyte Posted September 2, 2022 I can see the sitemap just fine when I view the URL: https://www.vespaonline.com/sitemap.php There must be something server-side that is blocking Google's IP or user agent. It should show more details about the error if you click the bottom-right arrow in your screenshot.
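The kind of server-side user-agent block teraßyte describes can be illustrated with a short, self-contained Python sketch: a toy local HTTP server that returns 403 to anything identifying itself as Googlebot. This is a hypothetical rule for demonstration only, not the real vespaonline.com configuration; checking the live site would mean comparing responses for a browser user agent versus Googlebot's.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class BlockGooglebot(BaseHTTPRequestHandler):
    """Toy server imitating a misconfigured host that rejects Googlebot."""

    def do_GET(self):
        if "googlebot" in self.headers.get("User-Agent", "").lower():
            self.send_error(403)  # crawler turned away server-side
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"sitemap contents")

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), BlockGooglebot)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/sitemap.php"

def status(user_agent):
    """Fetch the sitemap URL with a given User-Agent, return the HTTP status."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

browser_status = status("Mozilla/5.0")
googlebot_status = status("Googlebot/2.1 (+http://www.google.com/bot.html)")
server.shutdown()

print(browser_status)    # 200: a browser sees the sitemap fine
print(googlebot_status)  # 403: Google's crawler is blocked
```

This is exactly the failure mode that is invisible when you test in your own browser: the page loads fine for you while Google gets an error.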
Mark H Posted September 2, 2022 As teraßyte noted, that error may be happening because your server IP or URL is blocked by Google, or because your server is blocking Google. You'd need to check with your host to see if either condition above is the cause.
opentype Posted September 2, 2022 1 hour ago, Mark H said: As teraßyte noted, that error may be happening because your server IP or URL is blocked by Google, or because your server is blocking Google. There is another indication of that: when doing a "site:" search, Google says "no information available" and links to this: https://support.google.com/webmasters/answer/7489871?hl=en
Luca Barzelogna (Author) Posted September 3, 2022 This is the actual robots.txt:

User-agent: *
Crawl-Delay: 30
User-agent: AhrefsBot
Disallow: /
User-agent: MJ12bot
Disallow: /
Luca Barzelogna (Author) Posted September 4, 2022 Hi, the sitemap was successfully loaded, but Google found only 5,544 pages. That's impossible! There are hundreds of thousands. What can be wrong?
Dll Posted September 4, 2022 Nothing, it won't list every single page on your forum in the sitemap.
Sonya* (Solution) Posted September 4, 2022 On 9/3/2022 at 2:04 AM, Luca Barzelogna said:

User-agent: *
Crawl-Delay: 30
User-agent: AhrefsBot
Disallow: /
User-agent: MJ12bot
Disallow: /

This is not the original IPS file; it has been modified. You are not allowing any bot to index your website: in your case, the Disallow applies to the User-agent lines above it, like this:

User-agent: *
User-agent: AhrefsBot
Disallow: /

I recommend using the robots.txt generated by IPS. It excludes tons of duplicate URLs and URLs with little or no content. If you block everything, your site will not be indexed. If you allow everything, that is also bad for SEO, as Google has to crawl many URLs that bring you no benefit. Read more here
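The grouping behavior can be sanity-checked locally with Python's standard-library robots.txt parser. The sketch below parses a cleanly grouped version of the file, with blank lines separating the groups so each Disallow applies only to the user agents named directly above it. Treat it as an approximation: real crawlers parse robots.txt slightly differently, and Google ignores Crawl-Delay entirely.

```python
from urllib.robotparser import RobotFileParser

# A cleanly grouped robots.txt: blank lines separate the groups, so each
# Disallow applies only to the user agents named directly above it.
robots_txt = """\
User-agent: *
Crawl-delay: 30

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

url = "https://www.vespaonline.com/"  # forum URL from the thread
print(rp.can_fetch("Googlebot", url))  # True:  Google may crawl
print(rp.can_fetch("AhrefsBot", url))  # False: SEO crawler blocked
print(rp.can_fetch("MJ12bot", url))    # False: SEO crawler blocked
```

With this grouping, only the two SEO crawlers are shut out while Google and everyone else remain free to crawl, which is presumably what the original file intended.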
opentype Posted September 4, 2022 8 hours ago, Luca Barzelogna said: Hi, the sitemap was successfully loaded, but Google found only 5,544 pages. That's impossible! There are hundreds of thousands. What can be wrong? Keep in mind that a sitemap just tells Google to check out those pages. Indexing can still fail when the pages are blocked from indexing in any way.
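For context on what a sitemap actually contains, here is a minimal Python sketch that parses a sitemap and counts the URLs it lists. The inline sample data uses hypothetical example.com URLs, not the real vespaonline.com file; the count Search Console reports comes from entries like these.

```python
import xml.etree.ElementTree as ET

# A tiny inline sitemap (hypothetical URLs, for illustration only).
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/topic/1-first-topic/</loc></url>
  <url><loc>https://example.com/topic/2-second-topic/</loc></url>
</urlset>"""

# Sitemap elements live in the sitemaps.org namespace.
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap_xml)
urls = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]

print(len(urls), "URLs listed")  # 2 URLs listed
```

Note that the sitemap protocol caps a single file at 50,000 URLs, which is why large sites serve a sitemap index pointing at many sub-sitemaps rather than one file listing every page.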