
Posted

Hi,

I have a big problem with my forum: Google has not crawled or indexed it for months. In Search Console the sitemap won't work; the address of the sitemap is right, but Search Console does not analyze it.

 

Solved by Sonya*

  • Community Expert

Do you get any specific errors? Maybe you have a robots.txt rule that is stopping Google from indexing your site.
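One quick way to check from your own machine is a minimal sketch using Python's standard library. Note that urllib.robotparser does not behave identically to Google's own parser, so treat the answer as a hint rather than proof:

import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.vespaonline.com/robots.txt")
rp.read()

# Ask whether Googlebot may fetch the sitemap and the homepage.
for url in ("https://www.vespaonline.com/sitemap.php",
            "https://www.vespaonline.com/"):
    print(url, "->", rp.can_fetch("Googlebot", url))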

  • Author

Hi, and thanks for the reply. The robots.txt comes directly from the IPB forum software; I have not modified it.

  • Community Expert

Ok, so robots.txt is not the problem.

What do you get if you try to submit the sitemap in Google Webmaster Tools? If there are issues it should say something there.

  • Community Expert

I can see the sitemap just fine when I view the URL: https://www.vespaonline.com/sitemap.php

 

There must be something server side that is blocking Google's IP or useragent. It should show more details about the error if you click the bottom-right arrow in your screenshot.

As terabyte noted, that error may be happening because your server IP or URL is blocked by Google, or because your server is blocking Google.

You'd need to check with your host to see if either condition above is the cause.
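You can test the user-agent side of this yourself. Here is a sketch using Python's standard library; it cannot reproduce an IP-level block, since the request comes from your own machine, and the Googlebot string is only impersonated:

import urllib.error
import urllib.request

# Googlebot's published desktop user-agent string.
UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

req = urllib.request.Request("https://www.vespaonline.com/sitemap.php",
                             headers={"User-Agent": UA})
try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.headers.get("Content-Type"))
except urllib.error.HTTPError as e:
    # A 403/406 here, when a normal browser loads the page fine,
    # points at user-agent filtering on the server.
    print("Blocked:", e.code)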

 

There is another indication for that: when doing a “site:” search (e.g. site:vespaonline.com), Google says “no information available” and links to this: https://support.google.com/webmasters/answer/7489871?hl=en

  • Author
User-agent: * 
Crawl-Delay: 30

User-agent: AhrefsBot 
Disallow: /

User-agent: MJ12bot
Disallow: /

This is the current robots.txt.

  • Author

Hi, the sitemap was successfully loaded, but Google found only 5,544 pages. That's impossible; there are hundreds of thousands. What can be wrong?

 


Nothing is wrong; the sitemap won't list every single page on your forum.
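If you want to see what the sitemap actually contains, here is a minimal sketch that walks it and counts the URLs it lists, assuming sitemap.php returns a standard XML sitemap index pointing at sub-sitemaps:

import urllib.request
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def fetch_xml(url):
    # Some servers block Python's default user agent, so send a browser-like one.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req) as resp:
        return ET.fromstring(resp.read())

root = fetch_xml("https://www.vespaonline.com/sitemap.php")

if root.tag == NS + "sitemapindex":
    # The root is an index: sum the URLs across each sub-sitemap.
    total = sum(len(list(fetch_xml(loc.text).iter(NS + "url")))
                for loc in root.iter(NS + "loc"))
    print("URLs listed:", total)
else:
    print("URLs listed:", len(list(root.iter(NS + "url"))))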

  • Solution
 
User-agent: * 
Crawl-Delay: 30

User-agent: AhrefsBot 
Disallow: /

User-agent: MJ12bot
Disallow: /

This is the current robots.txt.

This is not the original IPS robots.txt; it has been modified, and it does not allow any bot to index your website. Google ignores the Crawl-Delay line, so the User-agent: * group is left with no rules of its own, and the Disallow below then applies to the User-agent lines above it, like this:

User-agent: * 
User-agent: AhrefsBot
Disallow: /

I recommend using the robots.txt generated by IPS. It excludes tons of duplicate URLs and URLs with little or no content. If you block everything, your site will not be indexed; if you allow everything, that is also bad for SEO, because Google has to crawl many URLs that bring you no benefit.
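If you do keep a hand-maintained file instead, give every group its own explicit rule so nothing is ambiguous. A sketch, not the IPS-generated file (an empty Disallow means "nothing is disallowed", and Google ignores Crawl-Delay either way):

User-agent: AhrefsBot
User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: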

Read more here 

 

 

Hi, the sitemap was successfully loaded, but Google found only 5,544 pages. That's impossible; there are hundreds of thousands. What can be wrong?

Keep in mind that a sitemap just tells Google to check out those pages. Indexing can still fail if the pages are blocked from indexing in any way.
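For example, a page can be crawlable yet still carry a noindex signal in a response header or a meta tag. A minimal sketch to check both on any page (the homepage here is just an example, and the meta check is a rough regex, not a full HTML parse):

import re
import urllib.request

req = urllib.request.Request("https://www.vespaonline.com/",
                             headers={"User-Agent": "Mozilla/5.0"})
with urllib.request.urlopen(req) as resp:
    # An X-Robots-Tag: noindex header blocks indexing even when crawling is allowed.
    print("X-Robots-Tag:", resp.headers.get("X-Robots-Tag"))
    html = resp.read().decode("utf-8", errors="replace")

# A <meta name="robots" content="... noindex ..."> tag has the same effect.
if re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html, re.I):
    print("Page carries a noindex robots meta tag.")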
