Requesting Google Sitemap


Maxxius

Do you want IPB to have sitemap generating function in its core?  

85 members have voted

Posted

Here we go:

I've updated the sitemap generator application and it is now dubbed IPS Sitemap Generator v2.0.0. You can see what is new in the file listing, but key highlights include automatic priority calculation for topics, multiple sitemaps (of up to 50K URLs each) and blog/downloads support.

Posted

As for HTML sitemaps, I completely disagree. That's not an SEO feature, at best it constitutes a usability feature, but that's pushing it considering the sheer quantity of links it would have in it.



I'll let the Google Webmaster Blog and Matt Cutts answer for me:


Where does Google, or any search engine, say a sitemap should only be used for recent topics? Using stoo's sitemap:

/forumsitemap-1.xml.gz     Sitemap   Oct 31, 2010   30       30
/topicsitemap-1.xml.gz     Sitemap   Oct 30, 2010   20,000   19,080
/topicsitemap-10.xml.gz    Sitemap   Oct 12, 2010   20,000   19,424
/topicsitemap-11.xml.gz    Sitemap   Oct 30, 2010   20,000   19,368
/topicsitemap-12.xml.gz    Sitemap   Oct 30, 2010   13,752   13,289
/topicsitemap-2.xml.gz     Sitemap   Oct 30, 2010   20,000   19,041
/topicsitemap-3.xml.gz     Sitemap   Oct 22, 2010   20,000   19,061
/topicsitemap-4.xml.gz     Sitemap   Oct 31, 2010   20,000   19,156
/topicsitemap-5.xml.gz     Sitemap   Nov 1, 2010    20,000   19,350
/topicsitemap-6.xml.gz     Sitemap   Oct 30, 2010   20,000   19,245
/topicsitemap-7.xml.gz     Sitemap   Oct 30, 2010   20,000   16,723
/topicsitemap-8.xml.gz     Sitemap   Nov 1, 2010    20,000   16,284
/topicsitemap-9.xml.gz     Sitemap   Oct 18, 2010   20,000   19,248


How is submitting only 20k or 50k URLs a good idea? This will only hurt large sites. They aren't called recent content maps, they are called sitemaps. I can't find the link now, but I remember reading about a change to the sitemap protocol to support huge sites with billions of links. This wouldn't be necessary if they only expected a max of 50K to be submitted.
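For what it's worth, the sitemap protocol handles large sites with sitemap index files: each sitemap file is capped at 50,000 URLs, and an index file lists the individual sitemaps. Here is a minimal sketch of that idea. The example.com base URL and the function names are my own assumptions; the topicsitemap-N.xml.gz naming and the 20,000-per-file split are taken from stoo's list above, where the first figure on each topic line adds up to 233,752.

```php
<?php
// Sketch only: builds a sitemap index that points at several sitemap files.
// The base URL and function names are hypothetical.

const MAX_URLS_PER_SITEMAP = 50000; // per-file limit in the sitemap protocol

/** Number of sitemap files needed to cover $totalUrls URLs. */
function sitemapFileCount(int $totalUrls, int $perFile = MAX_URLS_PER_SITEMAP): int
{
    return (int) ceil($totalUrls / $perFile);
}

/** Returns the XML of a sitemap index listing $count topic sitemaps. */
function buildSitemapIndex(string $baseUrl, int $count, string $lastmod): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    for ($i = 1; $i <= $count; $i++) {
        // Escape the URL so characters such as & stay valid inside XML.
        $loc  = htmlspecialchars("{$baseUrl}/topicsitemap-{$i}.xml.gz", ENT_XML1 | ENT_QUOTES);
        $xml .= "  <sitemap><loc>{$loc}</loc><lastmod>{$lastmod}</lastmod></sitemap>\n";
    }
    return $xml . "</sitemapindex>\n";
}

// 233,752 topic URLs split into files of 20,000 each.
$files = sitemapFileCount(233752, 20000);
echo buildSitemapIndex('http://example.com', $files, '2010-11-01');
```

With those figures the index gets 12 entries, which matches the 12 topic sitemaps in the list. Nothing in this approach requires dropping older topics; a bigger board just gets more files.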

I think any IPS sitemap should contain all the content. I also think it should have dynamic recency and priority ratings. When a sitemap displays the same priority and frequency for every topic, search engines lose trust. New topics aren't indexed as fast, and the most popular topics aren't indexed as often.

Ideas on how you'd like to see priorities generated would be great.



Here's a snippet of vbseo's sitemap (free):

    if($vboptions['vbseo_sm_priority_smart'])
    {
        if($threadrow['sticky'])
        {
            $prior = 1;
        }
        else
        {
            $rate = $threadrow['votenum'] ? $threadrow['votetotal']/$threadrow['votenum'] : 0;
            $relp1 = vbseo_math_avg_weight($threadrow['views'], 0, $st['maxv'], $st['avgv']);
            $relp2 = vbseo_math_avg_weight($threadrow['replycount'], 0, $st['maxre'], $st['avgre']);
            $relp3 = $rate/5;
            $relp4 = $max_ping?$mp_array[$threadrow['threadid']]/$max_ping:0;

            $relp = $relp1*0.45 + $relp2*0.25 + $relp3*0.15 + $relp4*0.15;
        }
    }
    $prior = vbseo_sm_priority($vboptions['vbseo_sm_priority_rt'], $relp);

    if($vboptions['vbseo_sm_freq_tsmart'])
    {
        $dpassed = (time() - $threadrow['lastpost'])/86400;
        if($dpassed<3)$freq = 'daily';
        else if($dpassed<10)$freq = 'weekly';
        else if($dpassed<100)$freq = 'monthly';
        else $freq = 'yearly';
    }else
        $freq = $vboptions['vbseo_sm_freq_t'];




To calculate priority, they compare each value to its maximum and average across all topics: for example, the number of views a topic has received versus the maximum and average views of all topics. Each result is then given a share of the final score. Here, topic views carry the most weight (45%), then number of replies (25%), topic rating (15%), and finally a per-topic value looked up by topic ID in $mp_array and divided by $max_ping (15%). I assume that last one is weighted to value new topics higher. Pinned topics receive maximum priority (1.0).
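As a rough sketch of that weighting, and not vBSEO's actual code: relativeWeight() below is my own hypothetical stand-in for vbseo_math_avg_weight(), whose implementation isn't in the snippet. I've assumed it maps the minimum to 0, the average to 0.5 and the maximum to 1. The array keys and the final rounding are also assumptions; only the 45/25/15/15 weights and the pinned rule come from the snippet.

```php
<?php
// Sketch only. relativeWeight() is a hypothetical stand-in for
// vbseo_math_avg_weight(); the real function is not shown in the post.

/** Maps $value to 0..1: $min -> 0, $avg -> 0.5, $max -> 1 (piecewise linear). */
function relativeWeight(float $value, float $min, float $max, float $avg): float
{
    if ($max <= $min) {
        return 0.0; // no spread in the data, nothing to compare against
    }
    $value = max($min, min($max, $value)); // clamp into [min, max]
    if ($value <= $avg) {
        return $avg > $min ? 0.5 * ($value - $min) / ($avg - $min) : 0.5;
    }
    return 0.5 + 0.5 * ($value - $avg) / ($max - $avg);
}

/** Combines the four signals using the weights from the snippet. */
function topicPriority(array $topic, array $stats): float
{
    if (!empty($topic['pinned'])) {
        return 1.0; // pinned topics always get the maximum
    }
    $views   = relativeWeight($topic['views'], 0, $stats['maxViews'], $stats['avgViews']);
    $replies = relativeWeight($topic['replies'], 0, $stats['maxReplies'], $stats['avgReplies']);
    $rating  = $topic['rating'] / 5;    // rating assumed to be on a 0..5 scale
    $fourth  = $topic['fourth'] ?? 0.0; // stand-in for $relp4, assumed already scaled to 0..1

    $score = 0.45 * $views + 0.25 * $replies + 0.15 * $rating + 0.15 * $fourth;
    return round(max(0.0, min(1.0, $score)), 1); // sitemap priority is 0.0 to 1.0
}

$stats = ['maxViews' => 1000, 'avgViews' => 500, 'maxReplies' => 100, 'avgReplies' => 10];
$topic = ['views' => 500, 'replies' => 10, 'rating' => 4, 'fourth' => 0.5];
echo topicPriority($topic, $stats), "\n";
```

The maximum and average have to come from somewhere, and that's the point: the stats array is only meaningful if it is computed over every topic, not just the most recent ones.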

Frequency is determined by last post date: <3 days = daily, <10 days = weekly, <100 days = monthly, 100 days or more = yearly.
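To show where those two values end up, here is a small sketch that turns a topic into one sitemap <url> entry. The thresholds are the ones from the snippet; the function names, the example URL and the fixed "now" timestamp are my own assumptions for illustration.

```php
<?php
// Sketch only: function names and the example URL are hypothetical;
// the day thresholds come from the snippet in the post.

/** Picks a <changefreq> value from the age of the last post. */
function changeFreq(int $lastPost, int $now): string
{
    $days = ($now - $lastPost) / 86400; // seconds per day
    if ($days < 3)   { return 'daily'; }
    if ($days < 10)  { return 'weekly'; }
    if ($days < 100) { return 'monthly'; }
    return 'yearly';
}

/** Renders one <url> entry for a topic. */
function urlEntry(string $loc, int $lastPost, float $priority, int $now): string
{
    return sprintf(
        '<url><loc>%s</loc><lastmod>%s</lastmod><changefreq>%s</changefreq><priority>%.1F</priority></url>',
        htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES), // keep & and quotes XML-safe
        gmdate('Y-m-d', $lastPost),                    // W3C date format, UTC
        changeFreq($lastPost, $now),
        $priority
    );
}

// A topic last replied to five days before Nov 1, 2010 (UTC).
$now = gmmktime(0, 0, 0, 11, 1, 2010);
echo urlEntry('http://example.com/topic/123-example/', $now - 5 * 86400, 0.7, $now), "\n";
```

Passing the current time in as a parameter, rather than calling time() inside, keeps every entry in one run consistent and makes the function easy to check.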

Again, this supports the concept of submitting a complete XML sitemap. How can you determine meaningful priority values if not all URLs are considered? Some of my most popular topics fall far outside the most recent 50k.

Posted

Is there a chance that we can see such auto-ranking added in and the limits removed? If so I'll likely move over to this Mod!




As I've said earlier in the topic, it's unlikely we'll be doing full sitemap generation any time soon... The server resources necessary to generate, store and serve that would be horrendous on large boards.

With that being said, a basic level of auto-prioritising has been built in for topics, it'll continue to be improved as time goes on.
Posted

As I've said earlier in the topic, it's unlikely we'll be doing full sitemap generation any time soon... The server resources necessary to generate, store and serve that would be horrendous on large boards.


Gzipped sitemaps served to a handful of search engines won't present either a storage or bandwidth issue. It really comes down to the resources used to generate the sitemap. I think you've made an unfair generalization regarding server resources. Not all forum software has issues generating complete sitemaps for large forums. Not even all sitemaps for IPB forums. Stoo's sitemap appears to scale reasonably well. From what I've read, Icelabz's IPB Sitemap scales extremely well. It's a coding challenge, not a resource challenge.
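To illustrate what I mean by a coding challenge, here is a rough sketch of generating sitemaps as a stream, so memory use stays flat no matter how many topics there are. This is not how stoo's or Icelabz's mods work (I haven't checked their code). topicUrls() is a hypothetical stand-in for a paged database query, the file naming is borrowed from the list earlier in the thread, and it assumes PHP's zlib extension is available.

```php
<?php
// Sketch only: topicUrls() stands in for a batched database query and the
// URLs are hypothetical. Requires PHP's zlib extension.

/** Yields topic URLs one at a time, as a paged query would. */
function topicUrls(int $total): Generator
{
    for ($id = 1; $id <= $total; $id++) {
        yield "http://example.com/topic/{$id}/";
    }
}

/**
 * Streams URLs into gzipped sitemap files, starting a new file every
 * $perFile URLs. Only one URL is held in memory at a time.
 * Returns the list of files written.
 */
function writeSitemaps(iterable $urls, string $dir, int $perFile = 50000): array
{
    $files = [];
    $gz    = null;
    $count = 0;
    foreach ($urls as $url) {
        if ($count % $perFile === 0) {
            if ($gz) { // close the previous file before starting the next
                gzwrite($gz, "</urlset>\n");
                gzclose($gz);
            }
            $path = sprintf('%s/topicsitemap-%d.xml.gz', $dir, count($files) + 1);
            $gz   = gzopen($path, 'wb9');
            if ($gz === false) {
                throw new RuntimeException("Cannot open {$path}");
            }
            gzwrite($gz, '<?xml version="1.0" encoding="UTF-8"?>' . "\n");
            gzwrite($gz, '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n");
            $files[] = $path;
        }
        gzwrite($gz, '<url><loc>' . htmlspecialchars($url, ENT_XML1 | ENT_QUOTES) . "</loc></url>\n");
        $count++;
    }
    if ($gz) {
        gzwrite($gz, "</urlset>\n");
        gzclose($gz);
    }
    return $files;
}

$files = writeSitemaps(topicUrls(120000), sys_get_temp_dir());
printf("%d files written\n", count($files));
```

On a real board the generator would page through the topics table in batches, and the job could run from a scheduled task instead of on a page request.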

IPB needs a feature complete sitemap. I applaud IPB for offering a sitemap solution. I like a lot of what you are doing. I only ask that you also consider the needs of large sites.

Finally, it doesn't appear that the sitemap is being used here. Any plans to implement it? site:community.invisionpower.com at Google currently shows me 23,600 indexed links. I would expect more on a PR7 domain. I count roughly 60k public topics (not including pagination), plus 148k members, plus blogs, gallery, resources... seems you could benefit from XML and HTML sitemaps.

Archived

This topic is now archived and is closed to further replies.
