Axel Wers Posted May 26, 2012
I think something like /page-2/ is better than /?st=20
realmaverickuk Posted May 26, 2012
[quote]Check that that area of the template (topicViewTemplate) is wrapped in these tags. It was added in 3.3 to prevent those errors.
<if test="canDeleteUrls:|:!$this->member->is_not_human">
// Delete stuff set up
ipb.topic.deleteUrls['hardDelete'] = new Template( ipb.vars['base_url'] + "app=forums&module=moderate&section=moderate&do=04&f={$forum['id']}&t={$topic['tid']}&st={$this->request['st']}&auth_key={$this->member->form_hash}&p=#{pid}" );
ipb.topic.deleteUrls['softDelete'] = new Template( ipb.vars['base_url'] + "app=forums&module=moderate&section=moderate&do=postchoice&tact=sdelete&t={$topic['tid']}&f={$forum['id']}&auth_key={$this->member->form_hash}&selectedpids[#{pid}]=#{pid}&pid=#{pid}" );
</if>
I don't have any of those errors in webmaster tools. I always keep my templates up to date. ;)[/quote]
Mine is also wrapped in these tags. Not sure, then, how they are still in my source :s
Rimi Posted May 26, 2012
[quote]Mine is also wrapped in these tags. Not sure then, how they are still in my source :s[/quote]
You mean they're in the source when you fetch as Googlebot?
realmaverickuk Posted May 26, 2012
Whether I view as a guest or Googlebot, I see those links in my code. I'm not sure what I could have missed, as we went over it closely.
Lewis P Posted May 26, 2012
Hmm... weird. It's the same variable that restricts the skin/language chooser.
realmaverickuk Posted May 26, 2012
I worry a little about anything that serves or removes something for search engines only. In 2011 they tightened up on anything cloaking-related. I know this kind of thing is only low level, but it still worries me a bit. I always wished I could remove that massive chunk of JS at the top of each page, just for Googlebot.
realmaverickuk Posted May 26, 2012
Just noticed this thread is now ranking for 7 unique pages :D
Steven UK (Author) Posted May 27, 2012
[quote]Just noticed this thread is now ranking, for 7 unique pages :D[/quote]
Google will move 6, if not all of them, soon enough.
My test...
[color=#ff0000]I am going to place this small test here on this thread.[/color] As of right now, a thread that was created yesterday is position 6 on the first page of Google, right underneath their own website, and in front of many competing websites for the same keyword: http://www.google.co...iw=1262&bih=738 (if you click this you may still see the old cache of Google search; you would need to do a fresh search to get the latest results).
There are no {strings} attached, and there is only one page on the thread. Hopefully it won't change. There are no tags associated, and it is just static now, and SHOULD stay that way. Let's check it day by day, to see what happens to it.
My test: The thread is now indexed as: makemoneyforum.com/topic/1413-littlefishblueprintcom-scam-read-my-review/page__p__5102
It has been demoted a position and, if previous tests are anything to go by, will be removed from page one by this time next week. The original thread has now been removed from Google.
Why are those strings still being indexed even though we have been told our template and files are now bang up to date?
I have also just been checking in admin what we are being found for, and most of the results are now page_20_ or similar (demoted), with the original thread removed by the search engines.
All this is now literally killing my forum, and all the effort we have put in.
Weppa333 Posted May 27, 2012
Post 208 above is one example of Googlebot simply ignoring canonical tags. It's also my own experience: canonical is "at best" considered a clue (like keywords and description) and most of the time is simply discarded altogether (like keywords).
And again, I already typed here that Googlebot is NOT the only way Google sees content. If this page, with that URL, has AdSense code on it, Google is informed that this page, with its "parameters", is where traffic is coming from. Same thing for Google+ buttons and Facebook likes. This is all wrong. A Facebook "like" should never point to such complicated URLs. You admit it yourselves by having a "canonical" URL in the code that IPB does not use in its social buttons code...
I could go on forever. There are at least 5 links to the same content in IPB where VB only has one; that is (imho) the main problem that people complain about.
Weppa333 Posted May 27, 2012
[quote]Why are those strings still being indexed even though[/quote]
Because of the thread preview system.
Matt (Management) Posted May 27, 2012
Take a look in skin_topics.php for topicViewTemplate. Look for:
<meta itemprop="interactionCount" content="UserComments:{parse expression="intval($topic['replies'] + 1)"}" />
Change to:
<meta itemprop="interactionCount" content="UserComments:{parse expression="intval($topic['posts'] + 1)"}" />
This meta tag was incorrectly reporting 1 post in the topic, which may have been taken into consideration by Google and confused it.
Rimi Posted May 27, 2012
Hey Matt, can I ask you a quick question? Does any element with itemprop in it need to be inside an element that has "itemscope itemtype='whatever'"? I ask because I've moved lots of things around in my topicViewTemplate. I'm assuming the answer is yes, but thought I'd ask... never used this stuff before.
Matt (Management) Posted May 28, 2012
No, not always. Check out schema.org for more information.
Matt (Management) Posted May 28, 2012
Ok, if you want to experiment a little and are happy to modify files, then you can try this. Note: this is not supported by IPS, so please be aware of that. It's a quickly tested change to the FURL structure.
To have URLs like /topic/123-test/?st=40, try this:
Open up admin/extensions/coreVariables.php
Look for 'public function fetchTemplates' and make that __data__ block:
'__data__' => array( 'start'    => '-',
                     'end'      => '/',
                     'varBlock' => '/?',
                     'varSep'   => '=' ),
To ensure old links are 301 redirected correctly, open /index.php and add this near the top (after the first <?php ):
if ( strstr( $_SERVER['REQUEST_URI'], '/page__' ) )
{
    // Split the request into the part before 'page__' and the legacy parameter string
    preg_match( '#(.*)(page__.*)$#', $_SERVER['REQUEST_URI'], $matches );
    $url   = $matches[1];
    $query = $matches[2];
    // Turn page__st__40 into ?st=40
    $query = str_replace( 'page__', '?', $query );
    $query = str_replace( '__', '=', $query );
    header( "HTTP/1.1 301 Moved Permanently" );
    header( "Location: http://" . $_SERVER['HTTP_HOST'] . $url . $query );
    exit();
}
Log into the Admin CP: System -> Tools & Settings > Cache Management, then Rebuild FURL cache.
Done.
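As a rough illustration of the redirect rule in the post above, here is a minimal Python sketch of the same string transformation. The function name legacy_to_query_url is hypothetical and not part of IPB; it only mirrors the strstr, preg_match and str_replace steps so the mapping can be checked by hand.

```python
import re


def legacy_to_query_url(request_uri):
    """Return the new-style URI for a legacy /page__ URI, or None if no redirect applies.

    Hypothetical helper for illustration only; it mirrors the PHP snippet above.
    """
    # Same guard as the strstr() check: only legacy URLs are rewritten.
    if "/page__" not in request_uri:
        return None

    # Same pattern as the preg_match() call: prefix, then the legacy parameter string.
    match = re.match(r"(.*)(page__.*)$", request_uri)
    if match is None:
        return None

    prefix, query = match.group(1), match.group(2)
    # 'page__' opens the query string; every remaining '__' becomes '='.
    query = query.replace("page__", "?")
    query = query.replace("__", "=")
    return prefix + query


if __name__ == "__main__":
    print(legacy_to_query_url("/topic/123-test/page__st__40"))
```

Under these assumptions, /topic/123-test/page__st__40 maps to /topic/123-test/?st=40. A legacy URL with two parameters ends up with '=' between the pairs as well, because every '__' is replaced by the same character.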
Matt (Management) Posted May 28, 2012
[quote]makemoneyforum.com/topic/1413-littlefishblueprintcom-scam-read-my-review/page__p__5102
Why are those strings still being indexed even though we have been told our template, and files are now bang up to date?[/quote]
The /unread/ link (found in 'forum last post' in board index) will load that page. I've just changed it for 3.3.3 so it loads the newer URL (/topic/123-here/#entry123) rather than the /page__p__123 syntax.
Lewis P Posted May 28, 2012
...
Done that. Works fine. BUT, the URL looks a little weird when clicking on the topic name on the board index: http://site.co.uk/topic/2499-topic/?pid=336654=st=4720#entry336654
?pid=336654=st=4720 - Should perhaps be ?pid=336654&st=4720 ?
Matt (Management) Posted May 28, 2012
[quote]Done that. Works fine. BUT, the URL looks a little weird when clicking on the topic name on the board index: http://site.co.uk/topic/2499-topic/?pid=336654=st=4720#entry336654
?pid=336654=st=4720 - Should perhaps be ?pid=336654&st=4720 ?[/quote]
No - that is fine. IPB understands it. I want to extend that part of the FURL templates to allow for a different separator and param conjoin. But for 3.3.3 that will work just fine even if it does look a little odd to a human.
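To see why a string like ?pid=336654=st=4720 can still be read unambiguously, here is a small Python sketch that treats the '='-separated pieces as alternating keys and values. This is only an assumption about one possible way to parse such a string, not IPB's actual parser, and parse_alternating_query is a hypothetical name.

```python
def parse_alternating_query(query):
    """Parse 'key=value=key=value' into a dict (hypothetical helper, not IPB code)."""
    # Drop a leading '?', then split on the single separator character.
    parts = query.lstrip("?").split("=")
    # Even positions are keys, odd positions are their values.
    # A trailing key with no value is dropped by zip().
    return dict(zip(parts[0::2], parts[1::2]))


if __name__ == "__main__":
    print(parse_alternating_query("?pid=336654=st=4720"))
```

The trade-off of a single separator is that a value can never contain '=' itself, which may be one reason to want a distinct separator and conjoin character later.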
Steven UK (Author) Posted May 28, 2012
[quote]The /unread/ link (found in 'forum last post' in board index) will load that page. I've just changed it for 3.3.3 so it loads the newer URL (/topic/123-here/#entry123) rather than the /page__p__123 syntax.[/quote]
Hi Matt, I would prefer that the URLs just stay static, and stay as: makemoneyforum.com/topic/1413-littlefishblueprintcom-scam-read-my-review/
Because by allowing Google to index them, it forces them to compete with the original static URL, and the original static URL always loses out. How do I achieve that?
[quote]because of the thread preview system[/quote]
Also, just thinking out loud here, but why does the IPB software allow indexing of such URLs when no other forum software does this? It needs to be stopped.
3.3.3, is this the version we are all using now? Because if so, this forum is still using: community.invisionpower.com/topic/363264-seo-rankings-flying-all-over-the-place-and-why-is-this/page__st__160
Matt (Management) Posted May 28, 2012
If you're a guest (as all bots are) then you should never see the /unread/ extension on the board index. For example, when I browse your forum as a guest, I only see the topic URL (no /unread/). The link takes you to the first post in the topic using no other extensions (such as #entry or /page__).
When you say "don't allow indexing of those URLs" - you can't mean including pagination? I'm sure you don't mean that, because we do want Google to index page 2 - we just want it to realise it's actually a page in a multi-page topic.
All of this is a little frustrating when we use the correct canonical URL and we even use rel 'next' and 'prev' to provide hints to the next and previous page URLs.
That said, moving forward, I intend to make it clear what is a page and what is not. It might just be possible that Google is still indexing the 'page__f__123' links on your site because it's crawled them already. It might take a little while before they are dropped.
This is why I'm strongly leaning towards this format for IPB 3.4: /topic/123-hello-world/page-2/?p=123
The canonical will be /topic/123-hello-world/page-2/
And the query string will pass a clear signal to Google that it is a sorted variation of the canonical page.
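A minimal Python sketch of the relationship described above, assuming the canonical URL is simply the requested URL with its query string and fragment removed. The helper name canonical_url and the example.com host are made up for illustration; IPB's own canonical logic may well differ.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Strip the query string and fragment, keeping scheme, host and path.

    Hypothetical helper illustrating the proposed 3.4 URL format.
    """
    parts = urlsplit(url)
    # Empty strings for query and fragment mean neither '?' nor '#' is emitted.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(canonical_url("http://example.com/topic/123-hello-world/page-2/?p=123"))
```

The point of the format is that everything identifying the page lives in the path, so variations such as ?p=123 all collapse onto one canonical address.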
Steven UK (Author) Posted May 28, 2012
Hi Matt,
The example I gave is just a single-page thread, but Google has indexed a NEW version of the page, page__p__5102, instead of just the original thread "/". This is what we need to stop happening, because I don't even think Google is seeing them as related; it sees it as a new version of the original thread, and thus dumps the original into the supplemental results.
Just as a matter of interest, Google WILL place, and IS placing, all the pages of the same thread into supplementals. For example, as the software currently stands, if a thread has 10 pages, then eventually Google throws 9 of them into supplemental results.
Here is a quick video showing the supplemental results of a 15+ page thread on our forum. Keep an eye on what has been 'omitted' by Google, as these are supplemental results (duplicate, not important in Google's eyes, and will not see the light of day): http://screencast.com/t/wwcuL3kSmJ (view it in full screen)
Are you any further down the road to knowing why Google is competing the pages, instead of relating them? Because I also think this is what is happening with the page__p__5102 situation; it is all related (well, unrelated if you wanted to view it through Google's cynical eyes).
Also, just a question, but what makes you think that using /?st=40 will force Google to relate the pages, instead of competing the pages?
Thanks.
Matt (Management) Posted May 28, 2012
My belief is that the /page__ extension confuses Google because it not only serves to indicate pages but also carries various other parameters (like find post, p, etc). The /page__x smells a bit like a page extension, but there is little consistency, so Google won't be able to spot a pattern for its use.
But let's turn this on its head. What would you expect Google to do with a 10 page topic? Would you expect Google to put the last 9 pages into supplemental results, or would you expect them to nest the results, or would you expect them to list all 10 pages separately?
Matt (Management) Posted May 28, 2012
Additionally, perhaps it would be beneficial to change the <h1> on the page (currently the large topic title text next to your photo) so it's clear there also that it's not a competing page:
SEO Rankings flying all over the place, and why is this.. (Page 2)
Matt (Management) Posted May 28, 2012
Another interesting tip that I've seen floating about is to add "Page X of series" in the meta description for multi-page topics. We could also move the "Page X" nearer the topic title in the TITLE.
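The two suggestions above (a page marker in the heading and in the meta description) could be sketched roughly as follows in Python. The function names, the "Page 2 of 10" wording and the choice to leave page 1 unmarked are all assumptions for illustration, not how IPB builds these strings.

```python
def page_title(topic_title, page):
    """Append a page marker to the heading for page 2 and beyond (hypothetical helper)."""
    if page <= 1:
        # Assumption: the first page keeps the plain topic title.
        return topic_title
    return f"{topic_title} (Page {page})"


def page_meta_description(description, page, total_pages):
    """Prefix the meta description with the page position in a multi-page topic."""
    if total_pages <= 1:
        # Single-page topics need no marker.
        return description
    return f"Page {page} of {total_pages}: {description}"


if __name__ == "__main__":
    print(page_title("Hello World", 2))
    print(page_meta_description("A test topic.", 2, 10))
```

The idea is simply that each page of a topic gets a distinct heading and description, so the pages read as parts of one series and not as duplicates.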
Steven UK (Author) Posted May 28, 2012
[quote]My belief is that the /page__ extension confuses Google because it not only serves to indicate pages but also for various other parameters (like find post, p, etc). The /page__x smells a bit like a page extension but there is little consistency so Google won't be able to spot a pattern for its use. But lets turn this on its head. What would you expect Google to do with a 10 page topic? Would you expect Google to put the last 9 pages into supplemental results or would you expect them to nest the results or would you expect them to list all 10 pages separately?[/quote]
Matt, Google will usually nest them. But it will only nest 5 results max, one core and 4 underneath, and the rest it will place in "similar results found", not necessarily the supplemental; that was just the first link I clicked on in the video above.
Actually, if you have, say, 10 different threads with a related keyword, Google would nest the threads instead of the pages, but will still relate the pages to each individual thread in the 'similar results found'. Supplemental results are very different from 'similar results found'.
[quote]Additionally, perhaps it would be beneficial to change the <h1> on the page (currently the large topic title text next to your photo) so it's clear there also that it's not a competing page. SEO Rankings flying all over the place, and why is this.. [size=3](Page 2)[/size][/quote]
On page 2, yes, I think that would help, and just keep it on page 1.
Edit: Just to add to that. I think the {strings} that are being indexed, and which are also competing with the original threads, are doing far more damage than the duplicate pages, with the duplicate pages coming a close second. I say this because most threads will, more often than not, only have a single page, so the second page does not come into the equation, but the [stringy] links always do.
Here is another one I have just found that has overwritten the original URL in Google: /page__p__4982#entry4982
I don't think any of this should be crawled by Google.
Steven UK (Author) Posted May 28, 2012
Just one more question Matt, please (a little off topic, so apologies). The star rating system, you didn't make it 'crawl-able' on the iPBlog, did you? Is there a patch that makes it crawl-able, so it can be indexed, the same way the board stars are indexed?
This topic is now archived and is closed to further replies.