Numbered

Members · 310 posts · 1 day won

Reputation Activity

  1. Like
    Numbered reacted to mark007 in Large community? You have a problems with sitemap!   
    No, you have to add the code after this line:

    $data = array( 'url' => $node->url() );

    so that it becomes:

    $data = array( 'url' => $node->url() );
    if ( get_class( $node ) === 'IPS\forums\Forum' && isset( $node->last_post ) )
    {
        $data['lastmod'] = $node->last_post;
    }
  2. Like
    Numbered got a reaction from SeNioR- in Large community? You have a problems with sitemap!   
    Found one more sitemap problem.
    The <lastmod> tag shows the generation time of the current sitemap file. That is technically correct, but here is what the standard says:
    So, coming back to our case: we now have 5271 sitemap files, and Google is told to fetch all of them. It gets the signal 'this file was modified, take it' whether or not the content inside actually changed. Moreover, the sub-sitemaps that hold the URLs contain no <lastmod> tags at all, so Google fetches a long-unchanged sub-sitemap file and sees just a list of URLs without any additional meta information.
     
    My proposal:
    - Add a <lastmod> tag to every URL inside all sub-sitemaps. It tells Google which URLs contain new content and should be re-scanned, and which are unchanged and need no re-scan, which optimizes crawl performance (see the sketch below).
    - Add a <lastmod> tag to each entry in the sitemap index file that reflects not the file's generation date but the newest last-modified URL inside that file. With that, Google never downloads a sub-sitemap of 500 URLs in which nothing has changed, which again optimizes crawl performance.
    P.S. I'll try to create a patch. If I manage it, I'll share it here (for other devs to check and to help IPS).
    Thanks for your attention and support :)
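    A minimal sketch of what the first point could look like, assuming topic URLs take their <lastmod> from the topic's last post time; this is illustrative only and not how the shipped generator is structured:

    <?php
    // Hypothetical sketch: emit sub-sitemap entries with a per-URL <lastmod>
    // taken from the topic's last post timestamp. Limits and output handling
    // are simplified for illustration.
    require 'init.php';

    foreach ( \IPS\forums\Topic::getItemsWithPermission( array(), NULL, 10 ) as $topic )
    {
        printf(
            "<url><loc>%s</loc><lastmod>%s</lastmod></url>\n",
            htmlspecialchars( (string) $topic->url() ),
            date( 'c', $topic->last_post )
        );
    }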
  3. Haha
    Numbered reacted to Midnight Modding in Large community? You have a problems with sitemap!   
    Off topic, but I was on a site a few days ago that has nearly 50 million posts! I would have been happy if I could ever get a forum with 100 posts per month. That's how pathetic mine were. lol.
  4. Like
    Numbered got a reaction from Daniel F in Large community? You have a problems with sitemap!   
    Sorry, sorry :) I just wanted to show one general way to do the needed stuff, and the session class was only an example. I didn't know about the convertLegacyParameters() method before. Thanks :)
    I had previously hooked the session class's read method for some SSO work. I totally agree it's a bad place for that :)
    Update: I deleted my code to keep others from misusing it, but it is still preserved in your quote of my post.
  5. Haha
    Numbered reacted to Daniel F in Large community? You have a problems with sitemap!   
    NOPE, please don't! For the record, I'm not talking about your use case, only about your hook... Just a minor suggestion: please use IPS\Application::convertLegacyParameters() instead of overriding the session class. That would be the "official" place for such a redirect; it definitely doesn't belong in the session object!

    You can see some examples in the core or forums app:)
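    For context, a minimal sketch of what such an override can look like in an app's Application class, loosely modeled on how the core/forums apps redirect legacy URLs; the ?showtopic= parameter is just an illustrative assumption, not taken from the posts above:

    <?php
    // Hypothetical sketch: handle a legacy URL in convertLegacyParameters()
    // instead of hooking the session object. The 'showtopic' parameter name
    // is only an example.
    class _Application extends \IPS\Application
    {
        public static function convertLegacyParameters()
        {
            if ( isset( \IPS\Request::i()->showtopic ) )
            {
                \IPS\Output::i()->redirect(
                    \IPS\Http\Url::internal( 'app=forums&module=forums&controller=topic&id=' . (int) \IPS\Request::i()->showtopic )
                );
            }
        }
    }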

     
  6. Thanks
    Numbered reacted to Matt in Large community? You have a problems with sitemap!   
    Ok, so it's worth rounding up what Invision Community DOES do in terms of SEO.
    - Auto-generated meta tags for description
    - Custom meta tag editor for finer control in a single area
    - Uses appropriate header codes: 200 for OK, 301 for redirects, 303 for 'the page is actually here', 404 for not found, 403 for permission errors, etc.
    - Uses appropriate HTML markup to highlight important content (h1, h2, etc.)
    - Uses rewritten URLs for a cleaner structure packed with keywords
    - Creates and submits a sitemap to show Google which URLs are important to your community
    - Uses nofollow where appropriate to stop pages like 'Contact Us' from being crawled
    - Uses JSON-LD micro data markup to tell Google what data it is seeing and how it should be used
    - Allows easy integration with Google Search Console for tracking
    - Uses https
    - Has a responsive theme which gets the "Mobile Friendly" badge

    Here's what is coming in 4.3:
    - Meta description expanded to 300 characters
    - Ability to rebuild your entire sitemap quickly
    - Lastmod tag added to sitemap files

    Not to mention other retention tools like:
    - Bulk emailing tool
    - Emailed notifications
    - Promote to social media
    - Share to social media

    There seems to be a level of worry in this topic, and while I'm happy to field any questions you have, Google is a bit mysterious and prone to changing things overnight.
    We adhere to good standards and do all the right things as you can see from this list.
    We are not averse to change and adding new features, but we never do it in a panic or as a knee-jerk reaction until we get some hard evidence which supports the reason for change.
    We have been monitoring our own Google Search Console and clicks/impressions are up, indexes are down slightly, but Google has seen them and flagged them as 'discovered'. These tend to be profiles from people who have never posted (and we have about 200k of those alone).
    I do not believe we are facing any crisis, or that anything is substantially wrong.
    We can always do better, and we're listening. We just need a little more than a few charts to go on before we make drastic change.
  7. Like
    Numbered reacted to opentype in Large community? You have a problems with sitemap!   
    That question doesn’t really belong in this topic. 
    And it’s also not a big problem actually. You are allowed to block pages through the robots.txt. 
    To avoid this Google message in the future, open the sitemap settings in the ACP and turn off “profiles” by unchecking the “unlimited” and entering “0” instead. 
  8. Like
    Numbered reacted to opentype in Large community? You have a problems with sitemap!   
    Probably not, since there has still been no causal connection shown between sitemap creation speed (the actual topic here) and a decrease in indexing/ranking. So there is probably nothing to fix. 
     
    That might be, but IPS didn't change anything either. Google changed things, and you can check whether their algorithm updates (e.g. Fred in 2017) correlate with your problems. That would actually tell you more about what you could probably improve. 
  9. Like
    Numbered reacted to SebastienG in Large community? You have a problems with sitemap!   
    We can add admin to.
    In my case, with more than 1,200 sitemap files, I think it's more useful to frequently regenerate the lists of the latest topics and blog comments than to regenerate everything.
    If I run the script every minute, my latest-topics sitemap will still take more than 20 hours to come up for a rebuild.
    So I modified the script so that every X runs it forces regeneration of the latest blog and topic sitemaps.
  10. Sad
    Numbered reacted to sadams101 in Large community? You have a problems with sitemap!   
    I wanted to share the reply I got from Upgradeovec with the latest content of mycustomsitemapupdater.php which, if you have a large site, apparently should be run once every second:
     
     
  11. Like
    Numbered reacted to Lindy in Large community? You have a problems with sitemap!   
    So many misconceptions.
    If @daveoh were to share his site, you would see it's very clean, well organized, has high quality content and a glance shows his backlink profile is strong. He also has little to no apparent ads.
    @Nesa while I understand your concerns and understand your frustration, the sitemap itself is not the cause of your issues. This was posted earlier today and was already indexed by Google hours ago. I've checked some of our larger enterprise sites and their content from today is also being indexed in a timely fashion. I just searched the topic you referenced and it too is listed. There's no magic bullet in 4.3 or even other software that's going to put you on the front page of Google or instantaneously push new content directly to Google.
    The big algorithm update of 2017 targeted a lot of sites with some losing as much as 90% of their traffic. Some of these sites flew under the radar with what Google considers low quality content and in Google's eyes, they were righting the scale. There are many factors that influence your position in Google, your index rate and frankly, whether or not Google just likes or hates you.
    Ads
    If your site has ads that detract from the user experience and content on your site, Google WILL penalize you, and that was the single biggest hit in the "Fred" update. An ad in the header and footer is OK. An affiliate link or "sponsored content" that is appropriately placed and blends with the content is OK. Ads all over the page and in the sidebar are obnoxious to users; Google agrees and doesn't want to subject us to them (for which I'm, as a user, appreciative).
    Backlinks
    Backlinks can account for up to 1/3 of your overall standing with Google -- that is, quality backlinks; not "hey, link to me and I'll link you back."
    Content Freshness/Quality
    You may very well have 15 million posts, but if they're all from 2003 and you get very little new content, your old content is going to have less value to Google and they may drop it. Keep it fresh, keep users engaged.
    Google has gotten very intelligent and can distinguish rehashed musings from fresh information. General discussion sites with loose/generic content likely won't fare as well as those with unique, relevant content. Pages with thin content (such as picture threads, a bunch of "lols", etc.) are likely to get hit by Google.
     
    Again, Google cleaned house in 2017 - many sites got hit, including those using WordPress, vB, etc. There are just too many variables to get all up in arms over one component: the sitemap. A sitemap, or even the complete absence of one, is not going to cause your indexed pages to drop. If that has happened to you, it's because Google penalized you and dropped your content. A sitemap is also not a magic wand, and Google will not consider it the be-all for indexed content. A sitemap is a good gap filler and provides Google additional insight, but most content is still indexed organically through links.
    4.3 will provide for the mentioned improvements to the sitemap. Given 4.3 is so close, I'm sorry, but it would not be worth re-engineering those improvements for 4.2 and again, it's not going to be the magic solution you believe it to be. Unfortunately, I can only suggest focusing on the above list as one or more of those items is the likely culprit for your site being dropped.
  12. Haha
    Numbered reacted to Nathan Explosion in Large community? You have a problems with sitemap!   
    Just to add some comic relief - a colleague of mine saw this topic over my shoulder earlier today. He asked what cryptocurrencies I was monitoring, and advised me to get out of the one in this post as soon as possible:
     
  13. Thanks
    Numbered got a reaction from SeNioR- in Large community? You have a problems with sitemap!   
    Support answered 
  14. Like
    Numbered got a reaction from supernal in Large community? You have a problems with sitemap!   
    The IPS sitemap generator uses a dedicated database table as its refresh source: core_sitemap.
    The primary sitemap URL that search engines fetch is https://example.com/sitemap.php, which is an index of sub-sitemap files. You can see the list of those files by following that link.
    Each of those files contains no more than 1000 URLs to specific pages (profile statuses, topics (without page numbers or comments) and other elements supported by sitemap core extensions).
    One of our cases is a forum with more than 100k topics, more than 4.2 million posts and more than 6 million users. Simple math gives us 5214 sitemap files (you can count them yourself with this query):

    select count(*) from core_sitemap; -- 5214

    The sitemap generator task runs by default once every 15 minutes and updates only the single oldest file in that big list. With simple math we can try to answer the question 'how long does a full refresh take?' (users don't post only in the newest topics and may post in some old ones... and a newly created topic is only added to a sitemap file once ALL the older files are newer than the file that should contain the new topic). So, how much time do we need for a full update?
    5214 * 15 = 78210 minutes = 1303 hours = 54 days! 54! days! A search engine will pick up your newest content 54 days after it was posted. Incredible. Don't believe it? Or want to know this lag for your own community? You can check your lag time with this SQL:

    select FROM_UNIXTIME(updated,'%a %b %d %H:%i:%s UTC %Y') from core_sitemap order by updated asc limit 1; -- Wed Nov 01 14:13:49 UTC 2017

    Yep... in our case the oldest file was last updated on 1 November.
    What should we do to fix it? A very quick solution: create a temporary file, e.g. 'mycustomsitemapupdater.php', with this content:

    <?php
    require 'init.php';

    $generator = new \IPS\Sitemap;
    $generator->buildNextSitemap();

    $last = \IPS\Db::i()->select( 'FROM_UNIXTIME(updated, "%a %b %d %H:%i:%s UTC %Y")', 'core_sitemap', null, 'updated asc', 1 )->first();
    print_r( 'Oldest time now: ' . $last . PHP_EOL );

    Then run it via web or CLI as many times as you need (until the oldest file is no longer so old).
    A longer-term solution: add this script to cron and run it every minute or, better, change the 'sitemap generator' task interval from 15 minutes to one minute (that still may not solve your particular problem; if you need it updated even faster, be smart about it).
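    A minimal crontab entry for that longer-term approach might look like this (the PHP binary path and the community install path are placeholders, not taken from the original post):

    # run the sitemap updater every minute; adjust paths for your server
    * * * * * /usr/bin/php /path/to/community/mycustomsitemapupdater.php >/dev/null 2>&1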
    The best solution: wait for IPS to update this part of the system.
    Thanks for your attention!
    P.S. If my text comes across as negative, that's not the intent. I love IPS and just want to draw attention to this problem and help others with their large communities. 
  15. Like
    Numbered reacted to Nesa in Large community? You have a problems with sitemap!   
    Upgradeovec, ProSkill and me. It's not a single user.
    It is common for us to have somewhat larger forums, i.e. boards with over 700k posts.

    Sorry, but it's hard for me to accept your opinion because:
    - you never pointed out that even one line of code (posted in this topic by the topic starter) is wrong.
    - the official IPS answer is that they will fix things with the sitemap.
    According to you, it turns out that everything written in this topic is rubbish.
    I only urged that this be settled faster, rather than waiting a few more months for version 4.3. You think this will not affect the number of pages indexed by Google; I think it will.
    I have been a client of this company for 10 years, and it seems to me that I have the right to ask, at least once, for something to be resolved faster.
  16. Like
    Numbered got a reaction from sadams101 in Large community? You have a problems with sitemap!   
    Error 2S136/4 is thrown by IPS in only one situation: when somebody tries to use the report system. Just disable that ability for guests and all will be fine. Google is bringing it to your attention now because it started crawling your site more thoroughly and found the problem (it existed before these changes; the changes are not related to it). 
    Your example links show this too (?do=reportComment - a guest shouldn't be able to use the report system).
  17. Like
    Numbered got a reaction from sadams101 in Large community? You have a problems with sitemap!   
  18. Like
    Numbered got a reaction from BomAle in Large community? You have a problems with sitemap!   
    Support answered 
  19. Like
    Numbered got a reaction from SeNioR- in Large community? You have a problems with sitemap!   
  20. Like
    Numbered got a reaction from BomAle in Large community? You have a problems with sitemap!   
  21. Like
    Numbered reacted to Matt in Large community? You have a problems with sitemap!   
    We've added the timestamp into the sitemap and we're looking to add a tool to quickly rebuild the sitemap on demand.
  22. Like
    Numbered reacted to sadams101 in Large community? You have a problems with sitemap!   
    Thank you for this! I'll try the "ugly" method and report back. In my case it is very clear that something very bad happened to IPB's sitemap around the end of June 2017. See below if you doubt this:
     

  23. Like
    Numbered got a reaction from AlexWebsites in Large community? You have a problems with sitemap!   
    The result of the first patch mainly already exists: Google Webmaster Tools now shows 4 Jan as my oldest sitemap file (actually all of them were updated today, but Google fetches them more often than that). The second patch, with <lastmod>, can provide ordering (the last column). I can't see improvements in the statistics yet; work this big takes Google a lot of time and resources (and so does any update), so it needs more time. Here is the graph of downloaded size per day:

    I see the average going up, which is good. Numbers are in KiB (as the legend says).
    I'll post more info here when I get some more solid proof and results. Thank you for your interest :)
  24. Thanks
    Numbered got a reaction from SoloInter in Large community? You have a problems with sitemap!   
    Support answered 
  25. Thanks
    Numbered got a reaction from Markus Jung in Large community? You have a problems with sitemap!   
    Support answered 