taz.de Posted April 24, 2018 hi! we just rolled back our update from PB 3.4 to 4.2.8. we noticed some critical changes in the posts in our database. first of all, we have some broken links. this is how a small post looks in the 3.4 database:
Linkhinweise<br> <br> http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/<br> http://gmo-awareness.com/<br> http://justlabelit.org/<br> http://righttoknow-gmo.org/states<br> http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html
and this is how the same post looks immediately after the update in the 4.2.8 database:
<p>Linkhinweise</p> <p> </p> <p><a href="<a%20href=" http: rel="">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'><a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/</p> <p><a href="http://gmo-awareness.com/" rel="external nofollow">http://gmo-awareness.com/</a></p> <p><a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a></p> <p><a href="http://righttoknow-gmo.org/states" rel="external nofollow">http://righttoknow-gmo.org/states</a></p> <p><a href="http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html" rel="external nofollow">http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html</a></p>
as you can see, the first link is completely broken for some reason.
there are additional issues. the update also replaces ASCII smileys with emoticons in the form of non-XHTML-safe <img> tags (without the closing "/") and turns links to youtube and others into embedded content, without asking. since we are not using your frontend for the visitors of our page and just pull the data over the API to wrap it in XML, any broken or non-XHTML-safe markup will break our output.
we reported that to invision support so their developers could check their update routines, but they just answered that it's not possible for them to react if they are not allowed to have a look at our system. but we are not allowed (and never will be) to give anyone access to our in-house systems, for security and legal reasons.
my question is: has anybody else had such issues? and if yes, how did you handle them? we are planning to update again next week, and our plan for now is to correct these issues ourselves, directly in the database, after the update.
ps: you can check whether you have such broken links in your database with this simple database query:
SELECT pid from pb_forums_posts WHERE POST LIKE '% http: %';
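For anyone who pulls post bodies over the API and wraps them in XML as described in this post, a rough pre-flight check can flag the posts that would break the output. The sketch below is in Python purely for illustration; the function names are hypothetical and not part of Invision Community. It wraps a post body in a dummy root element and asks the standard-library XML parser whether the result is well-formed, and it mirrors the LIKE '% http: %' query as a plain substring test.

```python
import xml.etree.ElementTree as ET


def is_xml_safe(post_html: str) -> bool:
    """Return True if the post body parses as well-formed XML.

    Assumption: the body is a fragment, so it is wrapped in a dummy root
    element first. HTML-only named entities also count as failures here.
    """
    try:
        ET.fromstring(f"<root>{post_html}</root>")
        return True
    except ET.ParseError:
        return False


def has_broken_link_marker(post_html: str) -> bool:
    """Same idea as the SQL check: look for a stray ' http: ' token."""
    return " http: " in post_html


if __name__ == "__main__":
    # Hypothetical sample bodies, modelled on the issues described in the post.
    samples = {
        "closed img": '<p>ok <img src="smile.png" alt="smile" /></p>',
        "unclosed img": '<p>bad <img src="smile.png" alt="smile"></p>',
        "broken link": '<p><a href="<a%20href=" http: rel="">x</a>y\'></p>',
    }
    for name, body in samples.items():
        print(name, is_xml_safe(body), has_broken_link_marker(body))
```

Running something like this over an export of the posts table before and after a test upgrade would show which posts changed from safe to unsafe, without giving anyone access to the live system.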
opentype Posted April 24, 2018 3.4, 4.3 …even 4.4 … it’s hard to follow this post with the numbers you throw around, which apparently don’t seem to make sense.
taz.de (Author) Posted April 24, 2018 10 minutes ago, opentype said: 3.4, 4.3 …even 4.4 … it’s hard to follow this post with the numbers you throw around, which apparently don’t seem to make sense. yes, thanks, sorry, should be corrected now. exactly, the update was 3.4.9 -> 4.2.8
Aiwa Posted April 24, 2018 There is a background task to rebuild posts if I'm not mistaken. Did you give that a chance to run?
taz.de (Author) Posted April 24, 2018 1 minute ago, Aiwa said: There is a background task to rebuild posts if I'm not mistaken. Did you give that a chance to run? yes, i did. and as you can see, the posts were rebuilt.
taz.de (Author) Posted April 24, 2018 44 minutes ago, Aiwa said: Have you submitted a ticket to IPS? yes, i wrote in my post that i contacted the support. i did that with a ticket. i also wrote down the answer i got.
opentype Posted April 24, 2018 3 hours ago, taz.de said: ps: you may check if you have such broken links in your database with the simple database query SELECT pid from pb_forums_posts WHERE POST LIKE '% http: %'; I checked my upgraded 3.4 database. I have zero results for that query. I also cannot see from your sample what could cause this. Maybe if you look at the other occurrences there is a recognizable pattern?
taz.de (Author) Posted April 24, 2018 here are some further broken links. the only pattern i can see so far is that they are all links with at least one directory in the path ending with a slash. looks like a regexp running wild. some examples follow; in each pair, the first line is the link from the 3.4 database and the second line is how it looks in 4.2.8:
http://www.antisemitisme.fr/dl/2014-EN.pdf
<a href="<a%20href=" http: rel="">http://www.antisemitisme.fr/dl/</a>2014-EN.pdf'><a href="http://www.antisemitisme.fr/dl/" rel="external nofollow">http://www.antisemitisme.fr/dl/</a>2014-EN.pdf
http://www.antisemitisme.fr/dl/2014-FR.pdf
<a href="<a%20href=" http: rel="">http://www.antisemitisme.fr/dl/</a>2014-FR.pdf'><a href="http://www.antisemitisme.fr/dl/" rel="external nofollow">http://www.antisemitisme.fr/dl/</a>2014-FR.pdf
http://translate.google.de/translate?hl=de&sl=ru&u=http://www.helpdonbasspeople.ru/&prev=search
<a href="http://translate.google.de/translate?hl=de&sl=ru&u=<a%20href=" http: rel="external nofollow">http://www.helpdonbasspeople.ru/</a>&prev=search'>http://translate.google.de/translate?hl=de&sl=ru&u=<a href="http://www.helpdonbasspeople.ru/" rel="external nofollow">http://www.helpdonbasspeople.ru/</a>&prev=search
http://friedenswinter.de/aufruf/
<a href="<a%20href=" http: rel="">http://friedenswinter.de/</a>aufruf/'><a href="http://friedenswinter.de/" rel="external nofollow">http://friedenswinter.de/</a>aufruf/
http://ase.tufts.edu/gdae/Pubs/wp/14-03CapaldoTTIP.pdf
<a href="<a%20href=" http: rel="">http://ase.tufts.edu/gdae/</a>Pubs/wp/14-03CapaldoTTIP.pdf '><a href="http://ase.tufts.edu/gdae/" rel="external nofollow">http://ase.tufts.edu/gdae/</a>Pubs/wp/14-03CapaldoTTIP.pdf
http://www.usip.org/sites/default/files/TransitionHandbook.pdf
<a href="<a%20href=" http: rel="">http://www.usip.org/</a>sites/default/files/TransitionHandbook.pdf'><a href="http://www.usip.org/" rel="external nofollow">http://www.usip.org/</a>sites/default/files/TransitionHandbook.pdf
http://ru.flightaware.com/live/flight/MAS17/history/20140714/1000Z/EHAM/WMKK
<a href="<a%20href=" http: rel="">http://ru.flightaware.com/live/flight/MAS17</a>/history/20140714/1000Z/EHAM/WMKK'><a href="http://ru.flightaware.com/live/flight/MAS17" rel="external nofollow">http://ru.flightaware.com/live/flight/MAS17</a>/history/20140714/1000Z/EHAM/WMKK
http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/
<a href="<a%20href=" http: rel="">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'><a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/
http://literatenwelt.blog.de/2014/01/18/arno-schmidt-100-geburtstag-17606008/
<a href="<a%20href=" http: rel="">http://literatenwelt.blog.de/</a>2014/01/18/arno-schmidt-100-geburtstag-17606008/'><a href="http://literatenwelt.blog.de/" rel="external nofollow">http://literatenwelt.blog.de/</a>2014/01/18/arno-schmidt-100-geburtstag-17606008/
http://hnn.us/article/3166#sthash.65AnFb4e.dpuf
<a href="<a%20href=" http: rel="">http://hnn.us/article/3166</a>#sthash.65AnFb4e.dpuf'><a href="http://hnn.us/article/3166" rel="external nofollow">http://hnn.us/article/3166</a>#sthash.65AnFb4e.dpuf
http://narrenspiegel.blog.de/2013/12/16/grosse-koalition-koch-kellner-17418803
<a href="<a%20href=" http: rel="">http://narrenspiegel.blog.de</a>/2013/12/16/grosse-koalition-koch-kellner-17418803'><a href="http://narrenspiegel.blog.de" rel="external nofollow">http://narrenspiegel.blog.de</a>/2013/12/16/grosse-koalition-koch-kellner-17418803
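Since the stated plan is to clean up the data directly after the upgrade, here is a hedged sketch of how the most common broken shape in these examples might be repaired. It is written in Python for illustration only; `BROKEN_LINK` and `repair_links` are hypothetical names, and the pattern is derived solely from the samples in this post. It handles the variant where the whole URL was split into a linked prefix plus a trailing remainder, and it deliberately leaves the translate.google.de variant (a URL nested inside another URL's query string) untouched. Anything like this should be tried on a copy of the database first.

```python
import re

# Shape seen in the samples:
#   <a href="<a%20href=" http: rel="">PREFIX</a>REST'>
#   <a href="PREFIX" rel="external nofollow">PREFIX</a>REST
BROKEN_LINK = re.compile(
    r'<a href="<a%20href=" http: rel="">(?P<prefix>[^<]+)</a>'
    r"(?P<rest>[^'<\s]*)\s*'>"
    r'<a href="(?P=prefix)" rel="external nofollow">(?P=prefix)</a>(?P=rest)'
)


def repair_links(post_html: str) -> str:
    """Rebuild one clean anchor from the prefix and remainder of a broken link."""

    def rebuild(match: re.Match) -> str:
        url = match.group("prefix") + match.group("rest")
        return f'<a href="{url}" rel="external nofollow">{url}</a>'

    return BROKEN_LINK.sub(rebuild, post_html)


if __name__ == "__main__":
    broken = (
        '<a href="<a%20href=" http: rel="">http://www.antisemitisme.fr/dl/</a>'
        "2014-EN.pdf'>"
        '<a href="http://www.antisemitisme.fr/dl/" rel="external nofollow">'
        "http://www.antisemitisme.fr/dl/</a>2014-EN.pdf"
    )
    print(repair_links(broken))
```

The back-references require the prefix and remainder to appear identically in both halves of the broken markup, so text that does not match this exact shape is returned unchanged rather than guessed at.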
AlexWright Posted April 24, 2018 This may be something to submit a ticket and/or a bug report for.
taz.de (Author) Posted April 24, 2018 5 minutes ago, AlexWright said: This may be something to submit a ticket and/or a bug report for. as i wrote, i did, but they won't help until they have access to our database, which we can't grant them. maybe with anonymized user data. we are still talking about that.
Aiwa Posted April 24, 2018 I'd see if you can compromise and just give them the posts table, strip out IP address and author id and let them import it into a test DB, fill in the blanks, and run it themselves.
taz.de (Author) Posted April 25, 2018 2 hours ago, Aiwa said: I'd see if you can compromise and just give them the posts table, strip out IP address and author id and let them import it into a test DB, fill in the blanks, and run it themselves. the posts table is not enough for them. they want the whole database. i could even anonymize the members data; that would take some time, but i offered it to them. no answer yet. it seems nobody here has had the same experience. maybe i updated too late.
Lindy (Management) Posted April 25, 2018 As noted in your ticket, I apologize for your frustration and empathize with your situation. Unfortunately, you're asking us to fly blind without any access to the site, much less the database. It's a bit like calling your auto dealer and saying "my check engine light is on; tell me what's wrong, but I can't allow you to see the vehicle because I have something sensitive in the trunk. I'll remove the ECM from the vehicle and bring that in though." That's just unfortunately not how it works I'm afraid and it's a very inefficient way for the dealer to resolve your issue. They may be willing to assist, go back and forth, put your ECM on the bench and see what they can find, but you're surely going to pay significantly more than the customer who brought the whole vehicle in, as should be expected. We too are willing to try and assist with you sending us a modified database, within your requirements, but it is in fact outside the scope of our support (our standards of service make clear we require direct access) and we would need to bill you for out-of-scope support/consultation. As noted, we may be able to take a look at what you have going on and what happened to your upgrade if you create a separate database in your environment, clear the sensitive data you're concerned about (leaving posts intact) and then provide us direct access to it. If you'd like us to download, import and work with your data in our environment, we can provide a quote, but I'm afraid that's not something included in standard support. Hopefully we're able to figure something out for you.
taz.de (Author) Posted April 25, 2018 dear lindy, i understand your point, and you understand mine. what i don't understand is why it's not possible to just check the update routine for the code which creates the links and run it against some of the links we posted, just to see if there is a problem. the other problems are the other automatic changes in the database, like the smileys and the embeddings; these we have to solve ourselves in any case, if i understood you right. we will discuss the options we have, which are either paying the 100$ / hour for your support or updating and cleaning up the data afterwards on our own. we will choose the way we think is the best and fastest. i just asked in this forum to check if other customers have the same problem. no hard feelings, ulf
bfarber Posted April 25, 2018 I performed some testing and was able to identify the source of the issue, and will work on a resolution for 4.3.2. Unfortunately, the fix itself will not be retroactive.
taz.de (Author) Posted April 26, 2018 22 hours ago, bfarber said: I performed some testing and was able to identify the source of the issue, and will work on a resolution for 4.3.2. Unfortunately, the fix itself will not be retroactive. wtf! great!!! millions of thanks, @bfarber! on the other hand, 4.3 contains so many changes to the api that it'll delay our update again. do you have a clue when 4.3.2 will come out? AND another question: does this only apply to the damaged links or also to the automated smiley/embedding changes?
bfarber Posted April 26, 2018 My changes only address the broken links. I cannot say when 4.3.2 will be out exactly, I'm afraid.
taz.de (Author) Posted April 26, 2018 7 hours ago, bfarber said: My changes only address the broken links. I cannot say when 4.3.2 will be out exactly, I'm afraid. but maybe roughly and completely non-binding? how does it feel? rather like 2 weeks than 4 weeks or 2 months?
Nathan Explosion Posted April 26, 2018 You can make an educated guess by reviewing the release dates of previous versions: https://invisioncommunity.com/release-notes/
taz.de (Author) Posted April 26, 2018 3 minutes ago, Nathan Explosion said: You can make an educated guess by reviewing the release dates of previous versions: https://invisioncommunity.com/release-notes/ yes thanks, i would have guessed ten to fourteen days, but it's always good to ask.
newbie LAC Posted April 27, 2018 The bug is in \system\Text\LegacyParser.php:
$value = preg_replace( '/' . preg_quote( $m, '/' ) . '/i', "<a href='{$m}'>{$m}</a>", $value, 1 );
After http://gmo-awareness.com/ has been converted, the code looks like this:
Linkhinweise<br><br> <a href='http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/'>http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/</a><br> <a href='http://gmo-awareness.com/'>http://gmo-awareness.com/</a><br> http://justlabelit.org/<br> http://righttoknow-gmo.org/states<br> http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html
After http://justlabelit.org/ has been converted, the code looks like this:
Linkhinweise<br><br> <a href='<a href='http://justlabelit.org/'>http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'>http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/</a><br> <a href='http://gmo-awareness.com/'>http://gmo-awareness.com/</a><br> http://justlabelit.org/<br> http://righttoknow-gmo.org/states<br> http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html
The replacement is limited to one match, so it found the first occurrence of http://justlabelit.org/, which sits inside the href of the link that was already converted, and replaced it with <a href='http://justlabelit.org/'>http://justlabelit.org/</a>. The standalone http://justlabelit.org/ further down was left untouched. It's a bug. If you are familiar with PHP you can fix it yourself.
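The failure mode described in this post can be reproduced outside PHP. The sketch below is a Python re-creation, not the code from LegacyParser.php and not the fix that shipped: `URL_RE`, `linkify_buggy` and `linkify_single_pass` are hypothetical names, and the URL pattern is a simplifying assumption. The buggy version loops over the URLs and replaces the first occurrence of each one (the equivalent of `preg_replace` with a limit of 1), so a short URL that is a prefix of an earlier, longer URL lands inside the anchor that was already generated. The second version substitutes every URL in a single pass at the position where it was matched, so text that was already converted is never rescanned.

```python
import re

# Assumption: a deliberately simple URL pattern, good enough for the demo.
URL_RE = re.compile(r"https?://[^\s<>'\"]+")


def linkify_buggy(text: str) -> str:
    """Replace the first occurrence of each URL, one URL at a time."""
    for url in URL_RE.findall(text):
        anchor = f"<a href='{url}'>{url}</a>"
        # count=1 mirrors the limit of 1 in the PHP preg_replace call.
        text = re.sub(re.escape(url), lambda _m, a=anchor: a, text,
                      count=1, flags=re.IGNORECASE)
    return text


def linkify_single_pass(text: str) -> str:
    """Wrap each URL exactly where it was matched, in one pass."""
    return URL_RE.sub(
        lambda m: f"<a href='{m.group(0)}'>{m.group(0)}</a>", text
    )


if __name__ == "__main__":
    post = (
        "http://justlabelit.org/dangerous-weed-killers/<br>\n"
        "http://justlabelit.org/<br>"
    )
    print(linkify_buggy(post))        # nested, broken anchor in the first link
    print(linkify_single_pass(post))  # two separate, intact anchors
```

The single-pass approach is only one possible way out; sorting the URLs longest-first or skipping matches that sit inside an existing tag would be other options, each with its own edge cases.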
BomAle Posted April 27, 2018 12 hours ago, newbie LAC said: We found first http://justlabelit.org/ and replaced with <a href='http://justlabelit.org/'>http://justlabelit.org/</a> It's a bug. If you familiar with php you can fix it yourself. It won't work if the links were long.
4.3.1 3.4.7
EDIT: I can replicate it with this code:
<?php
require_once './init.php';
\IPS\Dispatcher\External::i();
//replace localhost/ips4/ with youripssite.tld
$reg = <<<EOF
Si veda il topic: [url="http://localhost/ips4/index.php?showtopic=9260"]http://forum.aracnof...?showtopic=9260[/url]
EOF;
echo '<textarea>';
try {
    $reg_rules = \IPS\Text\Parser::parseStatic( \IPS\Text\LegacyParser::parseStatic($reg, null, true), true );
} catch( \Exception $e ) {
    if( $e->getCode() == 103014 ) {
        $reg_rules = preg_replace( "#\[/?([^\]]+?)\]#", '', $reg );
    } else {
        throw $e;
    }
}
echo $reg_rules;
echo '</textarea>';
EDIT2: LegacyParser replaces https://yoursite.tld/ with https://invisioncommunity.com, and when Parser::parseStatic tries to parseAelement it erases the href attribute.
EDIT3: before purify, href is defined; after purify, the href attribute is empty. @bfarber
EDIT4: \HTMLPurifier_URIParser::parse in \system\3rd_party\HTMLPurifier\HTMLPurifier\URIParser.php (the regex does not handle the URI).
@taz.de can you check in SQL whether you have results for "%href=\"\"%"?
SELECT word_key, word_default, word_custom FROM ipb_core_sys_lang_words WHERE word_default LIKE '%href=\"\"%' OR word_custom LIKE '%href=\"\"%'
SELECT CONCAT('https://yourforum.tld/topic/',t.tid,'-',t.title,'/?do=findComment&comment=',p.pid) FROM ipb_forums_posts p LEFT JOIN ipb_forums_topics t ON t.tid=p.topic_id WHERE p.post LIKE "%href=\"\"%"
Replace ipb_ with your own table prefix.
taz.de (Author) Posted May 3, 2018 @BomAle Thanks for your work and your post. SELECT word_key, word_default, word_custom FROM ipb_core_sys_lang_words WHERE word_default LIKE '%href=\"\"%' OR word_custom LIKE '%href=\"\"%' doesn't return any results in our development/test installation, which already contains the broken links (4.2.9). as described above, we have a single " http: " in those broken links. we would like to update our production system to 4.2.9 (because if we updated to 4.3 we would have to delay the upgrade again, since it includes several changes to the api). so a solution would be to find out which file contains the legacy parser, correct or replace it, and then update our 3.4 version with that file(s).
taz.de (Author) Posted October 12, 2019 hi, @BomAle: Unbelievable, i just upgraded 4.2.9 -> 4.4.7 and the bug is still in there! it again shredded links in our posts in the same way! Just to let you know.
Archived
This topic is now archived and is closed to further replies.