
changes in posts during the update from 3.4 to 4.2.8


taz.de


hi! we just rolled back our update from PB 3.4 to 4.2.8

we noticed some critical changes in the posts in our database. first of all, we have some broken links. this is how a small post looks in the 3.4 database:

Linkhinweise<br>
<br>
http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/<br>
http://gmo-awareness.com/<br>
http://justlabelit.org/<br>
http://righttoknow-gmo.org/states<br>
http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html

and this is how the same post looks immediately after the update in the 4.2.8 database:

<p>Linkhinweise</p>
<p> </p>
<p><a href="&lt;a%20href=" http: rel="">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'&gt;<a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/</p>
<p><a href="http://gmo-awareness.com/" rel="external nofollow">http://gmo-awareness.com/</a></p>
<p><a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a></p>
<p><a href="http://righttoknow-gmo.org/states" rel="external nofollow">http://righttoknow-gmo.org/states</a></p>
<p><a href="http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html" rel="external nofollow">http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html</a></p>

as you can see, the first link is completely broken for some reason.

there are additional issues. the update also replaces ASCII smileys with emoticons in the form of <img> tags that are not XHTML-safe (missing the closing "/"), and it turns links to youtube and other sites into embedded content, without asking.

since we do not use your frontend for the visitors of our site and only pull the data over the API to wrap it in XML, any broken or non-XHTML-safe markup breaks our output.
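to illustrate what "not XHTML-safe" means for a consumer like ours, here is a minimal sketch of a well-formedness check. the function name `isWellFormedFragment` is hypothetical, and the sketch assumes PHP's DOM extension is available; it is not part of Invision Community.

```php
<?php
// Hypothetical helper (not part of Invision Community): reports whether a post
// fragment can be embedded in an XML document without breaking it.
function isWellFormedFragment(string $html): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a single root element so it forms one XML document.
    $ok = $doc->loadXML('<root>' . $html . '</root>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

var_dump(isWellFormedFragment('<img src="smile.png" />')); // self-closed tag
var_dump(isWellFormedFragment('<img src="smile.png">'));   // unclosed tag
```

a check like this could run over every post pulled from the API before it is wrapped in XML, so that offending posts are reported instead of breaking the whole output.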

we reported this to invision support so that their developers could check their update routines, but they answered that they cannot look into it unless they are allowed to access our system. however, we are not allowed (and never will be) to give anyone access to our in-house systems, for security and legal reasons.

my question is: has anybody else had such issues? and if so, how did you handle them?

we are planning to update again next week, and our plan for now is to correct these issues ourselves afterwards, directly in the database.

ps: you may check if you have such broken links in your database with the simple database query
SELECT pid from pb_forums_posts WHERE POST LIKE '% http: %';


10 minutes ago, opentype said:

3.4, 4.3 …even 4.4 … it’s hard to follow this post with the numbers you throw around, which apparently don’t seem to make sense. 

yes, thanks, sorry, should be corrected now. exactly, the update was 3.4.9 -> 4.2.8


3 hours ago, taz.de said:

ps: you may check if you have such broken links in your database with the simple database query
SELECT pid from pb_forums_posts WHERE POST LIKE '% http: %';

I checked my upgraded 3.4 database. I have zero results for that query. 

I also cannot see from your sample what could cause this. Maybe a pattern becomes recognizable if you look at the other occurrences?


here are some more broken links. the only pattern i can see so far is that they are all links with at least one directory in the path ending with a slash. i list some below. it looks like a regexp running wild.

http://www.antisemitisme.fr/dl/2014-EN.pdf
<a href="&lt;a%20href=" http: rel="">http://www.antisemitisme.fr/dl/</a>2014-EN.pdf'&gt;<a href="http://www.antisemitisme.fr/dl/" rel="external nofollow">http://www.antisemitisme.fr/dl/</a>2014-EN.pdf

http://www.antisemitisme.fr/dl/2014-FR.pdf
<a href="&lt;a%20href=" http: rel="">http://www.antisemitisme.fr/dl/</a>2014-FR.pdf'&gt;<a href="http://www.antisemitisme.fr/dl/" rel="external nofollow">http://www.antisemitisme.fr/dl/</a>2014-FR.pdf

http://translate.google.de/translate?hl=de&sl=ru&u=http://www.helpdonbasspeople.ru/&prev=search
<a href="http://translate.google.de/translate?hl=de&amp;sl=ru&amp;u=&lt;a%20href=" http: rel="external nofollow">http://www.helpdonbasspeople.ru/</a>&amp;prev=search'&gt;http://translate.google.de/translate?hl=de&amp;sl=ru&amp;u=<a href="http://www.helpdonbasspeople.ru/" rel="external nofollow">http://www.helpdonbasspeople.ru/</a>&amp;prev=search

http://friedenswinter.de/aufruf/
<a href="&lt;a%20href=" http: rel="">http://friedenswinter.de/</a>aufruf/'&gt;<a href="http://friedenswinter.de/" rel="external nofollow">http://friedenswinter.de/</a>aufruf/

http://ase.tufts.edu/gdae/Pubs/wp/14-03CapaldoTTIP.pdf
<a href="&lt;a%20href=" http: rel="">http://ase.tufts.edu/gdae/</a>Pubs/wp/14-03CapaldoTTIP.pdf
'&gt;<a href="http://ase.tufts.edu/gdae/" rel="external nofollow">http://ase.tufts.edu/gdae/</a>Pubs/wp/14-03CapaldoTTIP.pdf

http://www.usip.org/sites/default/files/TransitionHandbook.pdf
<a href="&lt;a%20href=" http: rel="">http://www.usip.org/</a>sites/default/files/TransitionHandbook.pdf'&gt;<a href="http://www.usip.org/" rel="external nofollow">http://www.usip.org/</a>sites/default/files/TransitionHandbook.pdf

http://ru.flightaware.com/live/flight/MAS17/history/20140714/1000Z/EHAM/WMKK
<a href="&lt;a%20href=" http: rel="">http://ru.flightaware.com/live/flight/MAS17</a>/history/20140714/1000Z/EHAM/WMKK'&gt;<a href="http://ru.flightaware.com/live/flight/MAS17" rel="external nofollow">http://ru.flightaware.com/live/flight/MAS17</a>/history/20140714/1000Z/EHAM/WMKK

http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/
<a href="&lt;a%20href=" http: rel="">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'&gt;<a href="http://justlabelit.org/" rel="external nofollow">http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/

http://literatenwelt.blog.de/2014/01/18/arno-schmidt-100-geburtstag-17606008/
<a href="&lt;a%20href=" http: rel="">http://literatenwelt.blog.de/</a>2014/01/18/arno-schmidt-100-geburtstag-17606008/'&gt;<a href="http://literatenwelt.blog.de/" rel="external nofollow">http://literatenwelt.blog.de/</a>2014/01/18/arno-schmidt-100-geburtstag-17606008/

http://hnn.us/article/3166#sthash.65AnFb4e.dpuf
<a href="&lt;a%20href=" http: rel="">http://hnn.us/article/3166</a>#sthash.65AnFb4e.dpuf'&gt;<a href="http://hnn.us/article/3166" rel="external nofollow">http://hnn.us/article/3166</a>#sthash.65AnFb4e.dpuf

http://narrenspiegel.blog.de/2013/12/16/grosse-koalition-koch-kellner-17418803
<a href="&lt;a%20href=" http: rel="">http://narrenspiegel.blog.de</a>/2013/12/16/grosse-koalition-koch-kellner-17418803'&gt;<a href="http://narrenspiegel.blog.de" rel="external nofollow">http://narrenspiegel.blog.de</a>/2013/12/16/grosse-koalition-koch-kellner-17418803

ps: in each pair, the first line is the link from the 3.4 database and the second line is how it looks in 4.2.8
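since we plan to correct the data ourselves, here is a rough sketch of how the simplest broken shape above might be repaired. the function name `repairBrokenLink` is hypothetical. the sketch assumes the broken markup looks exactly like the friedenswinter.de example (empty rel on the first anchor, identical prefix and tail on both sides). it deliberately leaves other shapes, such as the translate.google.de one, untouched.

```php
<?php
// Hypothetical helper, not part of Invision Community. It only recognises the
// simplest broken shape quoted in this thread and leaves everything else alone.
function repairBrokenLink(string $post): string
{
    // \1 = the URL prefix that was linked twice, \2 = the tail left outside the anchors.
    $pattern = '~<a href="&lt;a%20href=" http: rel="">([^<]+)</a>([^\'<]*)\'&gt;'
             . '<a href="\1" rel="external nofollow">\1</a>\2~';

    $fixed = preg_replace_callback($pattern, function (array $m): string {
        // Glue prefix and tail back together into the original URL.
        $url = $m[1] . $m[2];
        return '<a href="' . $url . '" rel="external nofollow">' . $url . '</a>';
    }, $post);

    // preg_replace_callback() returns null on a regex error; keep the post unchanged then.
    return $fixed ?? $post;
}

$broken = '<a href="&lt;a%20href=" http: rel="">http://friedenswinter.de/</a>aufruf/\'&gt;'
        . '<a href="http://friedenswinter.de/" rel="external nofollow">http://friedenswinter.de/</a>aufruf/';

echo repairBrokenLink($broken), "\n";
```

anything like this should of course be tried on a copy of the posts table first, and the remaining hits of the `LIKE '% http: %'` query would still need to be checked by hand.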


5 minutes ago, AlexWright said:

This may be something to submit a ticket and/or a bug report for.

as i wrote, i did, but they won't help until they have access to our database which we can't grant them. maybe with anonymized userdata. we are still talking about that.


2 hours ago, Aiwa said:

I'd see if you can compromise and just give them the posts table, strip out IP address and author id and let them import it into a test DB, fill in the blanks, and run it themselves. 

the posts table is not enough for them. they want the whole database. i could even anonymize the member data; that would take some time, but i offered it to them. no answer yet. it seems nobody here has had the same experience. maybe i updated too late.


  • Management

As noted in your ticket, I apologize for your frustration and empathize with your situation. Unfortunately, you're asking us to fly blind without any access to the site, much less the database. It's a bit like calling your auto dealer and saying "my check engine light is on; tell me what's wrong, but I can't allow you to see the vehicle because I have something sensitive in the trunk. I'll remove the ECM from the vehicle and bring that in though." That's just unfortunately not how it works I'm afraid and it's a very inefficient way for the dealer to resolve your issue. They may be willing to assist, go back and forth, put your ECM on the bench and see what they can find, but you're surely going to pay significantly more than the customer who brought the whole vehicle in as should be expected. We too are willing to try and assist with you sending us a modified database, within your requirements, but it is in fact outside the scope of our support (our standards of service make clear we require direct access) and we would need to bill you for out of bound support/consultation. 

As noted, we may be able to take a look at what you have going on and what happened to your upgrade if you create a separate database in your environment, clear the sensitive data you're concerned about (leaving posts intact) and then provide us direct access to it. If you'd like us to download, import and work with your data in our environment, we can provide a quote, but I'm afraid that's not something included in standard support.

Hopefully we're able to figure something out for you.


dear lindy, i understand your point, and you mine. what i don't understand is why it's not possible to just check the update routine for the code that creates the links and run it against some of the links we posted, just to see if there is a problem. the other problems are the other automatic changes in the database, like the smileys and the embeds. if i understood you correctly, we have to solve those on our own in any case.

we will discuss the options we have: either paying the $100/hour for your support, or updating and cleaning up the data afterwards on our own. we will choose whichever way we think is best and fastest.

i just asked in this forum to check if other customers have the same problem.

no hard feelings,

ulf


22 hours ago, bfarber said:

I performed some testing and was able to identify the source of the issue, and will work on a resolution for 4.3.2. Unfortunately, the fix itself will not be retroactive.

wtf! great!!! a million thanks, @bfarber! on the other hand, 4.3 contains so many changes to the api that it will delay our update again. do you have any idea when 4.3.2 will come out?

AND another question: does this fix only apply to the damaged links, or also to the automated smiley/embedding changes?


7 hours ago, bfarber said:

My changes only address the broken links.

I cannot say when 4.3.2 will be out exactly, I'm afraid.

but maybe a rough and completely non-binding estimate? how does it feel? more like 2 weeks than 4 weeks or 2 months?


The bug is in \system\Text\LegacyParser.php:

				$value = preg_replace( '/' . preg_quote( $m, '/' ) . '/i', "<a href='{$m}'>{$m}</a>", $value, 1 );

After converting

http://gmo-awareness.com/

the post looks like this:

Linkhinweise<br><br>
<a href='http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/'>http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/</a><br>
<a href='http://gmo-awareness.com/'>http://gmo-awareness.com/</a><br>
http://justlabelit.org/<br>
http://righttoknow-gmo.org/states<br>
http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html

After converting

http://justlabelit.org/

the post looks like this:

Linkhinweise<br><br>
<a href='<a href='http://justlabelit.org/'>http://justlabelit.org/</a>dangerous-weed-killers-are-helping-to-spread-superweeds/'>http://justlabelit.org/dangerous-weed-killers-are-helping-to-spread-superweeds/</a><br>
<a href='http://gmo-awareness.com/'>http://gmo-awareness.com/</a><br>
http://justlabelit.org/<br>
http://righttoknow-gmo.org/states<br>
http://www.naturalnews.com/037466_gm_food_global_elite_organic_gardens.html

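The replacement matched the first occurrence of

http://justlabelit.org/

which sits inside the href of the longer link that was already converted, and replaced it with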

<a href='http://justlabelit.org/'>http://justlabelit.org/</a>


It's a bug.

If you are familiar with PHP, you can fix it yourself.
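A minimal sketch of the effect described above, and of one possible way around it. Both function names are hypothetical and this is not the actual LegacyParser code; it only reuses the `preg_replace` call shape quoted above. The single-pass variant assumes the post contains only bare URLs (no existing anchors) and is just one of several possible fixes.

```php
<?php
// Hypothetical reproduction: each URL is replaced separately, first occurrence only.
function linkifyOneByOne(string $value, array $urls): string
{
    foreach ($urls as $m) {
        // Same call shape as the quoted line. The shorter URL can match inside
        // the href that an earlier iteration has just inserted.
        $value = preg_replace('/' . preg_quote($m, '/') . '/i', "<a href='{$m}'>{$m}</a>", $value, 1);
    }
    return $value;
}

// One possible fix: a single pass over the text, so inserted markup is never rescanned.
function linkifySinglePass(string $value): string
{
    $result = preg_replace_callback('~https?://[^\s<>"\']+~i', function (array $m): string {
        // Escape quotes without double-encoding entities already present.
        $url = htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8', false);
        return "<a href='{$url}'>{$url}</a>";
    }, $value);

    // preg_replace_callback() returns null on a regex error; keep the input then.
    return $result ?? $value;
}

$post = "http://justlabelit.org/dangerous/<br>\nhttp://justlabelit.org/";
$urls = ['http://justlabelit.org/dangerous/', 'http://justlabelit.org/'];

echo linkifyOneByOne($post, $urls), "\n\n"; // nested, broken anchor
echo linkifySinglePass($post), "\n";        // two separate anchors
```

The first call nests an anchor inside the href of the longer link, which matches the broken markup shown in this topic. The second call links each URL exactly once.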


12 hours ago, newbie LAC said:

We found first 


http://justlabelit.org/

and replaced with 


<a href='http://justlabelit.org/'>http://justlabelit.org/</a>

 


It's a bug.

If you familiar with php you can fix it yourself. 

It won't work if the links are long

4.3.1

[two screenshots]

3.4.7

[two screenshots]

EDIT: I can replicate it with this code:

<?php
// Bootstrap the IPS 4 framework; expects init.php in the current directory.
require_once './init.php';
\IPS\Dispatcher\External::i();

// Replace localhost/ips4/ with youripssite.tld
$reg = <<<EOF
Si veda il topic: [url="http://localhost/ips4/index.php?showtopic=9260"]http://forum.aracnof...?showtopic=9260[/url]
EOF;

echo '<textarea>';
try
{
	// Convert the legacy BBCode first, then run the result through the 4.x parser.
	$reg_rules = \IPS\Text\Parser::parseStatic( \IPS\Text\LegacyParser::parseStatic( $reg, null, true ), true );
}
catch( \Exception $e )
{
	if( $e->getCode() == 103014 )
	{
		// On error code 103014, fall back to stripping the BBCode tags.
		$reg_rules = preg_replace( "#\[/?([^\]]+?)\]#", '', $reg );
	}
	else
	{
		throw $e;
	}
}
echo $reg_rules;
echo '</textarea>';

EDIT2: LegacyParser replaces https://yoursite.tld/ with https://invisioncommunity.com, and when Parser::parseStatic gets to parseAelement, it erases the href attribute.

EDIT3:

before purify, href is defined

[screenshot]

after purify, the href attribute is empty

[screenshot]

@bfarber

EDIT4: \HTMLPurifier_URIParser::parse in \system\3rd_party\HTMLPurifier\HTMLPurifier\URIParser.php (the regex does not handle the URI)

[screenshot]

@taz.de

can you check in SQL whether you have results for "%href=\"\"%"?

SELECT word_key, word_default, word_custom FROM ipb_core_sys_lang_words WHERE word_default LIKE '%href=\"\"%' OR word_custom LIKE '%href=\"\"%'

SELECT CONCAT('https://yourforum.tld/topic/',t.tid,'-',t.title,'/?do=findComment&comment=',p.pid) FROM ipb_forums_posts p LEFT JOIN ipb_forums_topics t ON t.tid=p.topic_id WHERE p.post LIKE "%href=\"\"%"

replace ipb_ with your table prefix.


@BomAle Thanks for your work and your post.

SELECT word_key, word_default, word_custom FROM ipb_core_sys_lang_words WHERE word_default LIKE '%href=\"\"%' OR word_custom LIKE '%href=\"\"%' 

doesn't return any results in our development/test installation, which already contains the broken links (4.2.9). as described above, our broken links contain a standalone " http: " instead. we would like to update our production system to 4.2.9, because updating to 4.3 would delay the upgrade again, since it includes several changes to the api. so one solution would be to find out which file contains the legacy parser, correct or replace it, and then run the update of our 3.4 version with the corrected file(s).


  • 1 year later...

Archived

This topic is now archived and is closed to further replies.
