Invision Community 4: SEO, prepare for v5 and dormant account notifications By Matt Monday at 02:04 PM
AutoItScript Posted April 22, 2015 Posted April 22, 2015 I've got a test CLI script working. It needs a bit of configuration at the top.// CHANGEABLE VARIABLES // If TRUE then no changes will be made but you will see the number of potential changes $testMode = FALSE; // Starting post id (pid) $startPost = 0; // Ending post id - This pid IS processed. Use 9999999 for no end limit. $endPost = 9999999; // The custom class to use - <p class="customclass'>. This can be optionally left for manual SQL cleanup. $customClass = 'doublepara-processed'; $leaveCustomClass = TRUE;I ran it against my test migrated board (1.2 million posts) and it took a couple of minutes - I'm checking posts at the moment and it looks good so far. Interestingly it only started needing to process posts at around the 800k post mark. I'm guessing that's when I upgraded to 3.4. The posts before that all seem to use pure <br> for paragraphs so they aren't touched. In the end it needed to alter 62k posts out of 1.2 million.If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong!For the techie, the regexs I ended up using trap a lot of edge cases - it's not a simple replace all. If anyone can see a flaw then please shout.// First change double empty paragraphs into single empty paragraphs and protect them $newPost = preg_replace('#(?<=</p>)(\s*)<p> </p>\s*<p> </p>#', '$1<p class="' . $customClass . '"> </p>', $oldPost); // Remove remaining single paragraphs $newPost = preg_replace('#(?<=</p>)(\s*)<p> </p>#', '$1', $newPost); // Restore protected paragraphs (optional) if (!$leaveCustomClass) { $newPost = str_replace('<p class="' . $customClass . '"> </p>', '<p> </p>', $newPost); }
estan Posted April 22, 2015 Author Posted April 22, 2015 If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong! @AutoItScript I very very much appreciate the help! I will send you a personal message in a minute. I definitely agree to be responsible. I will do a database backup on the live site (once it is migrated to 4.0.2) and I also plan to first run the script on my test board. So, no worries
ZakRhyno Posted April 22, 2015 Posted April 22, 2015 I did a test upgrade from 3.4.7 to 4.0.1. There were few minor issues here and there but overall all looks good.The most major concern so far I have after the upgrade is the double lining/spacing on articles/posts etc. Many posts and articles, where an extra empty line or 2 was put intentionally to better separate paragraphs in 3.4.7, look with way too much extra space after the upgrade to 4.0.1. As I understand it this is because of the difference of the editors and how they handle paragraphs, new lines etc etc.This is a big concern for me. Obviously, I can not go and check one by one many thousands of posts, articles etc. and fix them manually. And they definitely do not look good now. This is maybe the only thing that prevents me to upgrade now. Anyone any ideas for me? @Charles or @bfarber or anyone else from IPS, any thoughts or suggestions here? Any way to deal with this in a smart and automatic way?What about if during upgrade there is an optional checkbox for automatic removal of extra empty lines? Or something like that? Or any way to deal with this after the upgrade? I think with bugs being found and issues still to be iron out it best to just wait till it officially and publicly release and wait to upgrade. If you upgrade before that happens your going to be a tester and will have problems and one understand that by upgrading.
Peter.B Posted June 30, 2015 Posted June 30, 2015 WARNING: there may be some cases where people had intentionally hit enter many times, but I consider the loss of those intentional spaces on a handful of posts a small price to pay for fixing the majority. Oh, and always take a backup of your database before doing anything like this:The three queries in the attached text file run in phpMyAdmin fixed it for >40k of my posts that were affected. Sorry I had to attach it as a text file but a big security error popped up when I tried typing mySQL queries into the editor here (not too surprised really). remove blank paragraphs.txt
steel51 Posted October 9, 2015 Posted October 9, 2015 On 22.4.2015, AutoItScript said: If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong! I recently updated from Vb 4 to ipb 4 and my old post are messed up. It would be very kind, can u give me the script please?
Bluto Posted November 17, 2015 Posted November 17, 2015 This happened to my board when I did an upgrade test on my local machine. I used the command line to make the changes. 1. BACKUP THE DATABASE! 2. Get into mysql by typing mysql into the prompt. You might have to enter your mysql root password here. 3. Select the database USE [database name] without the brackets 4. Enter what is below (from @Peter.B above), one at a time REPLACING "ipb_" with whatever you used as the prefix for your database. update `ipb_forums_posts` set post = replace(post,"<p> </p>",""); update `ipb_forums_posts` set post = replace(post,"<p> </p>",""); update `ipb_forums_posts` set post = replace(post,"<p></p>",""); Additional note: I found that adding extra spaces in the first one betweet <p> </p> found additional hits. I believe I went out to six spaces. Each additional space, I ran that query again.
Nathan Explosion Posted November 17, 2015 Posted November 17, 2015 You missed an important step - BACKUP THE DATABASE.
Bluto Posted November 17, 2015 Posted November 17, 2015 3 minutes ago, Nathan Explosion said: You missed an important step - BACKUP THE DATABASE. Good catch! Updated!
Teddy Rogers Posted November 21, 2015 Posted November 21, 2015 I've been doing some of these post format corrections in the database since yesterday after I updated the board to 4.1 as a lot of posts/topics are a mess. I have noticed some posts have this problem in their content... </p><br><p> </p><br><p> I tried running the following query on the post column to clear this up - without any success... update `ipb_forums_posts` set post = replace(post,"<br><p> </p><br>",""); ...it is always returning zero rows affected. The string expression to find is exactly as it is found in the database. Any ideas what I am missing here? Ted.
Bluto Posted November 21, 2015 Posted November 21, 2015 8 hours ago, Teddy Rogers said: I've been doing some of these post format corrections in the database since yesterday after I updated the board to 4.1 as a lot of posts/topics are a mess. I have noticed some posts have this problem in their content... </p><br><p> </p><br><p> I tried running the following query on the post column to clear this up - without any success... update `ipb_forums_posts` set post = replace(post,"<br><p> </p><br>",""); ...it is always returning zero rows affected. The string expression to find is exactly as it is found in the database. Any ideas what I am missing here? Ted. I have the same code in my converted board. I've been trying to replace that for about a 1/2 hour with no luck. I noticed when I was replacing an issue with the twitter embeds that I had to use the html name ">" for the ">" which I was replacing. I tried to do that with this query, but I got the same results as you.
Bluto Posted November 21, 2015 Posted November 21, 2015 One thing I noticed. If the queries for removing the <p> </p> would have been completely successful, there would be no <br><p> </p><br> to remove. They would only be <br><br>. So, I think the original queries need to be modified.
Teddy Rogers Posted November 22, 2015 Posted November 22, 2015 19 hours ago, Bluto said: If the queries for removing the <p> </p> would have been completely successful, there would be no <br><p> </p><br> to remove. Yes, I thought exactly the same after I ran that query. I'm not entirely sure why these are being missed, it seems an odd one but then SQL isn't my strong suite so it's likely to be operator error... Ted.
Teddy Rogers Posted November 22, 2015 Posted November 22, 2015 I have no idea why this is a double post? Please delete... Ted.
Bluto Posted November 22, 2015 Posted November 22, 2015 4 hours ago, Teddy Rogers said: Yes, I thought exactly the same after I ran that query. I'm not entirely sure why these are being missed, it seems an odd one but then SQL isn't my strong suite so it's likely to be operator error... Ted. I searched everywhere on Google trying to find out if the query was wrong, just about every site was the same. It did find the correct amount when using this query, but again, it didn't make any changes. update [IPB Prefix]forums_posts set post = replace(post,'<br><p> </p><br>','') where post like '%<br><p> </p><br>%'; I'm wondering if the original queries have something to do with it not making changes with the new queries.
Teddy Rogers Posted November 23, 2015 Posted November 23, 2015 The parameters look to be fine to me... https://msdn.microsoft.com/en-us/library/ms186862.aspx I wondered if it could be something to do with the character-set and have tried different variations of some characters without success... Ted.
Bluto Posted November 23, 2015 Posted November 23, 2015 IDK. I'm going to be testing my upgrade again on my local system (I haven't upgraded to v4 yet) and I run the updated query first to see if the br-p-p-br gets removed. There are also several pieces of software out there which allow someone to modify the database remotely from the software. I believe NavCat is one of them. I might end up using that to search and replace if the query doesn't work. http://www.navicat.com/products
Teddy Rogers Posted November 25, 2015 Posted November 25, 2015 Have you had any success with NavCat? Ted.
Bluto Posted November 25, 2015 Posted November 25, 2015 6 minutes ago, Teddy Rogers said: Have you had any success with NavCat? Ted. I wanted to try a few other things before I gave Navcat a try.
Teddy Rogers Posted December 17, 2015 Posted December 17, 2015 Any joy with Navcat and such? I've tried some other queries without much success so I am open to suggestions, still have a lot of posts that need some TLC... Ted.
ram108 Posted December 17, 2015 Posted December 17, 2015 Nothing helped me too. So, I am not planning to upgrade and have unreadable posts in forum.
EricT Posted December 25, 2015 Posted December 25, 2015 Hi, I ran the sql commands and many rows were affected, but on the post it looks the same than before. I think I have to recache the posts, but how I can do that ? Thank you !
Yamamura Posted July 26, 2018 Posted July 26, 2018 Okay. This is an old topic, but I didn't find a fully working solution here. This sql query is almost correct (it was mentioned earlier — #1, #2) : UPDATE `ibf_forums_posts` SET post = REPLACE(post,'<p> </p>',''); But for this sql query to work, you need to copy "<p> </ p>" from the database table. I recorded a short video for an example: In any case, don't forget to make a backup.
Recommended Posts
Archived
This topic is now archived and is closed to further replies.