Batnik Posted June 16, 2015 Posted June 16, 2015 I installed my Forums without utf8mb4, thats pain in the ass. There are several errors resulting in inconstistent Datebase-Tables, empty Threads etc.The Emojis-Fix from 4.0.8 only fixes the Emojis ...How can i reconvert Everything to utf8mb4 ???As i see there is no Converter for IPS4.
Jim M Posted June 16, 2015 Posted June 16, 2015 Sorry, there is currently no converter between UTF8's. My best suggestion at your current time is to restore your 3.x backup and start over to ensure everything is properly done.
NoGi Posted June 16, 2015 Posted June 16, 2015 I went through this process prior to my IPS4 upgrade and it was a PITA. Had to change all the ones that were missed (due to default being UTF8 instead of UTF8MB4) manually. You'll run into problems with indexes using varchar(255) which in turn will cause problems when IPS release an update that has database changes.Support have raised a bug for me so hopefully they will have a permanent fix for it soon but something to be aware off before you go changing the collation.
mesteele101 Posted June 17, 2015 Posted June 17, 2015 I don't think there was any or little notice to utf8mb4 until the problem with Emojis was brought up by IPS.I'm thinking that this may be a problem as IPB progresses, and it appears that UTF8MB4 is the standard or what everyone should be striving for.I have little experience with MySQL, so I'm not sure. I see conversion scripts for doing this, but I'm very leery of doing this without the support of IPS.Is UTF8MB4 something we should be looking into?
AutoItScript Posted June 17, 2015 Posted June 17, 2015 I don't think there was any or little notice to utf8mb4 until the problem with Emojis was brought up by IPS.I'm thinking that this may be a problem as IPB progresses, and it appears that UTF8MB4 is the standard or what everyone should be striving for.Which is kind of odd as utf8mb4 is the default if you install a fresh copy of IPS4 - it's clearly the way forward. I've converted my DB to utf8mb4 yesterday and it was a real pain to get everything right.
ABGenc Posted June 17, 2015 Posted June 17, 2015 Do we have to convert to UTF8MB4 ? It was told earlier that we could have either . During my tests I have never chosen to go for a conversion to utf8mb4 and have not faced any problem ( yet ) Now a bit confused ...
opentype Posted June 17, 2015 Posted June 17, 2015 (edited) Do we have to convert to UTF8MB4 ?No. The scenarios are:MySQL <= 5.1You cannot use UTF8MB4 at allMySQL >= 5.5UTF8. Certain characters (like certain Emoji) won’t be supported and could cause problems in earlier releases. Therefore the most recent version now warns you about this when you try to include such characters.UTF8MB4: You are free to use any characters. Edited June 17, 2015 by Ralf H. sobrenome, chilihead and ABGenc 3
ABGenc Posted June 17, 2015 Posted June 17, 2015 Thanks @Ralf H. does UTF8MB4 have any disadvantage in terms of speed, db size etc ? I mean would it be adviced to convert during upgrade from 3.4.7 to 4.0 ?
mesteele101 Posted June 17, 2015 Posted June 17, 2015 Which is kind of odd as utf8mb4 is the default if you install a fresh copy of IPS4 - it's clearly the way forward. I've converted my DB to utf8mb4 yesterday and it was a real pain to get everything right.Can you give any information about the conversion> Scripts, location of information?
mesteele101 Posted June 17, 2015 Posted June 17, 2015 I just checked my live site which was and an upgrade from 3.x, and all the tables are utf8. I just checked my development site as it was a fresh install, and there was a mixture of utf8, and utf8mb4. All tables that start with a ipf_ are utf8, and all the others are utf8mb4. Shouldn't all the tables be utf8mb4 ?
AutoItScript Posted June 17, 2015 Posted June 17, 2015 Looks like you've got two installs there. One with a ibf_ prefix.
mesteele101 Posted June 17, 2015 Posted June 17, 2015 Woh, maybe so. Looking iLooks like you've got two installs there. One with a ibf_ prefix.Maybe your right? I'm going to start a fresh install and wipe it clean, and see where I'm at.Thanks...
mesteele101 Posted June 23, 2015 Posted June 23, 2015 Sorry, there is currently no converter between UTF8's. My best suggestion at your current time is to restore your 3.x backup and start over to ensure everything is properly done.That's just not an acceptable answer for those of us that were early adopters of IPB 4. You actually want us to loose months of user data. This should have been taken care of when IPB moved from utf8_unicode_ci to utf8mb4_unicode_ci . We trust IPS to do what is right to keep their clients up to date with the latest releases. If the latest releases IPB has had a default character set change there should be a way for IPS's paid clients to benefit from that change, and advancing along with everyone else on the same playing field.It seems to me that IPS should jump ahead of this and create a conversion script for its clients. This not only fixes future problems, but will also lessen support tickets.Or better yet can we create a ticket for this. The majority of independent IPS clients are comfortable with running MySQL scripts. Kantholy, The Old Man, Batnik and 1 other 4
bradl Posted June 23, 2015 Posted June 23, 2015 I was kicking myself for not choosing mb4 when I first created my small family project (I saw "slower" and balked and did not fully consider the emoji implications for my target users) . Later I realized I couldn't have anyway since Dreamhost runs at MySQL 5.1.n So I can reasonably foresee either moving to a different host or eventually upgrading MySQL or many reasons for wanting to convert an existing community instance. Batnik 1
xtech Posted June 23, 2015 Posted June 23, 2015 The thing we have to think is: "is that so important to use emojis that they worth converting to utf8mb4?" Even some browsers cannot display them properly!If your site runs fine and you do not need emojis, do not convert to utf8mb4: there are also some disadvantages to it also, like the increased database space, inherent to the use of 4 byte character encoding. The full UTF8 implementation needs 4 bytes for some characeters, it's true. But if see the table of the 4-byte characters, i doubt you use them.In a normal UTF8 use, 99.99% of the caracters used are 1, 2 or 3 bytes.According to my experience, Utf8mb4 is a must for the fellows using Windows Server and IIS, because it will solve many character conversion problems arising from the odd php implementation on windows.
Marcher Technologies Posted June 23, 2015 Posted June 23, 2015 I don't understand the fuss. if using utf8_unicode_ci, characters using 4 bytes such as emoji will be refused by the database. if that is an issue, simply update the collation? it's not like it's a different character set, utf8mb4 is a superset of utf8 in mysql, you can go from utf8 to utf8mb4 easily, there's just no going back from utf8mb4 to utf8 without losing data once there is 4-byte data in the database. sobrenome and chilihead 2
opentype Posted June 23, 2015 Posted June 23, 2015 What Marcher said. It’s really a server-level thing that has nothing to do with the actual IPS software. If you choose to self-host, you should be able to maintain your database and set such things yourself or let it do through someone else. IPS only guarantees that 4.0 works both on UTF8 and UTF8mb4. It’s not their job to maintain your server.
AutoItScript Posted June 23, 2015 Posted June 23, 2015 (edited) Unfortunately it's not that simple. Some of the indexes are created with a length based on 3 bytes per char in mind. So when you try and change the collation on those tables you get an error about indexes being too large. To fix it you have to completely delete the indexes on that table, recreate them with a length suitable for 4 bytes per char. Then you can change the collation.I did it on my board by deleting all the indexes and then changing the collation type in conf_global.php. Then when you run the support tool it spits out the commands for recreating the indexes - even with the correct index lengths for utf8mb4. Not for the faint hearted, it took about 5 hours to recreate.Info about the index issue: https://mathiasbynens.be/notes/mysql-utf8mb4 Edited June 23, 2015 by AutoItScript sobrenome and chilihead 2
mesteele101 Posted June 23, 2015 Posted June 23, 2015 How about pulling all the tables in phpmyadmin, and working from that .sql file?Create a new databse, and then import the modified .sql back into the new database? sobrenome 1
Ausy Posted June 23, 2015 Posted June 23, 2015 Just changed all mine through phpmyadmin and didn't get any errors. Seems to be ok. sobrenome 1
AutoItScript Posted June 23, 2015 Posted June 23, 2015 (edited) Did you select the table, click Operations, and then change collation that way? If so, you've not changed the collation of all the columns in the table. That's where the fun is If you use the structure tab to show the columns, you should see the collation there too in tables that contain text fields. Edited June 23, 2015 by AutoItScript NoGi 1
Ausy Posted June 23, 2015 Posted June 23, 2015 Ah yes I see. Will do some on the test install and see.
Batnik Posted June 23, 2015 Author Posted June 23, 2015 I got something for you: # For each database: ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci; # For each table: ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # For each column: ALTER TABLE table_name CHANGE column_name column_name VARCHAR(XXXXX) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # (Don’t blindly copy-paste this! The exact statement depends on the column type, maximum length, and other properties. The above line is just an example for a `VARCHAR` column.)
Batnik Posted June 23, 2015 Author Posted June 23, 2015 i can tell you, its a lot of work, i better will wait for a converter from invision! sobrenome 1
mesteele101 Posted June 23, 2015 Posted June 23, 2015 I got something for you: # For each database: ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci; # For each table: ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # For each column: ALTER TABLE table_name CHANGE column_name column_name VARCHAR(XXXXX) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # (Don’t blindly copy-paste this! The exact statement depends on the column type, maximum length, and other properties. The above line is just an example for a `VARCHAR` column.)I saw this during my investigation online. I manually preformed the conversion, but got a index length error of some kind. I possibly faild to create the database correctly? sobrenome 1
Recommended Posts