As promised, here is my PHP script to automatically convert an IPB2 database to utf8.
It does not use IPB2-style queries, but it seems to work for me, converting most of my iso-8859-7 database to utf8.
However, it still fails in a few cases, which I will have to investigate over the next few days.
The reason is probably (as I noticed) that some text fields contain binary content (??),
which I can only guess is either already utf8-encoded chars or something I cannot identify.
I will have to write the specific cases where txt_convert_charsets fails to a file, to find out what really happened.
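A rough sketch of how the failing cases could be logged, assuming iconv as a stand-in for IPB's txt_convert_charsets and that the mbstring and iconv extensions are loaded. The function name convert_or_log and the log format are made up for illustration. The "already utf8" check is only a heuristic, since a short iso-8859-7 string can happen to be valid utf8 as well.

```php
<?php
// Hypothetical helper (not part of the attached script): convert one field
// value, or log it and return null when the result cannot be trusted.
function convert_or_log($value, $sourceCharset, $logFile)
{
    // Pure ASCII is identical in iso-8859-7 and utf8: nothing to convert.
    if (!preg_match('/[\x80-\xFF]/', $value)) {
        return $value;
    }

    // High bytes that already form valid utf8 were probably converted
    // earlier; log them as hex instead of converting a second time.
    if (mb_check_encoding($value, 'UTF-8')) {
        file_put_contents($logFile, "already-utf8\t" . bin2hex($value) . "\n", FILE_APPEND);
        return null;
    }

    // iconv() returns false (and raises a notice, silenced here) on bytes
    // that are undefined in the source charset.
    $converted = @iconv($sourceCharset, 'UTF-8', $value);
    if ($converted === false) {
        file_put_contents($logFile, "iconv-failed\t" . bin2hex($value) . "\n", FILE_APPEND);
        return null;
    }

    return $converted;
}
```

A null return means "leave this field alone and look at the log". The hex dump makes it possible to inspect the raw bytes later without the terminal or editor re-interpreting them.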
As I also explain inside the file, in my case the default charset for both the remote and the local database server was utf8, but the database itself had latin1 charset and latin1_swedish_ci collation, and the IPB charset was set to 'iso-8859-7'.
Hence the content was also displayed incorrectly in phpmyadmin.
When I got my local sql backup, the content was really utf8-encoded latin1 chars (I couldn't get anything better from backup/phpmyadmin)!
I then changed all the "CREATE TABLE" and "SET character_set_client" commands from latin1 to utf8, trying to make
ABSOLUTELY NO modification to the content of the database, before running the attached script.
- Serialization is handled quite nicely I think, with a command I found to change the string lengths (without unserializing).
- It is fully automatic - it will examine your database and convert only "char varchar text enum set tinytext mediumtext longtext" fields.
- Skips fields to be converted if they are a primary key. Any ideas on how to convert these? (small problem - I guess they do not contain utf8 strings).
- Skips conversion if the table does not have a primary key (odd, but such tables did exist in my IPB2 database - not any important ones, of course).
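For the serialization point, here is a minimal sketch of fixing the string lengths without unserializing. It is not necessarily the exact command used in the attached script: the function name fix_serialized_lengths is made up, and the closure syntax assumes PHP 5.3 or later. It also assumes that no serialized string contains the sequence `";` itself, because the lazy regex would stop there.

```php
<?php
// Hypothetical helper: recompute the byte length N in every s:N:"..."; token
// of a serialized string whose contents have already been converted to utf8.
function fix_serialized_lengths($serialized)
{
    return preg_replace_callback(
        '/s:(\d+):"(.*?)";/s',          // no /u flag, so "." matches single bytes
        function ($m) {
            // strlen() counts bytes, which is what unserialize() expects.
            return 's:' . strlen($m[2]) . ':"' . $m[2] . '";';
        },
        $serialized
    );
}

// Usage: convert the whole serialized blob first, then repair the lengths.
$old  = serialize(array('name' => "\xE1"));      // one iso-8859-7 byte (alpha)
$new  = iconv('ISO-8859-7', 'UTF-8', $old);      // alpha is now two bytes
$data = unserialize(fix_serialized_lengths($new));
```

Without the length fix, unserialize() rejects the converted string, because the declared length (1) no longer matches the two utf8 bytes. This sketch stands apart from the attached script.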
To run the script, just place it in your IPB home directory, and change the database connection details and your source charset.
You can optionally convert html entities after the utf8 conversion. It worked for me, but the opposite way from what bfarber says in the above post: html_entity_decode( $out, ENT_NOQUOTES, "utf-8" ) worked fine (php5.2.6, mysql 5.0, windows xp), while mb_convert_encoding($out, 'utf-8', 'HTML-ENTITIES') failed - I really don't know why!
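A small sketch of the html_entity_decode variant, to show what ENT_NOQUOTES does in practice. The wrapper name decode_entities_utf8 is made up for illustration; only the html_entity_decode call itself comes from the text above.

```php
<?php
// Hypothetical wrapper around the call quoted above: decode named and numeric
// entities into utf8 characters, but leave &quot; and &#039; untouched.
function decode_entities_utf8($out)
{
    return html_entity_decode($out, ENT_NOQUOTES, 'UTF-8');
}

$decoded = decode_entities_utf8('&alpha;&beta; &quot;q&quot;');
// $decoded now holds the two Greek letters as utf8 bytes, followed by
// ' &quot;q&quot;' with the quote entities still encoded.
```

Leaving the quote entities encoded matters if the stored text is later echoed inside html attributes.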
Feel free to try it, and to correct/improve it if possible, so that in the end we all manage to convert our databases to utf8! ipb2_utf8_convert.php
VERY IMPORTANT: ONLY USE THIS FOR EXPERIMENTATION AND ONLY USING A LOCAL COPY!