Ph-A Posted March 11, 2010

IP.Board 3.x has problems with the Russian letters "ш" (UTF-8 bytes 0xD1 0x88) and "И" (UTF-8 bytes 0xD0 0x98). After a message is saved to the database, these letters come back corrupted. The problem occurs only on Russian hosting.

As is known, the forum needs a UTF-8 database to work correctly. Yet even when the forum is installed with the database and tables created in utf8_unicode_ci and the forum locale set to ru_RU.UTF-8, these letters still get broken.

The peculiarity of Russian hosting is that our hosters hard-code the windows-1251 encoding in my.cnf. A typical configuration:

    [client]
    default-character-set=cp1251

    [mysqld]
    default-character-set=cp1251
    default-collation=cp1251_general_ci
    init-connect="SET NAMES cp1251"
    skip-character-set-client-handshake

If we run the query SHOW VARIABLES LIKE 'character_set%', we get this result:

    character_set_client      utf8
    character_set_connection  utf8
    character_set_database    utf8
    character_set_filesystem  binary
    character_set_results     utf8
    character_set_server      cp1251
    character_set_system      utf8
    character_sets_dir        /usr/share/mysql/charsets/

On shared virtual hosting there is no access to edit my.cnf, but character_set_server = cp1251 does not allow the forum to work correctly with Unicode Cyrillic. Is it possible to make the distribution work correctly regardless of the hosting settings?

I know of no other application with as many failures of this kind as the IP.Board forum. The Russian localization from ibresource fixes these problems. But what about those who buy a license directly from IPS?
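The SHOW VARIABLES output quoted above can be checked mechanically. The following is a minimal PHP sketch, not IP.Board code: findCharsetMismatches is a hypothetical helper, and the list of ignored variables is an assumption about which ones are unrelated to the connection charset.

```php
<?php
// Hypothetical helper (not part of IP.Board): given name => value pairs from
// SHOW VARIABLES LIKE 'character_set%', return the variables whose value
// differs from the charset the forum expects.
function findCharsetMismatches(array $variables, string $expected = 'utf8'): array
{
    // Assumption: these variables are not expected to match the connection charset.
    $ignored = ['character_set_filesystem', 'character_set_system', 'character_sets_dir'];

    $mismatches = [];
    foreach ($variables as $name => $value) {
        if (in_array($name, $ignored, true)) {
            continue;
        }
        if (strcasecmp($value, $expected) !== 0) {
            $mismatches[$name] = $value;
        }
    }
    return $mismatches;
}

// Values copied from the output quoted in the post above.
$reported = [
    'character_set_client'     => 'utf8',
    'character_set_connection' => 'utf8',
    'character_set_database'   => 'utf8',
    'character_set_filesystem' => 'binary',
    'character_set_results'    => 'utf8',
    'character_set_server'     => 'cp1251',
    'character_set_system'     => 'utf8',
    'character_sets_dir'       => '/usr/share/mysql/charsets/',
];

print_r(findCharsetMismatches($reported));
```

For the values in the post, only character_set_server is reported, which matches the poster's observation that the server default is the odd one out.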
bfarber Posted March 11, 2010

If ibresource has fixed these problems, you might ask them to relay to us what they did to overcome the problems you're referring to. As I'm sure you can appreciate, we're not intimately familiar with the peculiarities of Russian hosting setups, so I'm not really sure what the problem is just from your description.
IPBSkins Posted March 11, 2010

I cannot update my forum because of this problem. To apply ibresource's fix, I would have to buy a license again, this time from them, and then updates from IPS would no longer be possible.
phpony Posted March 11, 2010

Ph-A, I'm a proud owner of the Russian IP.Board script. They made two subject-related changes in the scripts. In the files /ips_kernel/classDbMysqlClient.php and /ips_kernel/classDbMysqliClient.php they extended this block of code:

    //-----------------------------------------
    // If there's a charset set, run it
    //-----------------------------------------
    if( $this->obj['sql_charset'] )
    {
        $this->query( "SET NAMES '{$this->obj['sql_charset']}'" );
    }

into these lines for mysqli:

    //-----------------------------------------
    // If there's a charset set, run it
    //-----------------------------------------
    if( $this->obj['sql_charset'] )
    {
        $this->query( "SET NAMES '{$this->obj['sql_charset']}'" );
        $this->query( "SET CHARACTER SET '{$this->obj['sql_charset']}'" );
        $this->query( "SET character_set_connection = " . $this->obj['sql_charset'] );
        // Look up the default collation of the configured charset
        $res = mysqli_query( $this->connection_id, "SHOW CHARSET LIKE '" . $this->obj['sql_charset'] . "'" );
        $charset = mysqli_fetch_row( $res );
        $this->query( "SET collation_connection = " . $charset[2] );
    }

and these lines for mysql:

    //-----------------------------------------
    // If there's a charset set, run it
    //-----------------------------------------
    if( $this->obj['sql_charset'] )
    {
        $this->query( "SET NAMES '{$this->obj['sql_charset']}'" );
        $this->query( "SET CHARACTER SET '{$this->obj['sql_charset']}'" );
        $this->query( "SET character_set_connection = " . $this->obj['sql_charset'] );
        // mysql_query() takes the query first and the link identifier second
        $res = mysql_query( "SHOW CHARSET LIKE '" . $this->obj['sql_charset'] . "'", $this->connection_id );
        $charset = mysql_fetch_row( $res );
        $this->query( "SET collation_connection = " . $charset[2] );
    }
phpony Posted March 11, 2010

By the way, this code contains a possible error: on some servers the query "SHOW CHARSET LIKE ..." returns nothing, and then accessing $charset[2] gives you an error message at the top of every forum page. That's the Russian way of coding :D

The better way is to use this line:

    if( $charset = mysqli_fetch_row( $res ) ) $this->query( "SET collation_connection = " . $charset[2] );

instead of these two:

    $charset = mysqli_fetch_row( $res );
    $this->query( "SET collation_connection = " . $charset[2] );

And the same for classDbMysqlClient.php (with mysql_fetch_row).
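The guard described above can also be pulled out into a small function. This is only a sketch under stated assumptions: collationStatement is a hypothetical name, not an IP.Board method, and it relies on the default collation being the third column of the SHOW CHARSET result, as the code in this thread does.

```php
<?php
// Hypothetical helper, not IP.Board code: build the collation statement from
// one row of SHOW CHARSET, or return null when the server returned no row.
function collationStatement($row): ?string
{
    // SHOW CHARSET columns: Charset, Description, Default collation, Maxlen.
    if (!is_array($row) || empty($row[2])) {
        return null;
    }
    // Accept only plain identifiers, since the value is concatenated into SQL.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $row[2])) {
        return null;
    }
    return 'SET collation_connection = ' . $row[2];
}

// A row as the server might return it for utf8.
var_dump(collationStatement(['utf8', 'UTF-8 Unicode', 'utf8_general_ci', '3']));

// mysqli_fetch_row() returns null when there is no row to fetch.
var_dump(collationStatement(null));
```

The caller would run the returned statement only when it is not null, so an empty SHOW CHARSET result no longer produces a notice on every page.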
Ph-A (Author) Posted March 11, 2010

> If ibresource has fixed these problems, you might ask them to relay to us what they did to overcome the problems you're referring to. As I'm sure you can appreciate, we're not intimately familiar with the peculiarities of Russian hosting setups, so I'm not really sure what the problem is just from your description.

The answer from ibresource: owners of the Russian version will receive the fix automatically. If you want to use the original distribution, you have to ask for the fix in the IPS Resources client area.
Ph-A (Author) Posted March 11, 2010

> Ph-A, I'm [s]a proud[/s] owner of Russian IP.Board script. They made two subject-related changes in scripts.

I am an owner of the Russian version of the license too. And I
phpony Posted March 11, 2010

> The problem with letters on a virtual hosting remains.

There are only two places where data can be lost: the MySQL connection collation and the option to "remove chr(0xCA) from input" ;) The second one is fully described here: http://forums.ibresource.ru/index.php?showtopic=51483

This is the best part of the whole topic:

> The dumb Americans strip their invisible spaces, which do not exist in our code table! In their place we have the letters К and р. How many times does this have to be repeated?

"К" and "р" apply to cp-1251; with utf-8 we have combinations of characters, probably including "ш" and "И" too. Just set this option to "off" and check the forum ;)
IPBSkins Posted March 11, 2010

How can I update my forum without problems? :(
phpony Posted March 11, 2010

I've updated mine without any problems. Please feel free to contact me, especially since we seem to have known each other for a long time ;)
IPBSkins Posted March 11, 2010

This problem must be solved for all Russian customers of IPS, not just for one or two people in private.
bfarber Posted March 12, 2010

We are not aware of any specific bugs having to do with Russian letters. If there ARE issues, someone will need to collect the details and submit a bug report, or submit a ticket in the client area so that we can investigate. We can't fix a problem we don't have the details of, I'm afraid.

You guys are talking like this is some issue that's long been communicated to us and we should have fixed it by now, but honestly I've never heard of random Russian characters disappearing before this thread was opened this afternoon.

ш И

As you can see, the characters post fine; it's not an IPB issue here. Thus, we need more information as to the cause of the problem. Hence why I suggested having ibresource contact us (as I would presume they are familiar with the problem and the solution they have implemented).
IPBSkins Posted March 12, 2010

This issue was discussed several times in the tracker over the summer, but it has not been resolved. In about a week I'll submit a ticket with access to my hosting, so that experiments can be run there. Thank you for your time.
Ph-A (Author) Posted March 12, 2010

> There are only two places where data can be lost: the MySQL connection collation and the option to "remove chr(0xCA) from input" ;)

This doesn't fix the problem thoroughly. There are several hosters where the glitch remains, e.g. Zenon and Sweb (the ones I've tested). Today I deliberately installed a test forum with the corrected files classDbMysqlClient.php and classDbMysqliClient.php, but the problem remains. I can give access to the forum, FTP and phpMyAdmin.
LastDragon Posted April 1, 2010

>     $this->query( "SET NAMES '{$this->obj['sql_charset']}'" );
>     $this->query( "SET CHARACTER SET '{$this->obj['sql_charset']}'" );
>     $this->query( "SET character_set_connection = " . $this->obj['sql_charset'] );
>     $res = mysql_query( "SHOW CHARSET LIKE '" . $this->obj['sql_charset'] . "'", $this->connection_id );
>     $charset = mysql_fetch_row( $res );
>     $this->query( "SET collation_connection = " . $charset[2] );

See http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html (or the same page for older versions):

> A SET NAMES 'x' statement is equivalent to these three statements:
>
>     SET character_set_client = x;
>     SET character_set_results = x;
>     SET character_set_connection = x;
>
> Setting each of these character set variables also sets its corresponding collation variable to the default collation for the character set.

And:

> A SET CHARACTER SET x statement is equivalent to these three statements:
>
>     SET character_set_client = x;
>     SET character_set_results = x;
>     SET collation_connection = @@collation_database;
>
> Setting collation_connection also sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is not necessary to set character_set_connection explicitly.

Thus you can reduce this code to:

    $this->query( "SET NAMES '{$this->obj['sql_charset']}'" );
    $this->query( "SET CHARACTER SET '{$this->obj['sql_charset']}'" );
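The reduced version above still places the configured charset name directly into SQL. A minimal sketch of the same two statements with a sanity check on the name follows; buildCharsetQueries is a hypothetical helper rather than part of the IP.Board kernel, and the identifier pattern is an assumption about what a valid MySQL charset name looks like.

```php
<?php
// Hypothetical helper, not IP.Board code: return the two statements from the
// reduced version above, after checking that the charset name is a plain
// identifier (it is interpolated into SQL).
function buildCharsetQueries(string $charset): array
{
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Unexpected charset name: {$charset}");
    }
    return [
        "SET NAMES '{$charset}'",
        "SET CHARACTER SET '{$charset}'",
    ];
}

foreach (buildCharsetQueries('utf8') as $sql) {
    echo $sql, PHP_EOL;
}
```

In the driver classes, each returned string would be passed to $this->query() in order, exactly as in the two-line version quoted above.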
Ilia Goranov Posted October 18, 2010

BTW, it took me some time to find it: in IPB 3.0.x and newer you can just add this to conf_global.php:

    $INFO['sql_charset'] = 'utf8';
Mikhail Posted January 8, 2011

> BTW, it took me some time to find it: in IPB 3.0.x and newer you can just add this to conf_global.php:
>
>     $INFO['sql_charset'] = 'utf8';

Ilia, thank you so much for the solution!
KVentz Posted January 8, 2011

> We are not aware of any specific bugs having to do with Russian letters. If there ARE issues, someone will need to collect the details and submit a bug report, or submit a ticket in the client area so that we can investigate.

Such bugs always appear when there is recoding between UTF-8 and one of the legacy Cyrillic encodings (windows-1251, koi-8 and so on). Different encodings can be in use in the database storage, the database collation, the database output, the PHP output, the client input and the AJAX processing. If at least one of them uses UTF-8 while another uses a legacy encoding, there is always a problem, because the recoding between UTF-8 and legacy encodings is full of bugs. Some people have problems with ш and И, some have a problem with the Russian capital "К", which just disappears from messages, and some get a mess of question marks and special characters instead of the text. Things get much worse if someone uses special characters from the extended part of the cp-1251 table: I have never seen them recoded correctly on the web.

This is a nightmare for programmers. There were problems with UTF-8 when most people used win-1251 and AJAX arrived: AJAX uses only UTF-8, while all the data was stored in win-1251. There were further problems when people migrated from MySQL 3 to 4, because the MySQL developers changed the handling of encodings dramatically. Now there are problems when we migrate from legacy encodings to UTF-8: correctly recoding a MySQL database from windows-1251 to utf-8 is a kind of magic. One more problem: there is a special PHP function, convert_cyr_string (thanks to the Russian PHP core developers), which supports recoding between koi8-r, windows-1251, iso8859-5, x-cp866 and x-mac-cyrillic. But it does not support utf-8!

The only way to solve the encoding problem completely is to set utf-8 absolutely everywhere: as the MySQL default encoding, as the encoding of the database, each table and each field, in the MySQL collation, in the PHP sources and in the HTML pages.

And by the way, there is a special instruction from IBR on how to migrate from windows-1251-based IPB 2 to UTF-8-based IPB 3. There are some screenshots at the end. The first question mark, at the top left, is in the place of the Russian И. The other ones are in the place of the Russian ш. And there is this text below them:

> There can be two reasons:
> 1. You forgot to edit conf_global.php as instructed in step 1. Try to edit it and, if that doesn't help, carefully repeat the update.
> 2. You have converted to utf-8 a database which is in utf-8 already. Start updating from step 2 and do not convert the DB, just change the SQL queries.

This is a very old manual that appeared right when IPB 3 came out. So the only thing people should do is read the manuals carefully. :)
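The second reason in the quoted manual (converting a database that is already in utf-8) can be illustrated in a few lines. This is a sketch that assumes PHP's iconv extension is available and accepts the charset name CP1251; it works on a two-letter string rather than on a real database, and it is not the IBR migration procedure itself.

```php
<?php
// "ш" and "И" as stored in windows-1251: one byte each.
$cp1251 = "\xF8\xC8";

// Correct migration: convert exactly once.
$utf8 = iconv('CP1251', 'UTF-8', $cp1251);
echo bin2hex($utf8), PHP_EOL; // d188d098, the bytes quoted in the first post

// Mistake: treat the UTF-8 bytes as windows-1251 and convert them again.
// The @ hides the notice that iconv may raise for bytes it cannot map.
$twice = @iconv('CP1251', 'UTF-8', $utf8);

if ($twice === false) {
    echo "second pass failed on this iconv implementation", PHP_EOL;
} else {
    echo "second pass produced different bytes: ", bin2hex($twice), PHP_EOL;
}
```

Either way, the second pass no longer yields the original letters, which is why the manual says to skip the conversion step when the data is already utf-8.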
bfarber Posted January 10, 2011

Yes, the items you describe are almost entirely out of our control. It is important to use character sets appropriate for your site. For instance, if you use windows-1251 that's fine, but you must use it everywhere. Same with UTF-8.
KVentz Posted January 10, 2011

> if you use windows-1251 that's fine

Not really. You can enter some Western European characters with diacritics and they will be shown fine (they are usually stored as HTML entities) until you try to quick-edit the post using AJAX. You will lose these characters, because they will be converted into Cyrillic ones: the Cyrillic characters in windows-1251 occupy the same places as the Western characters with diacritics (the same extended codes 128-255). Unicode, which UTF-8 encodes, has more than a million code points and does not have this problem at all: you can mix Latin, Cyrillic and Chinese in one post without any problems.
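The overlap described above is easy to see on a single byte. The sketch below assumes PHP's iconv extension with support for the names ISO-8859-1 and CP1251; the byte 0xE9 is just one example chosen for illustration.

```php
<?php
// One byte from the upper half of the table.
$byte = "\xE9";

// Read as ISO-8859-1 it is "é" (U+00E9); read as windows-1251 it is "й" (U+0439).
$asLatin1   = iconv('ISO-8859-1', 'UTF-8', $byte);
$asCyrillic = iconv('CP1251', 'UTF-8', $byte);

echo bin2hex($asLatin1), ' ', bin2hex($asCyrillic), PHP_EOL; // c3a9 d0b9

// windows-1251 has no code for "é", so converting it back cannot succeed
// losslessly (the exact result depends on the iconv implementation).
$back = @iconv('UTF-8', 'CP1251', $asLatin1);
var_dump($back);
```

In UTF-8 the two letters have different byte sequences, so they can sit in the same post without colliding.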
bfarber Posted January 12, 2011

> Not really. You can enter some Western European characters with diacritics and they will be shown fine (they are usually stored as HTML entities) until you try to quick-edit the post using AJAX. You will lose these characters, because they will be converted into Cyrillic ones: the Cyrillic characters in windows-1251 occupy the same places as the Western characters with diacritics (the same extended codes 128-255). Unicode, which UTF-8 encodes, has more than a million code points and does not have this problem at all: you can mix Latin, Cyrillic and Chinese in one post without any problems.

We have a setting in the ACP to disable AJAX functions which can be used in this case (the setting is there solely for character sets that cannot easily translate between UTF-8 and the charset in use, or for when sites do not wish for the content to be converted to HTML entities). This should be a suitable workaround for the issue, and is built into the software.

Nevertheless, we cannot do much about irregularities in individual character sets out there. We recommend using UTF-8 if possible, and this is the default character set on IP.Board 3 and above.
KVentz Posted January 13, 2011

> We recommend using UTF-8 if possible,

So do I. :) And IBR said that anyone who wants to upgrade to IP.Board 3 has to convert everything to utf-8, just to avoid any trouble with encodings. I'm happy that all these problems with different Cyrillic encodings have finally ended!
Basil555 Posted March 9, 2011

I would suggest making "UTF8" the default character set and collation at the table level for Invision Power Board, as it is for WordPress and other CMS engines. It is a "modern" and stable approach. The issue from this topic appeared on my board when I backed it up and restored it (Win1251 is used for the DB at the moment).
Archived
This topic is now archived and is closed to further replies.