SEO not working anymore on Latin characters


Lauren3

Since 3.4.x, SEO URLs are no longer converted properly when the title contains accented Latin characters. For example:

Title :

Les universités québécoises débarquent en France !

is converted to :

.../topic/122523-les-universits-qubcoises-dbarquent-en-france/

instead of :

.../topic/122523-les-universites-quebecoises-debarquent-en-france/

It worked on previous versions of IPB.
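The two URLs above differ only in whether accented letters are dropped or mapped to a base letter before the slug is built. A minimal sketch of both behaviours, assuming a UTF-8 source file; `slugStrip()`, `slugTransliterate()` and the small French table are hypothetical helpers for illustration, not IPB code:

```php
<?php
// Hypothetical helper: drops every non-ASCII byte, then builds the slug.
function slugStrip(string $title): string
{
    $ascii = preg_replace('/[^\x00-\x7F]/', '', $title);  // "é" simply vanishes
    $ascii = strtolower($ascii);
    $ascii = preg_replace('/[^a-z0-9]+/', '-', $ascii);   // runs of other chars become "-"
    return trim($ascii, '-');
}

// Hypothetical helper: maps accented letters to ASCII first, then reuses slugStrip().
function slugTransliterate(string $title): string
{
    // Illustrative subset of French accents only.
    $map = ['é' => 'e', 'è' => 'e', 'ê' => 'e', 'à' => 'a', 'ç' => 'c', 'ô' => 'o', 'ù' => 'u'];
    return slugStrip(strtr($title, $map));
}

$title = 'Les universités québécoises débarquent en France !';
echo slugStrip($title), "\n";          // les-universits-qubcoises-dbarquent-en-france
echo slugTransliterate($title), "\n";  // les-universites-quebecoises-debarquent-en-france
```

The first output matches the 3.4.x behaviour reported here, the second the expected one.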

Link to comment
Share on other sites

  • Management

It's not a bug.

We no longer transliterate the URL if you're not using UTF-8. This is because a character like å may be 'ae' in one language but 'ao' in another, so our default transliteration was incorrect for a lot of people in 3.3.

If you're using UTF-8 then the URLs don't need to be transliterated and can be used as normal.

Link to comment
Share on other sites

Links to URLs that contain UTF-8 characters look awful when copied from the address bar and pasted somewhere else. See these examples:

http://de.wikipedia.org/wiki/Universit%C3%A4t_Z%C3%BCrich
http://ru.wikipedia.org/wiki/%D0%A6%D1%8E%D1%80%D0%B8%D1%85%D1%81%D0%BA%D0%B8%D0%B9_%D1%83%D0%BD%D0%B8%D0%B2%D0%B5%D1%80%D1%81%D0%B8%D1%82%D0%B5%D1%82
http://el.wikipedia.org/wiki/%CE%A0%CE%B1%CE%BD%CE%B5%CF%80%CE%B9%CF%83%CF%84%CE%AE%CE%BC%CE%B9%CE%BF_%CF%84%CE%B7%CF%82_%CE%96%CF%85%CF%81%CE%AF%CF%87%CE%B7%CF%82

http://he.wikipedia.org/wiki/%D7%90%D7%95%D7%A0%D7%99%D7%91%D7%A8%D7%A1%D7%99%D7%98%D7%AA_%D7%A6%D7%99%D7%A8%D7%99%D7%9A

1. It looks like spam.

2. A lot of services limit the length of a URL, so you cannot paste this messy line there. UTF-8 in URLs is not always accepted either.

3. Nobody can read this line. Can you tell what it is about before clicking on it?

4. Nobody can remember the URL and simply pass it on to another person.
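The long `%XX` sequences come from the way UTF-8 bytes are escaped in a URL. A minimal sketch, assuming PHP and a source file saved as UTF-8; the string is just the page name from the first Wikipedia link above:

```php
<?php
// Each non-ASCII character is sent as its UTF-8 bytes, and each byte becomes %XX.
$title   = 'Universität_Zürich';
$encoded = rawurlencode($title);

echo $encoded, "\n";          // Universit%C3%A4t_Z%C3%BCrich
echo strlen($encoded), "\n";  // 28
```

Every two-byte character such as "ä" turns into six ASCII characters, which is why Cyrillic, Greek or Hebrew titles grow so much when copied.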

There are so-called transliteration (ISO) tables for other languages that can be used to convert any language to Latin characters. It takes just one small, very simple function that contains a single associative array. For Russian and Ukrainian it looks like this:

    /**
     * Transliterate Russian and Ukrainian Cyrillic to Latin for SEO URLs.
     * The file must be saved as UTF-8 so the array keys match the input.
     */
    public static function translit($str) {
        $tr = array(
            // Russian letters, uppercase then lowercase
            "А" => "a", "Б" => "b", "В" => "v", "Г" => "g",
            "Д" => "d", "Е" => "e", "Ё" => "e", "Ж" => "zh", "З" => "z", "И" => "i",
            "Й" => "j", "К" => "k", "Л" => "l", "М" => "m", "Н" => "n",
            "О" => "o", "П" => "p", "Р" => "r", "С" => "s", "Т" => "t",
            "У" => "u", "Ф" => "f", "Х" => "h", "Ц" => "ts", "Ч" => "ch",
            "Ш" => "sh", "Щ" => "sch", "Ъ" => "", "Ы" => "y", "Ь" => "",
            "Э" => "e", "Ю" => "yu", "Я" => "ya", "а" => "a", "б" => "b",
            "в" => "v", "г" => "g", "д" => "d", "е" => "e", "ё" => "e", "ж" => "zh",
            "з" => "z", "и" => "i", "й" => "j", "к" => "k", "л" => "l",
            "м" => "m", "н" => "n", "о" => "o", "п" => "p", "р" => "r",
            "с" => "s", "т" => "t", "у" => "u", "ф" => "f", "х" => "h",
            "ц" => "ts", "ч" => "ch", "ш" => "sh", "щ" => "sch", "ъ" => "",
            // Hard and soft signs and the guillemets are dropped entirely
            "ы" => "y", "ь" => "", "э" => "e", "ю" => "yu", "я" => "ya", "«" => "", "»" => "",
            // Ukrainian-specific letters
            "І" => "I", "Є" => "E", "Ї" => "I", "Ґ" => "G",
            "і" => "i", "є" => "e", "ї" => "i", "ґ" => "g"
        );
        // strtr() with an array replaces every key it finds in a single pass
        return strtr($str, $tr);
    }

Calling it in core.php, inside the function makeSeoTitle(), like this:

    $text = self::translit($text);

does the trick. Nothing more is needed. It takes just 1-2 minutes to get clean, readable URLs.
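To see the effect end to end, here is a minimal sketch. It assumes a trimmed copy of the table (only the letters needed for the sample title) and a hypothetical stand-in, `demoSeoTitle()`, for the slug step; the real makeSeoTitle() in IPB is not reproduced here and very likely does more than this.

```php
<?php
// Trimmed, illustrative subset of the Russian table from the post above.
function translitSubset(string $str): string
{
    $tr = [
        'Ц' => 'ts', 'ю' => 'yu', 'р' => 'r', 'и' => 'i', 'х' => 'h',
        'с' => 's', 'к' => 'k', 'й' => 'j', 'у' => 'u', 'н' => 'n',
        'в' => 'v', 'е' => 'e', 'т' => 't',
    ];
    return strtr($str, $tr);
}

// Hypothetical stand-in for the slug step; NOT the actual IPB makeSeoTitle().
function demoSeoTitle(string $text): string
{
    $text = translitSubset($text);                     // transliterate first
    $text = strtolower($text);                         // input is plain ASCII by now
    $text = preg_replace('/[^a-z0-9]+/', '-', $text);  // spaces and punctuation become "-"
    return trim($text, '-');
}

echo demoSeoTitle('Цюрихский университет'), "\n"; // tsyurihskij-universitet
```

Compare that with the percent-encoded Russian Wikipedia link earlier in this post, which names the same university.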

Most free open source applications have these ISO transliterations implemented. The reason is that there are a lot of non-English speakers among their developers, and they understand that UTF-8 in URLs does not work well for non-English users. I believe this is ignored by IPS because they are all English speakers only ;)

Link to comment
Share on other sites

Not ignored.

But you actually just proved the point Matt made above....

That is your language's transliteration.

That said.... Matt... why not just stick the table into the language pack like the __stopwords__, versus kludging it into a core file hardcoded in a static class?
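A rough sketch of what a per-language table could look like, assuming the tables are loaded from each language pack. The array layout, the language keys and `translitFor()` are all hypothetical and do not describe how IPB actually stores `__stopwords__`:

```php
<?php
// Hypothetical per-language tables; in practice each would ship with its language pack.
$translitTables = [
    'de' => ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss'],  // German convention
    'tr' => ['ö' => 'o', 'ü' => 'u', 'ç' => 'c', 'ş' => 's'],      // Turkish convention
];

// Applies the table of the given language; unknown languages leave the text untouched.
function translitFor(string $lang, string $text, array $tables): string
{
    return strtr($text, $tables[$lang] ?? []);
}

echo translitFor('de', 'Zürich', $translitTables), "\n"; // Zuerich
echo translitFor('tr', 'Zürich', $translitTables), "\n"; // Zurich
echo translitFor('xx', 'Zürich', $translitTables), "\n"; // Zürich (no table, unchanged)
```

The same letter gets a different result per language, which is the point Matt made, and each community could correct its own table without touching core files.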

Link to comment
Share on other sites

Oh no, this is not what I meant. Of course this cannot be placed in the core files. I just wanted to show how easily the SEO URL issue can be solved for each language. I mentioned the issue more than a year ago and suggested this solution. It works fine for me on a huge community. If Matt is talking somewhere about implementing transliteration in the language files, that would be great. This is the only patch I always use on all my Russian and German projects, and I would be very glad to get rid of patches. :unsure:

Link to comment
Share on other sites

I was not actually saying anything about your post with that last sentence.... it *was* actually kludged into a core file, hardcoded in a static class >=3.3. :smile:

Personally, I think that was the cause of many bug reports. If users could fix it for their own language without repeating file edits time and again, I doubt anyone would bother reporting a bug when it is wrong for their language; they would just modify it for the language and be done.

Link to comment
Share on other sites

Anyway, again Matt & Marcher are ridiculing the customers.

Either IPS does what we ask (transliterate) or it will simply be the one and only board not transliterating LATIN1.

Please.... please... please tell me where I am ridiculing anyone?

I simply provided information..... I can certainly stop doing that at all if it offends.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
