We are an American company doing business worldwide, and we have a lot of members coming on board in Russia, Estonia, Latvia, Africa and many other places. They are all asking us to translate the web pages, so I programmed a translation tool and gave access to the people we have doing the translation. They see something like this when they log in:

Language tag name: main_page_title
Current Value: (text area web form:)[value]

{2 line breaks}
Current English Value: [value here so they know what to translate]
{2 line breaks}

Save button

That works perfectly for any language that uses the Latin character set. However, for languages using the Cyrillic character set, currently only Russian, I am having this issue:

When it saves to the MySQL table, the text is changed to HTML-encoded entities, such as this: & #1053;& #1072;& #1078;

Entered in textarea web form:

Нажмите здесь, если совпадает с вашей биллинг адресом вашего адреса отгрузки

Value seen in table (I had to put a blank space after the & and before the # sign, otherwise this page converted it BACK to Cyrillic):

& #1053;& #1072;& #1078;& #1084;& #1080;& #1090;& #1077; & #1079;& #1076;& #1077;& #1089;& #1100;, & #1077;& #1089;& #1083;& #1080; & #1089;& #1086;& #1074;& #1087;& #1072;& #1076;& #1072;& #1077;& #1090; & #1089; & #1074;& #1072;& #1096;& #1077;& #1081; & #1073;& #1080;& #1083;& #1083;& #1080;& #1085;& #1075; & #1072;& #1076;& #1088;& #1077;& #1089;& #1086;& #1084; & #1074;& #1072;& #1096;& #1077;& #1075;& #1086; & #1072;& #1076;& #1088;& #1077;& #1089;& #1072; & #1086;& #1090;& #1075;& #1088;& #1091;& #1079;& #1082;& #1080;

If I go to phpMyAdmin, edit the row manually, put the Cyrillic text in as above and save it, it does save properly. However, when the Perl gets it and displays it anywhere, even in a dump, all it has is this:

??????? ?????, ???? ????????? ? ????? ??????? ??????? ?????? ?????? ????????
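One plausible source of the question marks, offered as an assumption rather than a confirmed diagnosis: the text is stored correctly, but the database connection uses a charset such as Latin-1 that has no Cyrillic, so every unmappable character is replaced with "?" on the way out. The sketch below does not touch MySQL at all; it only uses Perl's core Encode module to imitate that kind of lossy conversion.

```perl
use strict;
use warnings;
use utf8;                    # the Cyrillic literal below is UTF-8 source text
use Encode qw(encode);

# Assumption: the connection charset is Latin-1, which has no Cyrillic.
# Encode substitutes "?" for characters the target encoding cannot hold,
# which imitates the lossy conversion without involving MySQL.
my $text  = "Нажмите здесь";
my $lossy = encode('iso-8859-1', $text);

print "$lossy\n";
```

The space survives and each Cyrillic letter collapses to a single "?", the same pattern as the dump above. If this is the cause, the data in the table is intact and only the connection settings need to change.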

Can someone tell me how to get this working properly?

First, how can I get my text area form to not encode the characters into &#someNumber; so they save properly into the table? Then, how do I get Perl to not see the text only as ???? characters?
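On the first point, browsers are known to substitute &#NNNN; references for characters that the form's submission charset cannot represent, so one possibility (an assumption here, not something verified on this server) is that the page is effectively served as a Latin charset by the HTTP header despite the meta tag. Rows already saved as entities can be converted back with a small helper. `decode_numeric_entities` is a hypothetical name made up for this sketch, and it handles only the decimal form shown in the table.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn decimal references such as &#1053; back into
# the characters they name. Named entities like &amp; are left untouched.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

my $stored  = '&#1053;&#1072;&#1078;';           # first three references from the saved value
my $decoded = decode_numeric_entities($stored);  # now a three-character string

# Encode explicitly on output instead of relying on a global binmode.
print encode('UTF-8', $decoded), "\n";
```

This only repairs data after the fact. Preventing the entities in the first place would mean making sure the form page really is delivered and submitted as UTF-8.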

Here is what I have tried that does not work.
I made sure that when I change the language to Russian, I set this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-language" content="RU">

I did this to the Perl code:

binmode(STDOUT, ":utf8");

That did not work, so I undid it.
It changed every accented character in the Latin set to a capital A with an accent. It was very odd, and it did not fix anything.
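The capital A with an accent is the usual signature of double encoding: bytes that are already UTF-8 get treated as Latin-1 characters and are encoded a second time by the :utf8 output layer. That this is what happened here is an assumption. The sketch below reproduces the effect with Encode alone, using a hard-coded two-byte "é" in place of a real database value.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "é" as the two UTF-8 bytes a database handle might return undecoded.
my $bytes = "\xC3\xA9";

# Treated as characters, those bytes are U+00C3 and U+00A9, so encoding
# them again yields four bytes that a browser displays as "Ã©".
my $double = encode('UTF-8', $bytes);

# Decoding first gives the single character U+00E9, which encodes back
# to the original two bytes.
my $single = encode('UTF-8', decode('UTF-8', $bytes));

printf "double-encoded: %d bytes, decoded first: %d bytes\n",
    length($double), length($single);
```

If this matches, the output layer itself is fine; the input from the database simply has to be decoded (or fetched as character data) before it is printed through that layer.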

Then I tried this:

use open ':utf8';
use utf8;
use Encode;
binmode(STDOUT, ":utf8");

I have tried every suggestion I have read and still could not find a fix. Even the Perl Monks have had no more advice, and I tried everything they suggested too.

I am getting desperate as the members are getting anxious about it.

Please let me know if you know of a solution. I have no idea if this is a MySQL-only issue, a Perl-only issue, a server-only issue, or some combination of the three.

There MUST be a way to get this to work. I see other websites whose translation pages handle Russian and Japanese as well as the Latin-set languages.

Thank you in advance for any advice you have even if it is something small you can think of.

Richard Jones

I have been working on it, toying with different ideas, and I have come up with an idea as to what is causing the ??? marks.

When I run this in my Perl, just before I run a query:

$dbh->do(qq~set CHARACTER SET cp1251~);

Then it changes from ? marks to this:

Value in database: Нажмите здесь, если совпадает с вашей биллинг адресом вашего адреса отгрузки
Value displayed on page: Ãà æìèòå çäåñü, åñëè ñîâïà äà åò ñ âà øåé áèëëèíã à äðåñîì âà øåãî à äðåñà îòãðóçêè

According to this page:
http://www.collation-charts.org/mysql60/mysql604.cp1251_general_ci.html
cp1251 does contain those characters, so I am not sure why that does not work. If I set the character set to cp1251_general_ci, I still see this:

??????? ?????, ???? ????????? ? ????? ??????? ??????? ?????? ?????? ????????

Very curious to me. I am not sure why that is happening.

So that does change it, it does not fix it, but it changes it.
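The accented-Latin output is consistent with correct cp1251 bytes arriving from the database and then being read as if they were Latin-1. That reading is an assumption about what the page does with the bytes. A short sketch with Encode shows the same misreading on the first two words of the stored value.

```perl
use strict;
use warnings;
use utf8;                    # the Cyrillic literal below is UTF-8 source text
use Encode qw(encode decode);

my $text    = "Нажмите здесь";
my $cp1251  = encode('cp1251', $text);          # bytes as a cp1251 connection would send them
my $misread = decode('iso-8859-1', $cp1251);    # the same bytes read as Latin-1
my $correct = decode('cp1251', $cp1251);        # the same bytes read as cp1251

binmode(STDOUT, ':encoding(UTF-8)');
print "misread: $misread\n";
print "correct: $correct\n";
```

The misread line contains the same "çäåñü" run that appears on the page, which suggests the bytes only need to be decoded as cp1251 (or requested as UTF-8 from the start) before they are printed.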
If I set the character set to koi8r, it changes:

Нажмите здесь, если совпадает с вашей биллинг адресом вашего адреса отгрузки

to:

îÃÖÃÉÔÅ ÚÄÅÓØ, ÅÓÌÉ ÓÃ×ÃÃÄÃÅÔ Ó ×ÃÛÅÊ ÂÉÌÌÉÎÇ ÃÄÒÅÓÃà ×ÃÛÅÇà ÃÄÒÅÓà ÃÔÇÒÕÚËÉ

So that also changes it: no ? marks, but still not right.
According to that collation page, koi8r also contains the characters that are stored in the table.
http://www.collation-charts.org/mysql60/mysql604.koi8r_general_ci.html

Same thing there: if I use the actual file name, koi8r_general_ci, there are still ?? marks.

I am very desperate for assistance on this, so please, if anyone out there knows what I am doing wrong, let me know.

Thank you so much,
Richard
