1,105,334 Community Members

Porblem in UTF-8 support on CSV/EXCEL

Member Avatar
millanskie
Junior Poster in Training
63 posts since Dec 2006
Reputation Points: 0 [?]
Q&As Helped to Solve: 1 [?]
Skill Endorsements: 0 [?]
 
0
 

Hi,

First Problem:

I have a PHP script that export CSV with Japanese text, for example I download the CSV on the website then save to desktop of my computer and Open it on NOTEPAD... that works fine I could still see all the japanese text OK with no problem, but if I open the CSV file in Microsoft EXCEL, all the japanese text are busted and no longer readable.

Do you know a solution for this, how could I make it work OK on MS Excel, because the user want it to view the CSV file in Excel.


And my second Problem is:

I have another PHP script that export data with Japanese text from database to MS Excel, if I type the Japanese text on the script it works fine but if the text came from database, my problem starts because when i Download and open it in MS Excel the japanese text That I type directly in the script as a header of the excel file is OK but the contents in japanese text populated from the database is not readable in Japanese form....

Do you have any solution for my problem....


Thanks

Member Avatar
digital-ether
Nearly a Posting Virtuoso
1,264 posts since Sep 2005
Reputation Points: 399 [?]
Q&As Helped to Solve: 102 [?]
Skill Endorsements: 11 [?]
Team Colleague
 
0
 

I have the same problems with encoding in a different content - XML files.

For your prob..
Does excel support utf-8 fully?

Question I'm asking myself, how does excel determine encoding?

When saving to the database, is the database data stored as UTF-8?

PHP functions are also non-utf8 aware. For example: strlen() on a UTF-8 string would return the number of bytes in the string, not the number of UTF-8 chars which would be less since a single char is encoded in multiple bytes..

Does your code use some PHP string functions?

Member Avatar
millanskie
Junior Poster in Training
63 posts since Dec 2006
Reputation Points: 0 [?]
Q&As Helped to Solve: 1 [?]
Skill Endorsements: 0 [?]
 
0
 

Hi, thanks for the reply. I already found a solution on my problem, I use this kind of php function to solve my problem

$str = mb_convert_encoding($str, "SJIS","UTF-8");

this solve all of my encoding problem....

thanks

Question Answered as of 7 Years Ago by digital-ether
Member Avatar
digital-ether
Nearly a Posting Virtuoso
1,264 posts since Sep 2005
Reputation Points: 399 [?]
Q&As Helped to Solve: 102 [?]
Skill Endorsements: 11 [?]
Team Colleague
 
0
 

Awesome.

Just a note: in order to use the function b_convert_encoding, your php installation needs to have the mb_string extension loaded. Not all PHP installations will have this.

I believe there is a sourceforge.net project however that has a mirror of the mb_string functions in PHP.

This will allow you to use the mb_string functions on "any" PHP installation.

Member Avatar
sohel62
Newbie Poster
1 post since Feb 2008
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Have you tried when your text contain ㈱①②~.

$str = mb_convert_encoding($str, "SJIS","UTF-8");

not work for them.

Best of luck
Sohel Taslim

Member Avatar
nakhan
Newbie Poster
1 post since Jun 2009
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
1
 

Hi Guys,

It seems like everyone is having this issue. This is very common problem and i have been looking for its solution for a long time too. I have't tried the mb_... function and i am not sure if this will help for all languages but here is something i have found and have tested it to make sure it works for almost all languages.

This is problem with Excel NOT the data format that is exported. Here are the steps

1- download the exported CSV/XLS file from your website.
2- Open Excel 2007
3- Open a new file
4- Click the Data Menu option
5- Click "From Text" button
6- Select the file you downloaded
7- Make sure "Delimited" is selected and Press Next
8- Check the delimiter characters that you know are in your file like Comma or Tab or whatever is in your case. You can select more than one
9- Proceed to Next Step and Finich
10- Your excel file will be ready with all you data displayed correctly

I hope that helps someone who has been frustrated on this.

Regards
Nauman

Member Avatar
gokulakrishnae
Newbie Poster
1 post since Jan 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Hey guys.There is a simple PHP Class "php-excel" which supports multibyte charecters.
php-excel aims to be the most simple and lightweight approach to convert a matrix-like, two-dimensional array from PHP to Microsoft Excel.
For further details and code samples please go to the following link

http://code.google.com/p/php-excel/

Regards
Gokul

Member Avatar
dracon64
Newbie Poster
1 post since Feb 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Thanks for this. Works really fine for me

Member Avatar
spats
Newbie Poster
1 post since Apr 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Fixed this issue, below is how

Using any of these(Windows-1252,CP1252, Windows-1256) as source encoding for text in iconv, converts characters properly to UTF-8

iconv("Windows-1252", "UTF-8", $text);

looks like excel stores csv in Windows-1252 encoding(i guess whatever machine default encoding)


Strange thing is mb_detect_encoding detect input string as ISO-8859-1, where as iconv converts properly only when we use any of the above mentioned encoding as source encoding.

Member Avatar
manolisvl46
Light Poster
30 posts since Feb 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

hallo there everyone!!

i tried all of the possibilities found on the net about your question and still nothing..
has anyone found anything on how to
decodes his variable contents into UTF-8 ...on a csv of excel file...???

Member Avatar
Joshua.it
Newbie Poster
1 post since Apr 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Hello to everybody :)
i have a problem...when i import html into a csv files, the characters it modify, and it appear new strange characters!
Could anyone help me to solve this problem?
Thanks in advance
Best regards

Member Avatar
manolisvl46
Light Poster
30 posts since Feb 2010
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

solved my problem

$line = iconv("UTF-8","ISO-8859-7",$sheet);

i hvae tried many times but the right combination for the greeks to utf8 is that one...
try it on mates...

Member Avatar
gwencalon
Newbie Poster
2 posts since Jan 2011
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

I had a UTF8 (without BOM) csv file generated by a Perl script. Opening it in Excel, certain characters wouldn't appear properly. Adding a BOM (byte order mark) proved to be a workaround for this issue.

So, if you're having similiar problems, ultimately you may try encoding your file in UTF8 (with BOM -> without BOM, Excel doesn't seem to recognize the UTF8 encoding). If you have no other way, just print this \x{FEFF} at the begining of the file.

This thread is a bit old, I know :), but I think it may help someone who comes across it and still couldn't solve the issue in any other way.

Member Avatar
gwencalon
Newbie Poster
2 posts since Jan 2011
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

PS: I'd just like to add to my previous post that it worked under the specifics of my project (Perl script running in Linux environment and writing data - coming from a utf8 database - to a csv file) and for the desired purpose (display all text properly in windows machines when opening the file in ms excel) - it is not a universal solution, however, so it may not suit your needs.

I know this is a PHP thread but... as for the Perl script, since it may be useful for someone reading this, I'm also setting this up:

Header:

use encoding 'UTF8';

When connecting to the database:

$conn = DBI->connect("$url user=$username password=$password");
$conn->{pg_enable_utf8} = 1;
$conn->do("SET client_encoding TO 'UTF8'");

When opening the file:

open (FILE_DAT, ">>:encoding(utf8)",$myfilename)
binmode(FILE_DAT, ":utf8"); 
print FILE_DAT "\x{FEFF}";

Some of these may not be required at all (probably a lot of redundancy here). I may do some further testing to check it out, but it may be a hint for you out there in despair with encoding issues ;)

Member Avatar
fahim_cse
Newbie Poster
1 post since Feb 2011
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

Have you tried when your text contain ㈱①②~.

$str = mb_convert_encoding($str, "SJIS","UTF-8");

not work for them.

Use
$str = mb_convert_encoding($str, "UTF8", "JIS, eucjp-win, sjis-win");

this work for me

Member Avatar
ruddyfrompt
Newbie Poster
1 post since Mar 2012
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

//these characters will make correct encoding to excel
echo chr(255).chr(254).iconv("UTF-8", "UTF-16LE//IGNORE", $data);

You
This question has already been solved: Start a new discussion instead
Post:
Start New Discussion
Tags Related to this Article