diafol

Hi all. Having more problems with utf-8 and Iñtërnâtiônàlizætiøn.

I'm running PHP (XAMPP) on Windows 7 (it's also happening on my remote Linux site) and keep getting nonsense with non-ASCII chars when they come from include files, e.g.


INCLUDE FILE: 'inc.php'

<?php
$simple = "Iñtërnâtiônàlizætiøn";
?>

MAIN FILE

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="cy-GB" xml:lang="cy-GB">
<head>
<meta name="language" content="Welsh" />
<meta http-equiv="Content-Language" content="cy-GB" />
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
</head>
<body>
<?php 
include_once("inc.php"); 
echo $simple . "";
echo "Straight: Iñtërnâtiônàlizætiøn";
?>
</body>
</html>

I get this:


I've tried putting the utf-8 header in various places, to no avail.

Has anybody seen this before or got an idea? I've been searching for days now. I think I'm going bonkers.

//EDIT - all pages are saved as UTF-8 without BOM.

It tests perfect here. It's got to be a server directive. Do you have root or equivalent access to the server?

diafol

Hmmm. I'll have a look. Thanks for the quick reply.

//EDIT
I think I've cracked it. The original include file was ANSI, and I changed it to UTF-8 without BOM. Unfortunately, I didn't check the state of the non-ASCII chars afterwards, as I was using a different editor. Blast. What a fool.

I'm gonna do some more digging, but I bet that's the problem.

//EDIT AGAIN

Yes, that was it. I copied the text from the editor, pasted it into Notepad++ and saved it as UTF-8 without BOM, and now it's perfect.
Can't believe I fell into that. Anyway, thanks tinymark, it was good to know that the 'norm' worked and that it wasn't some weirdness from PHP itself.
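
For anyone landing here with the same symptom, a quick way to check whether a string (or the contents of an include file) is really UTF-8 is to test its bytes rather than trust the editor's status bar. The following is only a rough sketch, not what was actually run in this thread: the helper name `ensure_utf8` is made up, it assumes the mbstring extension is loaded, and it assumes that anything that is not valid UTF-8 is Windows-1252 (what Western Windows editors usually call "ANSI").

```php
<?php
// Hypothetical helper (not from the thread): return $text as UTF-8.
// ASSUMPTION: any input that is not valid UTF-8 is Windows-1252 ("ANSI").
function ensure_utf8(string $text, string $assumedLegacy = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedLegacy);
}

// "Iñtër" as raw Windows-1252 bytes: ñ is 0xF1, ë is 0xEB.
$ansi = "I\xF1t\xEBr";
$utf8 = ensure_utf8($ansi);

// Dump the bytes so the result does not depend on the terminal's encoding.
echo bin2hex($utf8), "\n";
```

In UTF-8, ñ and ë each take two bytes (`c3b1` and `c3ab`), so the converted string is seven bytes long instead of five. Running a check like this on `file_get_contents('inc.php')` would show whether the include file is the culprit.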

TechySafi

Didn't get it, you wanna echo those Danish characters? If so, I think you should write

&Oslash; instead of Ø
&aelig; instead of æ

Google for this.
And this line may help:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
diafol

@TechySafi
No, that's HTML. I was dealing with PHP. If you write in a non-English language, using html encoding will drive you insane.

> If you write in a non-English language, using html encoding will drive you insane.

Me too, I have these problems all the time.

diafol

> Me too, I have these problems all the time.

Use a string file with an array of statements. As most of my sites are bi/multi-lingual, this is the only sane way to go (unless you use gettext or similar).
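
A minimal sketch of that string-file idea, for illustration only: the array layout, the keys and the `t()` helper are all made up here, not taken from an actual site. In practice each language array would probably sit in its own file saved as UTF-8 without BOM and be pulled in with `include`; they are inlined below so the example runs on its own.

```php
<?php
// Hypothetical string tables, keyed by language code and then by string key.
// ASSUMPTION: in a real site each inner array would live in its own
// UTF-8 (no BOM) file and be loaded with include/require.
$strings = [
    'en' => ['greeting' => 'Welcome', 'thanks' => 'Thanks'],
    'cy' => ['greeting' => 'Croeso',  'thanks' => 'Diolch'],
];

// Look up $key for $lang, fall back to a default language, and finally to
// the key itself so a missing translation is easy to spot on the page.
function t(array $strings, string $lang, string $key, string $fallback = 'en'): string
{
    return $strings[$lang][$key] ?? $strings[$fallback][$key] ?? $key;
}

echo t($strings, 'cy', 'greeting'), "\n";
echo t($strings, 'fr', 'thanks'), "\n"; // no 'fr' table, falls back to 'en'
```

Keeping all the non-ASCII text in a handful of string files also means there are only a few files whose encoding needs checking when something like the problem above turns up.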