
Display encoded file uploaded

Good day,

I want to print the contents of files that a user uploads through an input of type file. The file is passed to the backend using the POST method.
I have done some research, but unfortunately I still haven't found a solution that satisfies me. The closest answer to what I'm looking for is http://be.php.net/manual/vote-note.php?id=91051&page=function.mb-detect-encoding&vote=up
I can get the encoding of the file and decode it accordingly, but in the end, content in other languages, such as Chinese characters, isn't displayed correctly.

Hopefully someone who has run into the same problem can lend me a hand. Thanks in advance.

lps
Posting Whiz in Training
208 posts since Jul 2011
Reputation Points: 13
Solved Threads: 43
Skill Endorsements: 3

Show your display page - have you set meta tag or php header to UTF-8?
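
For reference, a minimal sketch of what declaring UTF-8 in both places could look like. The helper name render_utf8_page is hypothetical and not from this thread, and the sketch assumes the text passed in is already valid UTF-8:

```php
<?php
// Hypothetical helper (not from the thread): wrap text in a page whose
// markup declares UTF-8. Assumes $text is already valid UTF-8.
function render_utf8_page(string $text): string
{
    // Escape the text so uploaded content cannot inject markup.
    $safe = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html><head><meta charset=\"utf-8\"></head>\n"
        . "<body><pre>{$safe}</pre></body></html>\n";
}

// The HTTP header has to be sent before any output starts.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Two Chinese characters (U+4E2D, U+6587) followed by markup to be escaped.
echo render_utf8_page("\u{4E2D}\u{6587} <b>test</b>");
```

Declaring the charset only tells the browser how to read the bytes. It does not convert them, so a file stored as UTF-16 or UTF-32 would still need converting before it is echoed.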

diafol
Keep Smiling
Moderator
10,668 posts since Oct 2006
Reputation Points: 1,628
Solved Threads: 1,514
Skill Endorsements: 57

Below is the code I wrote to dump out the content:

if(isset($_FILES['files'])){
    header('Content-Type: text/html; charset=utf-8');
    // The Unicode BOM is U+FEFF; once encoded, its bytes look like this in each encoding.
    define ('UTF32_BIG_ENDIAN_BOM'   , chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
    define ('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
    define ('UTF16_BIG_ENDIAN_BOM'   , chr(0xFE) . chr(0xFF));
    define ('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
    define ('UTF8_BOM'               , chr(0xEF) . chr(0xBB) . chr(0xBF));

    $file = ($_FILES['files']);

    $filename = ($file['tmp_name']);
    $text = file_get_contents($filename);
    echo "<pre>";
    $first2 = substr($text, 0, 2);
    $first3 = substr($text, 0, 3);
    $first4 = substr($text, 0, 4);

    if ($first3 == UTF8_BOM){
        echo str_replace(UTF8_BOM, "", $text);
        $code = "UTF-8";
    }elseif ($first4 == UTF32_BIG_ENDIAN_BOM){
        echo str_replace(UTF32_BIG_ENDIAN_BOM, "", $text);
        $code = "UTF-32";
    }elseif ($first4 == UTF32_LITTLE_ENDIAN_BOM){
        echo str_replace(UTF32_LITTLE_ENDIAN_BOM, "", $text);
        $code = "UTF-32";
    }elseif ($first2 == UTF16_BIG_ENDIAN_BOM){
        echo str_replace(UTF16_BIG_ENDIAN_BOM, "", $text);
        $code = "UTF-16";
    }elseif ($first2 == UTF16_LITTLE_ENDIAN_BOM){
        echo str_replace(UTF16_LITTLE_ENDIAN_BOM, "", $text);
        $code = "UTF-16";
    }else{
        echo $text;
    }
    echo "</pre>";
}
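
One thing the code above does not do is convert the bytes: it strips the BOM and echoes the rest as-is, so UTF-16 or UTF-32 content ends up in a page declared as UTF-8. Here is a rough sketch of one way to add that step, assuming the mbstring extension is available. The function name decode_with_bom is hypothetical, and treating BOM-less input as UTF-8 is an assumption that will not hold for every upload:

```php
<?php
// Hypothetical helper (not from the thread): detect a BOM, strip it, and
// convert the remaining bytes to UTF-8. Requires the mbstring extension.
function decode_with_bom(string $bytes): string
{
    // UTF-32LE must be tested before UTF-16LE, because its BOM
    // (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
    $boms = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];

    foreach ($boms as $bom => $encoding) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            // Remove only the leading BOM, not every occurrence of its bytes.
            $body = substr($bytes, strlen($bom));

            return $encoding === 'UTF-8'
                ? $body
                : mb_convert_encoding($body, 'UTF-8', $encoding);
        }
    }

    // Assumption: input without a BOM is already UTF-8.
    return $bytes;
}

// Two Chinese characters (U+4E2D, U+6587) as UTF-16LE, preceded by the BOM.
$sample = "\xFF\xFE" . "\x2D\x4E\x87\x65";
echo htmlspecialchars(decode_with_bom($sample), ENT_QUOTES, 'UTF-8'), "\n";
```

In the upload handler, the bytes would come from file_get_contents($file['tmp_name']) as in the code above. Using substr() rather than str_replace() keeps any later byte sequence that happens to match a BOM intact.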
lps

I have little experience with Chinese text, although I've spent a lot of time on other encoding issues. Could you link to a typical file so that we can try to replicate the problem? There's no point posting the contents, as that may not include the BOM, if there is one.

diafol

The example file I use for testing is this: https://dl.dropbox.com/u/95553471/little%20endian%2016.txt

lps

© 2013 DaniWeb® LLC