How to efficiently detect the encoding of a text file at runtime?

Question

kevintse 0 Light Poster

15 Years Ago

I am writing a program for a Windows Mobile phone that needs to read text files whose character sets (charsets) are unknown. I need to convert whatever text I read (in practice I probably only need to handle UTF-8, UTF-16 BE, GBK, and BIG5) to Unicode (UTF-16 LE) so that it displays properly. The conversion itself is quite simple with the Windows API, but I don't know how to detect a file's encoding at runtime.

Does anyone have any ideas?
Your replies will be greatly appreciated!

-- Kevin Tse

api c++ windows-api


All 2 Replies

Ancient Dragon 5,243 Achieved Level 70

15 Years Ago

Here is just one of many articles you can find about that topic.

Here are some others.

Salem commented: Works for me :) +36


kevintse 0 Light Poster · Answer 1 · 2009-07-19T08:34:34+00:00

Thank you.
The article you pointed me to only shows how to read Unicode and ANSI files, which I already knew. I know I can easily determine the encoding of a file that has a BOM (byte order mark), but ANSI files such as GBK and BIG5 have no BOM, and there can even be UTF-8 and UTF-16 files without one.
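
For UTF-8 files without a BOM, one common heuristic is to check whether the whole buffer is well-formed UTF-8, since GBK or BIG5 text usually fails such a check. The following is only a minimal sketch of that idea, not something from the linked article: the helper name `looksLikeUtf8` is hypothetical, and it assumes the entire file (or a chunk that does not end in the middle of a character) is already in memory. Note that pure ASCII text also passes, because ASCII is a subset of UTF-8.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true if the buffer is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong forms,
// UTF-16 surrogate code points, and values above U+10FFFF.
bool looksLikeUtf8(const std::uint8_t* data, std::size_t size) {
    // Smallest code point that legitimately needs 1, 2 or 3 continuation bytes.
    static const std::uint32_t minCodePoint[] = {0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < size) {
        const std::uint8_t lead = data[i];
        if (lead < 0x80) {           // plain ASCII byte
            ++i;
            continue;
        }

        std::size_t extra = 0;       // continuation bytes expected after the lead
        std::uint32_t cp = 0;        // code point being assembled
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;           // stray continuation byte or invalid lead

        if (i + extra >= size) return false;   // sequence runs past the buffer

        for (std::size_t k = 1; k <= extra; ++k) {
            const std::uint8_t c = data[i + k];
            if ((c & 0xC0) != 0x80) return false;   // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < minCodePoint[extra]) return false;          // overlong encoding
        if (cp > 0x10FFFF) return false;                     // outside Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;      // surrogate range
        i += extra + 1;
    }
    return true;
}
```

A buffer that fails this check is not UTF-8, so it could then be treated as one of the ANSI code pages. Telling GBK from BIG5 is a separate problem that this sketch does not attempt.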

Anyway, the article gave me a clue: I can ALWAYS test for a BOM first. UTF-8, UTF-16 LE, and UTF-16 BE files may have BOMs, so I can treat every file without one as an ANSI file. The assumption is not accurate, but I think it will do what I want most of the time.
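
The BOM test described here could look roughly like the sketch below. The enum and function names (`BomEncoding`, `detectBom`) are hypothetical and not part of the Windows API. The sketch only inspects the first two or three bytes of the file and reports `None` for everything else, which under the assumption above would be handled as ANSI (GBK or BIG5). It does not distinguish a UTF-32 LE BOM (`FF FE 00 00`) from a UTF-16 LE one.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch.
enum class BomEncoding { None, Utf8, Utf16LE, Utf16BE };

// Inspects the first bytes of a buffer and reports which BOM, if any, it starts with.
BomEncoding detectBom(const std::uint8_t* data, std::size_t size) {
    // UTF-8 BOM: EF BB BF
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return BomEncoding::Utf8;
    // UTF-16 little-endian BOM: FF FE
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return BomEncoding::Utf16LE;
    // UTF-16 big-endian BOM: FE FF
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return BomEncoding::Utf16BE;
    // No BOM found: the caller falls back to treating the file as ANSI.
    return BomEncoding::None;
}
```

When a BOM is found, the caller would skip those 3 or 2 bytes before converting the rest of the file.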
