How to use wifstream to read a Unicode file

Question

smaity 0 Newbie Poster

19 Years Ago

Hi All,
I am currently working on a project that reads and writes Unicode files. I got it working using CFile together with the WideCharToMultiByte and MultiByteToWideChar conversion functions, but I doubt whether those functions work correctly when a Unicode character takes more than 2 bytes.

Now I want to read the files (UTF-8, UTF-16 BE, UTF-16 LE) using wifstream.

Can anyone help me?

c++

All 6 Replies

WolfPack 491 Posting Virtuoso

19 Years Ago

unichar can be more than 2 bytes? I thought it was always 2 bytes.

Ancient Dragon 5,243 Achieved Level 70

19 Years Ago

The size of wchar_t is platform dependent. On MS-Windows wchar_t is 16 bits (older compilers define it as unsigned short); on most *nix systems it is 32 bits. Unicode code points run up to U+10FFFF, so a 16-bit wchar_t has to represent the higher ones as a pair of surrogates.

That becomes a very big problem when attempting to port a UNICODE file between operating systems.

smaity: Not sure if this will help or not.

smaity 0 Newbie Poster · Answer 1 · 2005-12-08T10:41:10+00:00

Thank you, Ancient, for providing the link, but it is not enough: it gives no clear idea about the conversion.
This time I am trying to use wistream. I will read byte by byte and, after getting the BOM, read all the bytes that make up one unichar. But once I have the bytes, how do I convert them back to a unichar to show in a textbox or list control?

Do you have any idea how to apply wistream here?

Thanks.
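
The byte-by-byte plan described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the names Encoding, BomInfo, detectBom and bytesToUtf16 are made up for the example, and it assumes the file contents have already been read into a std::string of raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodings mentioned in the question. All names here are hypothetical.
enum class Encoding { Unknown, Utf8, Utf16LE, Utf16BE };

struct BomInfo {
    Encoding encoding;
    std::size_t length;  // number of BOM bytes to skip
};

// Look at the first bytes of the buffer for a byte order mark.
// Note: a UTF-32 LE BOM (FF FE 00 00) also starts with FF FE; this sketch
// ignores UTF-32 because the question does not ask for it.
BomInfo detectBom(const std::string& bytes) {
    auto at = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };
    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF)
        return {Encoding::Utf8, 3};
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return {Encoding::Utf16LE, 2};
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return {Encoding::Utf16BE, 2};
    return {Encoding::Unknown, 0};
}

// Combine the byte pairs that follow the BOM into 16-bit code units.
// A trailing odd byte, if any, is ignored.
std::u16string bytesToUtf16(const std::string& bytes, const BomInfo& bom) {
    assert(bom.encoding == Encoding::Utf16LE ||
           bom.encoding == Encoding::Utf16BE);
    const bool bigEndian = (bom.encoding == Encoding::Utf16BE);
    std::u16string units;
    for (std::size_t i = bom.length; i + 1 < bytes.size(); i += 2) {
        const unsigned hi =
            static_cast<unsigned char>(bytes[bigEndian ? i : i + 1]);
        const unsigned lo =
            static_cast<unsigned char>(bytes[bigEndian ? i + 1 : i]);
        units.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return units;
}
```

On Windows, where wchar_t is 16 bits, the resulting code units could presumably be copied into a std::wstring and handed to a control as they are; on a platform with a 32-bit wchar_t they would first need to be expanded into code points.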

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 2 · 2005-12-08T16:49:23+00:00

I don't use C++ streams for UNICODE for the reasons you describe; it's a lot easier to use C's FILE with fopen() in binary mode, fread() and fwrite(). You don't have to worry about conversion that way. That works provided you don't want to transport the file from one operating system to another and you don't want to read it with another editor such as Notepad.exe.
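
As a rough sketch of the fopen()/fread() approach, the helper below reads a whole file into memory as raw bytes without any character conversion. The name readAllBytes and the 4096-byte chunk size are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: read an entire file as raw bytes.
// "rb" opens the file in binary mode, so no newline or character
// translation is applied. Returns an empty string if the file
// cannot be opened.
std::string readAllBytes(const char* path) {
    assert(path != nullptr);
    std::string bytes;
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr)
        return bytes;
    char chunk[4096];
    std::size_t n = 0;
    // fread returns the number of items read; 0 means end of file or error.
    while ((n = std::fread(chunk, 1, sizeof chunk, fp)) > 0)
        bytes.append(chunk, n);
    std::fclose(fp);
    return bytes;
}
```

The bytes come back exactly as stored, including any BOM, so interpreting them is still up to the caller.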

If you still want to use wfstreams, you can use mbstowcs() to convert from char* to wchar_t*, or wcstombs() to convert the other direction.
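
A minimal sketch of those two calls follows. The wrapper names widen and narrow are hypothetical. Both functions convert according to the current C locale, so anything beyond plain ASCII only converts as expected after a suitable std::setlocale() call, which this sketch leaves to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical wrapper around mbstowcs(): char* -> wchar_t*.
// Returns an empty string if the input is not valid in the current locale.
std::wstring widen(const std::string& narrowText) {
    // One wide character per input byte is always enough room.
    std::vector<wchar_t> buffer(narrowText.size() + 1);
    const std::size_t written =
        std::mbstowcs(buffer.data(), narrowText.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return std::wstring();
    assert(written <= narrowText.size());
    return std::wstring(buffer.data(), written);
}

// Hypothetical wrapper around wcstombs(): wchar_t* -> char*.
// Returns an empty string if a character cannot be represented.
std::string narrow(const std::wstring& wideText) {
    // MB_CUR_MAX is the largest byte count one character can need
    // in the current locale.
    std::vector<char> buffer(wideText.size() * MB_CUR_MAX + 1);
    const std::size_t written =
        std::wcstombs(buffer.data(), wideText.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1))
        return std::string();
    return std::string(buffer.data(), written);
}
```

Both wrappers stop at the first embedded null character, because the underlying C functions work on null-terminated strings.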

smaity 0 Newbie Poster · Answer 3 · 2005-12-09T15:06:56+00:00

But I have learned that wifstream/wistream uses wchar_t, which is 2 bytes on Windows. The problem is that if a Unicode character needs more than 2 bytes (surrogates), a single wchar_t cannot hold it, so it is not possible to read or show such characters that way.
The VC compiler is not designed for that.

Thanks,
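
To illustrate the surrogate issue: in UTF-16 a character above U+FFFF is stored as two 16-bit units, a high surrogate (0xD800 to 0xDBFF) followed by a low surrogate (0xDC00 to 0xDFFF). The sketch below, with the made-up name utf16ToCodePoints, combines such pairs into full code points. Replacing an unpaired surrogate with U+FFFD is a choice made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: expand UTF-16 code units into Unicode code points.
std::u32string utf16ToCodePoints(const std::u16string& units) {
    std::u32string out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        const char32_t unit = units[i];
        const bool isHigh = (unit >= 0xD800 && unit <= 0xDBFF);
        if (isHigh && i + 1 < units.size()) {
            const char32_t next = units[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Each surrogate carries 10 bits of (code point - 0x10000).
                out.push_back(static_cast<char32_t>(
                    0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00)));
                ++i;  // the low surrogate has been consumed
                continue;
            }
        }
        if (unit >= 0xD800 && unit <= 0xDFFF)
            out.push_back(0xFFFD);  // unpaired surrogate: substitute U+FFFD
        else
            out.push_back(unit);    // ordinary one-unit character
    }
    assert(out.size() <= units.size());
    return out;
}
```

For example, the units 0xD83D 0xDE00 combine into the single code point U+1F600.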

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 4 · 2005-12-09T20:37:36+00:00

You will probably have to write your own conversion functions that re-encode those 32-bit characters as 16-bit or 8-bit units. Characters above U+FFFF, which include some of the rarer Chinese and Japanese ideographs, do not fit in one 16-bit unit and have to be written as a surrogate pair.
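
Such conversion functions might look like the sketch below, which uses the standard UTF-8 and UTF-16 encoding rules. The names encodeUtf8 and encodeUtf16 are hypothetical, and the input is assumed to be a valid code point (at most U+10FFFF and not a surrogate).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one code point as 1 to 4 UTF-8 bytes.
std::string encodeUtf8(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    auto put = [&out](char32_t byte) {
        out.push_back(static_cast<char>(byte));
    };
    if (cp < 0x80) {
        put(cp);
    } else if (cp < 0x800) {
        put(0xC0 | (cp >> 6));
        put(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        put(0xE0 | (cp >> 12));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    } else {
        put(0xF0 | (cp >> 18));
        put(0x80 | ((cp >> 12) & 0x3F));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encode one code point as 1 or 2 UTF-16 units.
std::u16string encodeUtf16(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000)
        return std::u16string(1, static_cast<char16_t>(cp));
    const char32_t offset = cp - 0x10000;  // 20 bits split across two units
    std::u16string out;
    out.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
    out.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
    return out;
}
```

Neither encoding loses information: every code point maps to a unique unit sequence, only the number of units per character varies.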
