Dec 7th, 2005

How to use wifstream to read a unicode file..

Hi all,
I am currently working on a project that reads and writes Unicode files. I have it working with CFile plus the WideCharToMultiByte and MultiByteToWideChar conversion functions, but I am not sure whether those functions work correctly when a Unicode character takes more than 2 bytes.

Now I want to read the file (UTF-8, UTF-16 BE, or UTF-16 LE) using wifstream.

Can anyone help me?
smaity

Dec 7th, 2005

Re: How to use wifstream to read a unicode file..

unichar can be more than 2 bytes? I thought it was always 2 bytes.
WolfPack

Dec 7th, 2005

Re: How to use wifstream to read a unicode file..

Quote originally posted by WolfPack ...
unichar can be more than 2 bytes? I thought it was always 2 bytes.

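The size of wchar_t depends on the compiler and platform. On MS-Windows wchar_t is 16 bits (Visual C++ has historically defined it as unsigned short). On most *nix systems it is 32 bits. Unicode itself only defines code points up to U+10FFFF, so a 32-bit wchar_t can hold any of them, while a 16-bit wchar_t needs two units (a surrogate pair) for code points above U+FFFF.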

That becomes a very big problem when attempting to port a UNICODE file between operating systems.
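
To make the size difference concrete, here is a minimal sketch (not from the original posts) that reports how many wchar_t units one code point needs on the current platform. The function name wchar_units_for is made up for illustration, and the sketch assumes wchar_t is at least 16 bits wide.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: number of wchar_t units needed to store one
// Unicode code point on this platform. Assumes wchar_t has >= 16 bits.
std::size_t wchar_units_for(char32_t cp) {
    assert(cp <= 0x10FFFF);  // Unicode stops at U+10FFFF
    if (sizeof(wchar_t) * CHAR_BIT >= 21) {
        return 1;            // e.g. 32-bit wchar_t: every code point fits
    }
    // 16-bit wchar_t: code points above U+FFFF need a surrogate pair.
    return cp > 0xFFFF ? 2 : 1;
}
```

With a 16-bit wchar_t the call wchar_units_for(0x1F600) should return 2, and with a 32-bit wchar_t it should return 1, which is why the same wide-character file cannot simply be copied between the two kinds of system.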

smaity: Not sure if this will help or not.
Ancient Dragon

Dec 8th, 2005

Re: How to use wifstream to read a unicode file..

Quote originally posted by Ancient Dragon ...
The size of wchar_t depends on the compiler and platform. On MS-Windows wchar_t is 16 bits (Visual C++ has historically defined it as unsigned short). On most *nix systems it is 32 bits. Unicode itself only defines code points up to U+10FFFF, so a 32-bit wchar_t can hold any of them, while a 16-bit wchar_t needs two units (a surrogate pair) for code points above U+FFFF.

That becomes a very big problem when attempting to port a Unicode file between operating systems.

smaity: Not sure if this will help or not.

Thank you, Ancient Dragon, for providing the link, but it is not enough: it gives no clear idea about the conversion.
This time I am trying to use wistream. I will read byte by byte, and after getting the BOM I will read all the bytes for each Unicode character. But once I have the bytes, how do I convert them back to a Unicode character to show in a text box or list control?
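
The BOM check described above could look roughly like the sketch below. It is only an illustration, not code from the thread: the names Encoding, BomInfo, detect_bom and payload_after_bom are invented here, and it assumes the file contents have already been read into a std::string of raw bytes. It does not distinguish the UTF-32 LE BOM (FF FE 00 00) from UTF-16 LE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names, introduced only for this sketch.
enum class Encoding { Utf8, Utf16LE, Utf16BE, Unknown };

struct BomInfo {
    Encoding encoding;   // encoding implied by the BOM
    std::size_t length;  // number of BOM bytes to skip
};

// Inspect the first bytes of a file's raw contents for a byte order mark.
BomInfo detect_bom(const std::string& bytes) {
    auto b = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };
    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return {Encoding::Utf8, 3};
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return {Encoding::Utf16LE, 2};
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return {Encoding::Utf16BE, 2};
    return {Encoding::Unknown, 0};  // no BOM found; the caller must guess
}

// Return the bytes that follow the BOM (or everything if there is none).
std::string payload_after_bom(const std::string& bytes) {
    const BomInfo info = detect_bom(bytes);
    assert(info.length <= bytes.size());
    return bytes.substr(info.length);
}
```

Once the encoding is known, the remaining bytes still have to be decoded into code points before they can be shown in a control; the BOM only says which decoder to use.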

Do you have any ideas about using wistream?

Thanks.
smaity

Dec 8th, 2005

Re: How to use wifstream to read a unicode file..

I don't use C++ streams for Unicode for the reasons you describe. It's a lot easier to use C's FILE with fopen() in binary mode, fread() and fwrite(); you don't have to worry about conversion that way. That works provided you don't want to transport the file from one operating system to another and you don't want to use another editor, such as Notepad.exe, to read it.
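
A minimal sketch of the binary-mode approach, assuming the goal is simply to move raw bytes in and out of a file without any translation. The helper names read_all_bytes and write_all_bytes are made up for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper: read a whole file as raw bytes.
// Returns false if the file cannot be opened or a read error occurs.
bool read_all_bytes(const char* path, std::vector<unsigned char>& out) {
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "rb");  // "b": no newline translation
    if (!fp) return false;
    out.clear();
    unsigned char buffer[4096];
    std::size_t n;
    while ((n = std::fread(buffer, 1, sizeof buffer, fp)) > 0) {
        out.insert(out.end(), buffer, buffer + n);
    }
    const bool ok = std::ferror(fp) == 0;
    std::fclose(fp);
    return ok;
}

// Hypothetical helper: write raw bytes to a file, replacing its contents.
bool write_all_bytes(const char* path, const std::vector<unsigned char>& data) {
    assert(path != nullptr);
    std::FILE* fp = std::fopen(path, "wb");
    if (!fp) return false;
    const bool ok = data.empty() ||
                    std::fwrite(data.data(), 1, data.size(), fp) == data.size();
    return (std::fclose(fp) == 0) && ok;
}
```

Because nothing is converted, the bytes come back exactly as they were written, including any BOM; interpreting them as UTF-8 or UTF-16 is left entirely to the caller.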


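If you still want to use wfstreams, you can use mbstowcs() to convert from char* to wchar_t*, or wcstombs() to convert in the other direction. Both functions use the multibyte encoding of the current C locale, so they only handle UTF-8 if the locale is a UTF-8 one.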
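
As a rough illustration of the mbstowcs() route, the sketch below wraps it in a small helper. The name to_wide is invented for this example; the conversion follows whatever C locale is active (set with setlocale()), and input is only read up to the first null byte.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: convert a multibyte string to a wide string using
// the current C locale. Returns false if the input is not valid in that
// locale's encoding.
bool to_wide(const std::string& narrow, std::wstring& wide) {
    wide.clear();
    if (narrow.empty()) return true;
    // A multibyte string never yields more wide characters than it has bytes.
    wide.resize(narrow.size());
    const std::size_t written =
        std::mbstowcs(&wide[0], narrow.c_str(), narrow.size());
    if (written == static_cast<std::size_t>(-1)) {
        wide.clear();
        return false;  // invalid multibyte sequence for this locale
    }
    assert(written <= narrow.size());
    wide.resize(written);  // drop the unused tail
    return true;
}
```

In the default "C" locale this is only reliable for plain ASCII text, which is one reason the approach is awkward for files whose encoding is chosen by a BOM rather than by the locale.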
Ancient Dragon

Dec 9th, 2005

Re: How to use wifstream to read a unicode file..

Quote originally posted by Ancient Dragon ...
If you still want to use wfstreams, you can use mbstowcs() to convert from char* to wchar_t*, or wcstombs() to convert the other direction.
But I found out that wifstream/wistream uses wchar_t, which is 2 bytes on Windows. The problem is that if a Unicode character needs more than 2 bytes (a surrogate pair), it cannot be read or shown as a single wchar_t.
The VC compiler is not designed that way.
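
For reference, a surrogate pair can still be turned back into one code point by hand. The sketch below is an illustration only; the name next_code_point is made up, it works on a std::u16string of UTF-16 units already in native byte order, and it substitutes U+FFFD for an unpaired surrogate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode the code point that starts at units[i] and
// advance i past it. Unpaired surrogates are reported as U+FFFD.
char32_t next_code_point(const std::u16string& units, std::size_t& i) {
    assert(i < units.size());
    const char16_t hi = units[i++];
    if (hi >= 0xD800 && hi <= 0xDBFF && i < units.size()) {
        const char16_t lo = units[i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            ++i;  // consume the low surrogate as well
            return 0x10000 + ((static_cast<char32_t>(hi) - 0xD800) << 10)
                           + (static_cast<char32_t>(lo) - 0xDC00);
        }
    }
    if (hi >= 0xD800 && hi <= 0xDFFF) return 0xFFFD;  // lone surrogate
    return hi;  // ordinary character in the Basic Multilingual Plane
}
```

For example, the two units 0xD83D 0xDE00 decode to the single code point U+1F600.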

Thanks,
smaity

Dec 9th, 2005

Re: How to use wifstream to read a unicode file..

You will probably have to write your own conversion functions that encode those 32-bit code points as 16-bit (UTF-16) or 8-bit (UTF-8) units. No information is lost that way: a code point above U+FFFF becomes a surrogate pair in UTF-16 or a four-byte sequence in UTF-8, and most characters used by the eastern languages (Chinese, Japanese, etc.) are below U+FFFF and fit in a single 16-bit unit.
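
A hand-written conversion of that kind might look like the following sketch. The function names append_utf16 and append_utf8 are invented for this example, and both assume the caller passes a valid code point (not a surrogate, not above U+10FFFF).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: append one code point as UTF-16
// (one unit, or a surrogate pair above U+FFFF).
void append_utf16(std::u16string& out, char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        out += static_cast<char16_t>(cp);
    } else {
        cp -= 0x10000;
        out += static_cast<char16_t>(0xD800 + (cp >> 10));    // high surrogate
        out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));  // low surrogate
    }
}

// Hypothetical helper: append one code point as UTF-8 (one to four bytes).
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}
```

As a check on the claim above, U+4E2D (a common Chinese character) comes out as one UTF-16 unit and three UTF-8 bytes, while U+1F600 needs a surrogate pair and four UTF-8 bytes.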
Ancient Dragon