I've searched all over for an answer to this, including this forum, so sorry if I missed something.
I'd like to get a numerical code from extended characters such as ß or ü.
I don't use them very much myself, as I'm a native English speaker, but they pop up often enough that I should be able to support them when they arise.
I have found some information on long chars but I didn't manage to find a resource I could understand enough to actually use.

char c;
int i;
c = 'h';
i = c;
std::cout << i << "\n";

i is now equal to 104, the standard ASCII code for 'h'.
How can I consistently get the same number from one of the extended characters, and convert back again if needed?

If you hurry up you can correct the code tags.


Notice there are no spaces, and it's cplusplus, not c++.

Are you talking about converting UNICODE wchar_t* to char*? Here is a thread that shows one way to do it.

If this is a serious project (as opposed to something for learning Unicode), I'd suggest ICU. Managing Unicode is a bitch without a good library.

Hi, thanks for your speedy answers! I'm having a look at ICU. I'll mark the thread "solved" in a day or so just in case any other good ideas turn up.

Hi again. ICU seems enormously complex. Is it overkill when all I need is a number-to-character and back-again conversion? Or is it the only way?

ICU is enormously complex because Unicode is enormously complex.

when all I need is a number to character and back again conversion?

Let's assume you want to do it manually. You'd need to support at least UTF-8, UTF-16 (including surrogates), and UTF-32. The process is different for converting each of those into a code point. Now, in all honesty that's not especially difficult. It's more difficult than calling a library function, but straightforward, in my opinion.
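
To give an idea of what the manual route looks like for UTF-8 alone, here is a minimal sketch. It assumes the text is stored as UTF-8 bytes in a std::string; decode_utf8 and encode_utf8 are hypothetical helper names, not functions from ICU or the standard library. It rejects truncated input and bad continuation bytes, but it does not check for overlong encodings or surrogate code points, so treat it as an illustration rather than a finished decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Decode the first UTF-8 encoded character in `s` into its code point.
// Hypothetical helper: no checks for overlong forms or surrogates.
std::uint32_t decode_utf8(const std::string& s) {
    if (s.empty()) throw std::invalid_argument("empty input");
    // Go through unsigned char so bytes >= 0x80 are not seen as negative.
    const unsigned char lead = static_cast<unsigned char>(s[0]);
    std::uint32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else throw std::invalid_argument("invalid lead byte");
    if (s.size() < extra + 1) throw std::invalid_argument("truncated sequence");
    for (std::size_t i = 1; i <= extra; ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) throw std::invalid_argument("bad continuation byte");
        cp = (cp << 6) | (b & 0x3F);  // each continuation byte carries 6 bits
    }
    return cp;
}

// Encode one code point as a UTF-8 byte sequence (the reverse direction).
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::invalid_argument("code point out of range");
    }
    return out;
}
```

With these, decode_utf8("\xC3\x9F") gives 223 (U+00DF, ß) regardless of platform, and encode_utf8(223) gives the same two bytes back. UTF-16 and UTF-32 input would each need their own decoder.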

The hard part comes when you realize that you're probably not just converting a character to a code point, you're likely introducing general Unicode support including I/O and comparisons, which opens up a can of worms like normalization (and normalization is stupidly complex if you're thinking about doing it manually).
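
A small illustration of why comparisons drag normalization in: Unicode allows ü to be written either as the single code point U+00FC or as u (U+0075) followed by the combining diaeresis U+0308. The sketch below only shows that a plain code point comparison treats the two as different; the function names are made up for the example, and it does not attempt normalization itself (that is the part a library like ICU handles).

```cpp
#include <string>

// Precomposed form: U+00FC LATIN SMALL LETTER U WITH DIAERESIS.
std::u32string precomposed_u_umlaut() {
    return U"\u00FC";
}

// Decomposed form: 'u' (U+0075) followed by U+0308 COMBINING DIAERESIS.
std::u32string decomposed_u_umlaut() {
    return U"u\u0308";
}

// Naive comparison: code point by code point, with no normalization.
bool naive_equal(const std::u32string& a, const std::u32string& b) {
    return a == b;
}
```

Both strings render as the same character, yet naive_equal returns false for them, and they even have different lengths (1 versus 2 code points). Normalizing both sides to the same form before comparing is what fixes that.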

OK, thanks people for your insight, I appreciate it!