Books on Character Encoding?

Question

furalise 0 Junior Poster in Training

11 Years Ago

Hi there everyone.
I'm trying to get a start on learning about character encoding. I'm trying to understand what it is, how it is used, how to interpret it, and so on. I know there are a lot of web resources out there on this kind of thing, but I was hoping to use a book which will explain this in plain English.

Not a super advanced book like the O'Reilly version, but a beginner's book, something at the 'For Dummies' level. I'm unable to find one so far, though I thought someone out there may have a good idea if one exists.
I'm programming in C++ if that is at all relevant.

Thank you..

All 7 Replies

deceptikon 1,790 Code Sniper

11 Years Ago

There's not really a 'for Dummies' level book because the topic is very complex. The O'Reilly book is probably your best bet for an introduction, and if it's too difficult you're probably lacking in prerequisite education.

Perhaps if you point out some parts of the book that are troublesome, we can clarify things or point you toward a resource for further learning.

deceptikon 1,790 Code Sniper

11 Years Ago

"There were always 8 bits in a byte."

That's not universally true. To the best of my knowledge there aren't any platforms where a byte is less than 8 bits, but there certainly are platforms where it's more.
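For a C++ programmer, the point above shows up as the `CHAR_BIT` macro from `<climits>`: the C and C++ standards only guarantee that a byte has at least 8 bits. A minimal sketch of how to query it follows; the helper names `bits_per_byte` and `bits_in` are made up for illustration and are not part of any library.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: number of bits in one byte on the target platform.
// The standard guarantees this is at least 8, not exactly 8.
constexpr int bits_per_byte() { return CHAR_BIT; }

// Hypothetical helper: storage width of a type in bits.
// sizeof counts bytes, so the result depends on CHAR_BIT.
template <typename T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }

// Compile-time check of the guarantee described above.
static_assert(bits_per_byte() >= 8, "a byte has at least 8 bits");
```

On most desktop platforms `bits_per_byte()` is 8 and `bits_in<char32_t>()` is 32, but code that assumes those exact values is relying on the platform rather than on the language.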

"Why create an encoding standard that is stored with multiple 0's when say this could be done with 16 bits or 8 bits.."

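Storage cost is only part of the problem. When an encoding is variable-width, it takes time to process. UTF-32 is the ideal when you can afford the space because every code point occupies exactly one 32-bit unit, so there's nothing to decode and you can index straight to the Nth character. UTF-16 and UTF-8 aren't compression in the usual sense; they're variable-width encodings where a code point takes one or more code units, so you have to decode as you go. They reduce storage cost, and the price of that is extra processing.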
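To make the trade-off concrete, here is a rough sketch in C++ of finding the Nth character in each encoding. The function names (`decode_utf8`, `nth_utf8`, `nth_utf32`) are invented for this example, and the decoder assumes well-formed UTF-8 and an in-range index; it does no validation, which a real decoder would need.

```cpp
#include <cstddef>
#include <string>

// Decode the UTF-8 sequence starting at s[i] and advance i past it.
// Assumption: s is well-formed UTF-8 (no validation in this sketch).
char32_t decode_utf8(const std::string& s, std::size_t& i) {
    const auto b0 = static_cast<unsigned char>(s[i++]);
    if (b0 < 0x80) return b0;                          // 1 byte: 0xxxxxxx
    const int extra = (b0 >= 0xF0) ? 3 : (b0 >= 0xE0) ? 2 : 1;
    char32_t cp = b0 & (0x3F >> extra);                // payload bits of the lead byte
    for (int k = 0; k < extra; ++k) {
        // Each continuation byte (10xxxxxx) contributes 6 more bits.
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    }
    return cp;
}

// N-th code point (0-based) of a UTF-8 string. Widths vary, so every
// earlier character has to be decoded first: linear in n.
char32_t nth_utf8(const std::string& s, std::size_t n) {
    std::size_t i = 0;
    char32_t cp = decode_utf8(s, i);
    while (n-- > 0) cp = decode_utf8(s, i);
    return cp;
}

// Same lookup in UTF-32: one element per code point, so it is a plain index.
char32_t nth_utf32(const std::u32string& s, std::size_t n) { return s[n]; }
```

For the four characters a, é, €, z, the UTF-8 form `"a\xC3\xA9\xE2\x82\xACz"` is 7 bytes while the UTF-32 form `U"a\u00E9\u20ACz"` is 16 bytes; asking for index 2 walks the first three characters in UTF-8 but is a single array access in UTF-32.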

furalise 0 Junior Poster in Training · Answer 1 · 2013-09-01T09:37:16+00:00

Thanks very much. I am reading through the O'Reilly book now, which is quite extensive, so I have no doubt there will be questions.

furalise 0 Junior Poster in Training · Answer 2 · 2013-09-05T13:44:14+00:00

I am a little confused about something. I now understand what UTF-32, UTF-8 and UTF-16 are, but what I can't understand is why UTF-32 was invented in the first place if UTF-8 would be an easy substitute, using multiple bytes only where needed as opposed to a lot of empty wydes in UTF-32, for example.

Surely there was nothing stopping UTF-8 from being used from the beginning. Does anyone know why this didn't happen?

Thanks

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 3 · 2013-09-05T13:58:51+00:00

Because UTF-8 didn't exist in the beginning? It was proposed years later.

furalise 0 Junior Poster in Training · Answer 4 · 2013-09-05T14:39:02+00:00

I understand that, but it doesn't make sense that it wasn't introduced first. There were always 8 bits in a byte. Why create an encoding standard that is stored with multiple 0's when say this could be done with 16 bits or 8 bits.. Maybe it was simply to differentiate software in the beginning (a Microsoft way of thinking).

furalise 0 Junior Poster in Training · Answer 5 · 2013-09-06T02:47:58+00:00

Thanks.. That explains it well.
