Unicode ord and unichr

Question

Peagles 0 Newbie Poster

18 Years Ago

Hi!

How can I recover the original integer from variable a when I use the following code:

a = unichr(275).encode('utf8')

When I try this:

print ord(a)

It raises an error...

python

vegaseat 1,735 DaniWeb's Hypocrite

18 Years Ago

This is going to be tough, because a is a string consisting of two bytes --> '\xc4\x93'. The function ord() only handles a single character, so it raises a TypeError for this two-byte string.

This works ...

>>> b = unichr(275)
>>> b
u'\u0113'
>>> ord(b)
275

Is there a way to decode('utf8')? Sorry, I just don't work with unicode much.
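
To see why ord() complains, it helps to look at what encode('utf8') actually produces. Here is a minimal sketch, assuming Python 3 (where unichr() is spelled chr() and the encoded result is a bytes object); the variable names are only illustrative:

```python
# Assumption: Python 3, so chr() replaces the Python 2 unichr().
encoded = chr(275).encode('utf8')

print(encoded)        # the UTF-8 bytes for U+0113
print(len(encoded))   # 2: two bytes, not one character

try:
    ord(encoded)      # ord() accepts exactly one character (or one byte)
except TypeError as err:
    print('ord() failed:', err)

# Decoding first gives back a single character that ord() can handle.
code_point = ord(encoded.decode('utf8'))
print(code_point)     # 275
```

The integer is only recoverable after decoding, because the encoded form spreads one code point over several bytes.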

ghostdog74 57 Junior Poster

18 Years Ago

You could try this:

>>> a = unichr(275).encode('utf8')
>>> b = a.decode('utf8')
>>> b
u'\u0113'
>>> ord(b)
275

Peagles 0 Newbie Poster · Answer 1 · 2006-09-20T16:36:02+00:00

OK, thanks, this has helped me understand the problem and partially solve it. However, there is an additional problem.

When I concatenate several UTF-8 encoded characters like this:

a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
for i in e:
    print i

How can I retrieve the original integers (88, 257, 109, 258) from string e?
Since the characters above 255 are each encoded as two bytes, how can I then determine which bytes belong together, so I can decode them together (using ord())?

In other words, can I split the string in such a way that each piece contains a 'whole' character?
I know that Flash ActionScript can do this, so this advanced language must be able to do the same.
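
On the question of which bytes belong together: in UTF-8 the first byte of every character announces how many bytes the character occupies, so the split can be done by hand. The sketch below assumes Python 3 and well-formed UTF-8 input; split_utf8 is a hypothetical helper written for illustration, not a library function. In practice decode('utf8') does this grouping for you.

```python
def split_utf8(data):
    """Split UTF-8 bytes into one bytes chunk per character.

    Hypothetical helper; assumes the input is well-formed UTF-8.
    """
    chunks = []
    i = 0
    while i < len(data):
        lead = data[i]                  # an int in Python 3
        if lead < 0x80:                 # 0xxxxxxx: single-byte (ASCII) character
            size = 1
        elif lead >> 5 == 0b110:        # 110xxxxx: starts a 2-byte sequence
            size = 2
        elif lead >> 4 == 0b1110:       # 1110xxxx: starts a 3-byte sequence
            size = 3
        elif lead >> 3 == 0b11110:      # 11110xxx: starts a 4-byte sequence
            size = 4
        else:
            raise ValueError('invalid lead byte at offset %d' % i)
        chunks.append(data[i:i + size])
        i += size
    return chunks

# Same four characters as in the post, built with chr() instead of unichr().
e = b''.join(chr(n).encode('utf8') for n in (88, 257, 109, 258))
chunks = split_utf8(e)
numbers = [ord(chunk.decode('utf8')) for chunk in chunks]
print(chunks)
print(numbers)   # [88, 257, 109, 258]
```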

ghostdog74 57 Junior Poster · Answer 2 · 2006-09-20T19:43:53+00:00

....
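# e is a UTF-8 byte string; decode it to unicode before calling ord()
# (string_escape leaves e unchanged here, since it contains no backslash escapes)
dec = e.decode('string_escape').decode('utf8')
for i in dec: print ord(i),   # prints 88 257 109 258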
...

Peagles 0 Newbie Poster · Answer 3 · 2006-09-20T21:41:27+00:00

Thank you so much!

Ene Uran 638 Posting Virtuoso · Answer 4 · 2006-09-20T22:57:12+00:00

This seems to work too:

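a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
print
# iterating over the encoded string gives single bytes, not whole characters
for byte in e:
    print byte
print
# decode first, then each item is one unicode character
for i in e.decode('utf8'):
    print ord(i)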
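
For anyone reading this on a newer interpreter, the same idea can be sketched in Python 3. This is an assumption about the reader's setup, not part of the original posts: chr() replaces unichr(), encode() returns a bytes object, and the names below are illustrative.

```python
# Assumption: Python 3; chr() replaces unichr(), and encode() returns bytes.
originals = [88, 257, 109, 258]
e = b''.join(chr(n).encode('utf8') for n in originals)

# Iterating over bytes yields individual byte values, not characters.
byte_values = list(e)

# Decoding first groups the bytes back into whole characters.
recovered = [ord(ch) for ch in e.decode('utf8')]

print(byte_values)   # six byte values for four characters
print(recovered)     # [88, 257, 109, 258]
```

The byte list is longer than the character list because 257 and 258 each take two bytes in UTF-8.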