Hi!

How can I get the original integer back from variable a when I use the following code:

a = unichr(275).encode('utf8')

When I try this:

print ord(a)

It raises an error...

Last Post by Ene Uran

This is going to be tough, because a is now a UTF-8 encoded string of two bytes --> '\xc4\x93'. The function ord() only accepts a single character, so it raises an error for a two-byte string.
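To make the two bytes visible, here is a minimal sketch. It is written to run on both Python 2 and Python 3, which is an assumption beyond the original posts; the name to_char is a hypothetical helper, not part of Python.

```python
# Pick whichever "code point -> character" function exists.
try:
    to_char = unichr  # Python 2
except NameError:
    to_char = chr     # Python 3

encoded = to_char(275).encode('utf8')

# bytearray yields plain integers on both Python 2 and 3
byte_values = list(bytearray(encoded))

print(len(encoded))   # 2
print(byte_values)    # [196, 147], i.e. 0xc4 and 0x93
```

Since the encoded value has length 2, ord() cannot be applied to it directly.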

This works ...

>>> b = unichr(275)
>>> b
u'\u0113'
>>> ord(b)
275

Is there a way to use decode('utf8')? Sorry, I just don't work with Unicode much.


You could try this:

>>> a = unichr(275).encode('utf8')
>>> b = a.decode('utf8')
>>> b
u'\u0113'
>>> ord(b)
275
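The same round trip could be wrapped in a small function. This is only a sketch: codepoint_from_utf8 is a made-up name, and it assumes the input holds exactly one UTF-8 encoded character.

```python
def codepoint_from_utf8(data):
    """Return the integer code point of a single UTF-8 encoded character."""
    text = data.decode('utf8')  # bytes -> unicode text
    if len(text) != 1:
        raise ValueError('expected exactly one character, got %d' % len(text))
    return ord(text)

print(codepoint_from_utf8(b'\xc4\x93'))  # 275
```

The length check is there because ord() would otherwise fail with a less helpful error when more than one character is passed in.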

OK, thanks, this has helped me understand the problem and partially solve it. However, there is an additional problem.

When I concatenate several UTF-8 encoded characters like this:

a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
for i in e:
    print i

How can I retrieve the original integers (88, 257, 109, 258) from string e?
Since every character above 127 is encoded as more than one byte, how can I then determine which bytes belong together, so I can decode them together (and use ord())?

In other words, can I split the string in a certain way, so that it contains 'whole' characters?
I know that Flash ActionScript can do this, so this advanced language must be able to do the same.

....
# decode the UTF-8 bytes back to unicode; the 'string_escape' step is not needed here
dec = e.decode('utf8')
for i in dec: print ord(i),
...

This seems to work too:

a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
print
for c in e:
    print c
print
for i in e.decode('utf8'):
    print ord(i)
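
On the question of which bytes belong together: in UTF-8, every continuation byte has the bit pattern 10xxxxxx, so any other byte starts a new character. The sketch below splits on that rule. The name split_utf8 is hypothetical, the input is assumed to be valid UTF-8, and in practice decode('utf8') as shown above already does this work for you.

```python
def split_utf8(data):
    """Split UTF-8 bytes into chunks, one chunk per character."""
    chunks = []
    for value in bytearray(data):
        if (value & 0xC0) == 0x80 and chunks:
            # continuation byte (10xxxxxx): belongs to the previous character
            chunks[-1].append(value)
        else:
            # ASCII byte or lead byte: starts a new character
            chunks.append(bytearray([value]))
    return [bytes(chunk) for chunk in chunks]

e = b'X\xc4\x81m\xc4\x82'  # UTF-8 for code points 88, 257, 109, 258
parts = split_utf8(e)
codes = [ord(part.decode('utf8')) for part in parts]
print(codes)  # [88, 257, 109, 258]
```

Decoding the whole string at once and calling ord() on each character gives the same integers with less code, so the manual split is mainly useful for seeing how the bytes are grouped.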