DaniWeb IT Discussion Community
Peagles

Unicode ord and unichr

  #1  
Sep 18th, 2006
Hi!

How can I recover the original integer from variable a when I use the following code:
a = unichr(275).encode('utf8')

When I try this:
print ord(a)
it raises an error...
vegaseat

Re: Unicode ord and unichr

  #2  
Sep 19th, 2006
This is going to be tough, because a is not a single character but a string of two bytes --> '\xc4\x93' (the UTF-8 encoding of that character). The function ord() only handles single characters.

This works ...
>>> b = unichr(275)
>>> b
u'\u0113'
>>> ord(b)
275
Is there a way to decode('utf8')? Sorry, I just don't work with unicode much.
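A short sketch of what is going on, written so that it runs on both Python 2 and Python 3 (the unichr fallback is an assumption added for that purpose, not part of the original posts): encoding turns the single character into two bytes, and ord() rejects anything that is not exactly one character.

```python
# Python 3 renamed unichr() to chr(); fall back so the sketch runs on both.
try:
    unichr
except NameError:
    unichr = chr

ch = unichr(275)              # one Unicode character, U+0113
encoded = ch.encode('utf8')   # its UTF-8 form: the two bytes C4 93

print(len(ch))                # 1
print(len(encoded))           # 2
print(ord(ch))                # 275

try:
    ord(encoded)
    ord_failed = False
except TypeError:
    # ord() accepts exactly one character, not a two-byte string
    ord_failed = True
print(ord_failed)             # True
```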
ghostdog74

Re: Unicode ord and unichr

  #3  
Sep 19th, 2006
You could try this:

>>> a = unichr(275).encode('utf8')
>>> b = a.decode('utf8')
>>> b
u'\u0113'
>>> ord(b)
275
Peagles

Re: Unicode ord and unichr

  #4  
Sep 20th, 2006
OK, thanks, this has helped me understand the problem and partially solve it. However, there is an additional problem.

When I concatenate several UTF-8 encoded characters like this:

a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
for i in e:
    print i

How can I retrieve the original integers (88, 257, 109, 258) from string e?
Since every character above 127 is encoded as more than one byte, how can I then determine which bytes belong together, so that I can decode them together (using ord())?

In other words, can I split the string in such a way that each piece contains a 'whole' character?
I know that Flash ActionScript can do this, so an advanced language like this one must be able to do the same.
ghostdog74

Re: Unicode ord and unichr

  #5  
Sep 20th, 2006
....
dec = e.decode('string_escape').decode('utf8')
for i in dec: print ord(i),
...
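The same idea can be wrapped in a small helper. This is only a sketch: code_points is a hypothetical name, and it assumes e holds plain UTF-8 bytes, in which case the 'string_escape' step should not be needed (that codec only matters when the string contains literal backslash escapes, and it does not exist in Python 3).

```python
def code_points(data, encoding='utf8'):
    """Return the integer code points stored in an encoded byte string."""
    # Decode first so each item is a whole character, then take ord() of each.
    return [ord(ch) for ch in data.decode(encoding)]

# Same four characters as in the question: 88, 257, 109, 258
e = u'X\u0101m\u0102'.encode('utf8')
print(code_points(e))   # [88, 257, 109, 258]
```

Decoding once and then looping keeps the multi-byte characters intact, so there is no need to work out by hand which bytes belong together.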
Peagles

Re: Unicode ord and unichr

  #6  
Sep 20th, 2006
Thank you so much!
Ene Uran

Re: Unicode ord and unichr

  #7  
Sep 20th, 2006
This seems to work too:
a = unichr(88).encode('utf8')
b = unichr(257).encode('utf8')
c = unichr(109).encode('utf8')
d = unichr(258).encode('utf8')
print a,b,c,d
e = a+b+c+d
print e
print
for c in e:
    print c
print
for i in e.decode('utf8'):
    print ord(i)
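For anyone curious how the bytes can be grouped by hand, here is a rough sketch. It relies on the UTF-8 rule that continuation bytes always have the bit pattern 10xxxxxx, so any other byte starts a new character. split_utf8 is a hypothetical helper, and it assumes the input is valid UTF-8; in practice decode('utf8') already does this work for you.

```python
def split_utf8(data):
    """Split UTF-8 bytes into one chunk of bytes per character."""
    chunks = []
    for byte in bytearray(data):
        if (byte & 0xC0) == 0x80 and chunks:
            # continuation byte (10xxxxxx): belongs to the previous character
            chunks[-1].append(byte)
        else:
            # ASCII byte or lead byte: starts a new character
            chunks.append(bytearray([byte]))
    return [bytes(chunk) for chunk in chunks]

e = u'X\u0101m\u0102'.encode('utf8')
parts = split_utf8(e)
print([len(p) for p in parts])                 # [1, 2, 1, 2]
print([ord(p.decode('utf8')) for p in parts])  # [88, 257, 109, 258]
```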