Problem to handle the Latin characters in a dictionary

Question

deonis 0 Light Poster

14 Years Ago

Hello! I'm trying to create a dictionary that can handle Latin characters like Å, µ and so on. Apparently I need some sort of encoding to Unicode. Is there a way to handle this problem? See the code below.

wordDic = {
        '_chemical_formula_moiety':          'chemical formula                    ',
        '_chemical_formula_weight':          'Fw                                     ',
        '_symmetry_space_group_name_H-M':    'space group                            ',
        '_cell_length_a':                    'a (Å)"                                ',
        '_cell_length_b':                    'b (Å)"                                 ',
        '_cell_length_c':                    'c (Å)"                                ',
        '_cell_angle_alpha':                 '(deg)                                 ',
        '_cell_angle_beta':                  '(deg)                                 ',
        '_cell_angle_gamma':                 '(deg)                                 ',
        '_cell_volume':                      'V (3)                           ',
        '_cell_formula_units_Z':             'Z                                         ',
        '_cell_measurement_temperature':     'T (K)                                       ',
        '_exptl_crystal_density_diffrn':     'calcd (g cm-3)                           ',
        '_exptl_absorpt_coefficient_mu':     ' µ (mm-1)                                     ',
        '_diffrn_radiation_wavelength':      'wavelength ()                          ',
        '_diffrn_reflns_theta_min':          'range (deg)                          ',
        '_refine_ls_R_factor_all':           'R1 [all data]                          ',
        '_refine_ls_R_factor_gt':            'R1a [I > 2s(I)]                        ',
        '_refine_ls_wR_factor_ref':          'wR2 [all data]                         ',
        '_refine_ls_wR_factor_gt':           'wR2b [I > 2s(I)]                         ',
        'refine_ls_goodness_of_fit_ref':     'GOF                                        ',
        'ship': 'slip'}
        wordDic = unicode(wordDic,'latin-1')

jlm699 320 Veteran Poster

14 Years Ago

Here, this should give you some insight into how to declare a source encoding for your whole Python script.
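As a rough illustration of the underlying issue (a Python 3 sketch, not the code from the linked page): in Python 2, unicode() accepts a single byte string, so passing it the whole dictionary raises a TypeError. One option, assuming the labels arrive as Latin-1 encoded bytes, is to decode each key and value separately. The name raw_labels and its two entries are made up for the example.

```python
# Hypothetical input: labels stored as Latin-1 encoded bytes.
raw_labels = {
    b"_cell_length_a": b"a (\xc5)          ",
    b"_exptl_absorpt_coefficient_mu": b"\xb5 (mm-1)     ",
}

# Decode every key and value on its own; a dict cannot be decoded in one call.
word_dic = {
    key.decode("latin-1"): value.decode("latin-1").strip()
    for key, value in raw_labels.items()
}

# ascii() keeps the output printable even on a console without these glyphs.
for key in sorted(word_dic):
    print(key, ascii(word_dic[key]))
```

In Python 3, string literals in the source are already Unicode, so this decoding step only matters when the text comes in as bytes (from a file, a socket, and so on).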

vernondcole 0 Newbie Poster

14 Years Ago

You can generate any Unicode code point (in any language) by looking up the character code in the Unicode chart and keying the number into the string. For example, the code for Greek upper case Pi is 03A0 (the values are in hexadecimal) and lower case is 03C0.
So you can write:

# this is for Python 2.6
x = u'The lower case of \u03a0 is \u03c0'
print x

# in Python 3.1 all strings are unicode,  so:
x = 'The lower case of \u03a0 is \u03c0'
print(x)

You can find the code values at
http://unicode.org/charts/
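If you would rather not look values up by hand, the standard-library unicodedata module can map between characters, official names and code points. This is a small Python 3 sketch; the variable names are only for illustration.

```python
import unicodedata

# Look a character up by its official Unicode name.
pi_upper = unicodedata.lookup("GREEK CAPITAL LETTER PI")

# Format its code point the way the charts print it (four hex digits).
code_point = "{:04X}".format(ord(pi_upper))
print(code_point)

# Go the other way: from a character to its name.
for char in ("\u00c5", "\u00b5", "\u03c0"):
    print("U+{:04X}".format(ord(char)), unicodedata.name(char))
```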

Now, the catch is that you must be running your code in an environment that can actually print the characters you select. The above code works fine inside the interactive window of the pywin32 editor, which runs in a Windows GUI window. If I try the same thing at an interactive prompt in a 'DOS' command window (whose code page, cp437 in this case, cannot represent \u03a0), I get:

>>> print(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python31\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u03a0' in position 18: character maps to <undefined>

Good Luck.
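One way to keep a script from crashing on a limited console is to encode with an error handler before printing. The helper below is a Python 3 sketch; the name to_console_safe is made up for this example, and it assumes you are happy to see escape sequences in place of characters the console cannot show.

```python
import sys


def to_console_safe(text, encoding=None):
    """Return text with characters the encoding lacks turned into \\uXXXX escapes."""
    # Fall back to ASCII when the stream reports no encoding at all.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


x = "The lower case of \u03a0 is \u03c0"

# Uses whatever encoding the current console reports, so this never raises.
print(to_console_safe(x))

# What the same string would become on a cp437 or a plain ASCII console.
as_cp437 = to_console_safe(x, "cp437")
as_ascii = to_console_safe(x, "ascii")
```

Code page 437 happens to contain the lower case pi but not the upper case one, which is why the traceback above complains only about position 18.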

deonis 0 Light Poster · Answer 1 · 2009-12-29T06:41:37+00:00

Thanks, it was very helpful!

deonis 0 Light Poster · Answer 2 · 2009-12-29T09:41:18+00:00

It seems that placing # coding: latin-1 at the beginning of my code solves the problem with Latin characters, but I still have a problem with Greek ones. Is there a universal way to handle both at the same time? I also need to be able to write these Greek characters to a text file!

Any help??

deonis 0 Light Poster · Answer 3 · 2009-12-29T09:44:19+00:00

Any Help ????????

jlm699 320 Veteran Poster · Answer 4 · 2009-12-29T19:48:05+00:00

I'm not an expert on encodings, but isn't Unicode the "universal" encoding?

d5e5 109 Master Poster · Answer 5 · 2009-12-29T22:35:27+00:00

Have you tried # -*- coding: utf-8 -*- ? According to this Unicode HOWTO page, the UTF-8 encoding can handle any Unicode code point. As vernondcole says, when testing you can't rely on IDLE or DOS to display the characters correctly, no matter what encoding you use. To test, write the output to a text file and open it with a text editor that can handle UTF-8 encoded files. I use ActiveState's Komodo Edit.
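A minimal Python 3 sketch of that test, assuming a throwaway file in a temporary directory. The file name labels.txt and the pi_example entry are invented for the example; the other two labels are taken from the dictionary in the question.

```python
import os
import tempfile

labels = {
    "_cell_length_a": "a (\u00c5)",
    "_exptl_absorpt_coefficient_mu": "\u00b5 (mm-1)",
    "pi_example": "\u03a0 / \u03c0",  # hypothetical entry with Greek letters
}

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "labels.txt")

# Name the encoding explicitly instead of relying on the platform default.
with open(path, "w", encoding="utf-8") as handle:
    for key in sorted(labels):
        handle.write("{}\t{}\n".format(key, labels[key]))

# Read the file back with the same encoding to confirm the round trip.
with open(path, "r", encoding="utf-8") as handle:
    lines = handle.read().splitlines()

print(path)
```

Because UTF-8 covers every code point, the Latin and Greek characters can live in the same file. In Python 2 the equivalent would be io.open (or codecs.open) with unicode strings.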

deonis 0 Light Poster · Answer 6 · 2009-12-29T23:57:20+00:00

Thanks, guys, I very much appreciate your help! In the end my problem was with the text editor I use to write my code. Apparently DrPython does not support Greek characters and generates an error when running:

x = u'The lower case of \u03a0 is \u03c0'
print x

UnicodeEncodeError: 'ascii' codec can't encode character u'\u03a0' in position 18: ordinal not in range(128)
At the same time, running the same code in IDLE displays the following output:
"The lower case of Π is π"
Thanks a lot once again!
