Hello All. I am quite new to programming, and I have a question about creating a script that reads data from a file and then prints how many times each value (a number, in my case) occurs.

I was wondering if you could help me write a similar script. However, unlike the gentleman who posted earlier, my data set contains numbers with decimals, for example:
8.82728929832475D-004
0.373223841916055
4.06291083143060D-002
0.102697923377643
5.90299902998276D-002
2.20778238968052D-002
1.50498329458987D-002
0.269732448122138
2.15552885207953D-002
3.32530592527098D-002
0.275937019332351

I also have more than ten values (I think that is how many he had); I have about 600-plus values.

Much Thanks In Advance.


Here's how you could iterate through a file:

fh = open( 'myFile.txt', 'r' )
inp = fh.readlines()
fh.close()

for each_line in inp:
    # Do something here
    pass

All you'll need to do is make a counter and increment it when you find what you're looking for.
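
As a rough sketch of that idea, here is one way the counter could look. The function name count_value and the sample lines are made up for illustration; in practice the lines would come from fh.readlines().

```python
def count_value(lines, target):
    """Count how many lines equal target once surrounding whitespace is stripped."""
    counter = 0
    for each_line in lines:
        # readlines() keeps the trailing newline, so strip before comparing
        if each_line.strip() == target:
            counter += 1
    return counter

# Hypothetical sample standing in for the result of fh.readlines()
sample = ["0.269732448122138\n", "0.102697923377643\n", "0.269732448122138\n"]
print(count_value(sample, "0.269732448122138"))
```

This only counts one value at a time; to count every value in a single pass you would want a dictionary of counters instead.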

I suggest you take a look at the Natural Language Toolkit's Python book. It's an excellent newbie intro to Python. Here's the counter they demonstrate:

phrase = 'colorless green ideas sleep furiously'
count = {}
for letter in phrase:
    if letter not in count:
        count[letter] = 0
    count[letter] += 1

Output =
>>> count
{'a': 1, ' ': 4, 'c': 1, 'e': 6, 'd': 1, 'g': 1, 'f': 1, 'i': 2, 'l': 4, 'o': 3, 'n': 1, 'p': 1, 's': 5, 'r': 3, 'u': 2, 'y': 1}

Slight variation of ChrisP_Buffalo's suggestion:

# create number:frequency dictionary

num_freq = {}
for num in file("floats.txt"):
    #print num, type(num)
    # strip trailing new lines
    num = num.rstrip()
    num_freq[num] = num_freq.get(num, 0) + 1

#print num_freq

print '-'*30

# create sorted list of (freq, number) tuples
# high freq first, show freq then number
vals = sorted([(v, k) for k, v in num_freq.items()], reverse=True)
for val in vals:
    print val[0], val[1]

"""
this data file:
8.82728929832475D-004
0.373223841916055
0.269732448122138
4.06291083143060D-002
0.102697923377643
5.90299902998276D-002
2.20778238968052D-002
1.50498329458987D-002
0.269732448122138
2.15552885207953D-002
3.32530592527098D-002
0.275937019332351
0.102697923377643

would show:
2 0.269732448122138
2 0.102697923377643
1 8.82728929832475D-004
1 5.90299902998276D-002
1 4.06291083143060D-002
1 3.32530592527098D-002
1 2.20778238968052D-002
1 2.15552885207953D-002
1 1.50498329458987D-002
1 0.373223841916055
1 0.275937019332351

"""

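One thing to keep in mind about the script above: it counts the lines as text, so the same number written two different ways (say 8.82728929832475D-004 and 8.82728929832475E-04) would be counted separately. If that matters for your data, a possible sketch is to convert each line to a float first. I am assuming here that the D in your file is a Fortran-style double-precision exponent marker, which Python's float() does not accept directly; the function names are just placeholders.

```python
def parse_fortran_float(text):
    """Convert text such as '8.82728929832475D-004' to a Python float."""
    # Assumption: 'D' (or 'd') marks a Fortran-style double-precision exponent
    return float(text.strip().replace("D", "E").replace("d", "e"))

def count_numbers(lines):
    """Return a {float: frequency} dictionary for the non-blank lines."""
    num_freq = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        num = parse_fortran_float(line)
        num_freq[num] = num_freq.get(num, 0) + 1
    return num_freq

# Hypothetical sample: the first two lines are the same number written two ways
sample = ["8.82728929832475D-004\n", "8.82728929832475E-04\n", "0.102697923377643\n"]
freq = count_numbers(sample)
```

Two values are only treated as equal here if they convert to exactly the same float, so numbers that differ in the last digit still count as different.
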
Thanks Ene, this is exactly what I need. However, when I run the script as is, I get this error:

Traceback (most recent call last):
  File "new2.py", line 30, in ?
    vals = sorted([(v, k) for k, v in num_freq.items()], reverse=True)
NameError: name 'sorted' is not defined

sorted() should be built into your Python distribution (it was added in Python 2.4). What version are you using? Try opening an interpreter and typing sorted([1,3,2,5,4,7,6]) to see whether you get the same error or it returns the sorted list.

I indeed do get the same error. I am using version 2.2.3. Is there any way for me to get around this?

The following works in Python 2.3 and may work in 2.2.

alist = [1,3,2,5,4,7,6]
alist.sort()
alist.reverse()

Note that the methods sort() and reverse() reorder the list object alist in place.
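
Applied to the frequency script, the same trick might look like the sketch below. The small num_freq dictionary is made-up sample data standing in for the one the script builds from the file, and I have not tested this on 2.2.3 itself.

```python
# Hypothetical frequencies, standing in for the num_freq dictionary built from the file
num_freq = {"0.269732448122138": 2, "0.373223841916055": 1, "0.102697923377643": 2}

# Build the list of (freq, number) tuples, then sort and reverse it in place
# instead of calling sorted(..., reverse=True)
vals = [(v, k) for k, v in num_freq.items()]
vals.sort()
vals.reverse()

# High frequency first, show freq then number
for val in vals:
    print("%s %s" % (val[0], val[1]))
```

The print line is written with parentheses and a format string so that it behaves the same under old and new Python versions.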

It might be time to upgrade to the current version 2.5.2. Your version is very, very old!

In the meantime, try solsteel's approach.
