Hi guys, I'm stuck on this question; would anyone care to help me? (It's a CSV file.) How many times does the least common string appear in the field [gemstone]?

valid gemstone owing mark channels code
No Diamond 26.06 20 218 (KJQ:10E)2
Yes Diamond 15.43 25 36 (DRX:25H)7
No Sapphire 11.44 51 141 (XKL:31L)4
No Zircon 79.68 23 60 (BYA:26Ix8
No Moonstone 79.41 11 67 (BWC:79L)4
Yes Garnet 109.69 17 215 (ECO:67B)5
No Pearl 61.16 30 128 (GAZ:36A)6
Yes Opal 44.2 9 162 (ZPK:85T)8
No Emerald 103.13 70 181 (GYN:99F)3
No Ruby 59.09 3 48 (UKI:10I)6
No Amethyst 36.94 35 71 (IUQ:42Z)3
No Diamond 5.45 79 28 (EHX:17T)8
No Amethyst 102.26 11 224 (ZTG:03L)8
No Sapphire 13.01 7 62 (ZBY:68T)9
No Garnet 121.96 93 43 (KHP:59H)3
Yes Garnet 47.13 31 156 (LnQ:15E)1
No Emerald 119.71 30 219 (IWZ:72J)4
No Zircon 38.99 120 98 (VTK53Q)1
No Pearl 91.76 135 80 (STE:05N)8
No Garnet 26.18 13 154 (KTD:90A)1

First, build a list of all the gemstones:

gemstones=[]
with open("your file name") as f:
    next(f)  # skip the header row
    for line in f:
        gemstones.append(line.split(',')[1].strip())

Now build a dictionary where the key is the gemstone and the value is its count:

gcount={}
for g in gemstones:
    gcount[g]=gcount.get(g,0)+1
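
To get from that dictionary to the answer, one option is to take the smallest count and then collect every gemstone that has it. This is only a minimal sketch: the gemstone column from the sample data is typed in by hand rather than read from the file, and the names `least` and `rarest` are illustrative.

```python
# Gemstone column from the sample data above, header row left out.
gemstones = ["Diamond", "Diamond", "Sapphire", "Zircon", "Moonstone",
             "Garnet", "Pearl", "Opal", "Emerald", "Ruby", "Amethyst",
             "Diamond", "Amethyst", "Sapphire", "Garnet", "Garnet",
             "Emerald", "Zircon", "Pearl", "Garnet"]

# Same counting idiom as above: key is the gemstone, value is its count.
gcount = {}
for g in gemstones:
    gcount[g] = gcount.get(g, 0) + 1

least = min(gcount.values())  # smallest count in the dictionary
rarest = sorted(g for g, n in gcount.items() if n == least)  # all names tied at that count

print(least, rarest)  # prints: 1 ['Moonstone', 'Opal', 'Ruby']
```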

Edited 3 Years Ago by rrashkin

This might give you some hints:

'''
file gemstone.csv looks like this:
valid gemstone owing mark channels code
No Diamond 26.06 20 218 (KJQ:10E)2
Yes Diamond 15.43 25 36 (DRX:25H)7
No Sapphire 11.44 51 141 (XKL:31L)4
No Zircon 79.68 23 60 (BYA:26Ix8
No Moonstone 79.41 11 67 (BWC:79L)4
Yes Garnet 109.69 17 215 (ECO:67B)5
No Pearl 61.16 30 128 (GAZ:36A)6
Yes Opal 44.2 9 162 (ZPK:85T)8
No Emerald 103.13 70 181 (GYN:99F)3
No Ruby 59.09 3 48 (UKI:10I)6
No Amethyst 36.94 35 71 (IUQ:42Z)3
No Diamond 5.45 79 28 (EHX:17T)8
No Amethyst 102.26 11 224 (ZTG:03L)8
No Sapphire 13.01 7 62 (ZBY:68T)9
No Garnet 121.96 93 43 (KHP:59H)3
Yes Garnet 47.13 31 156 (LnQ:15E)1
No Emerald 119.71 30 219 (IWZ:72J)4
No Zircon 38.99 120 98 (VTK53Q)1
No Pearl 91.76 135 80 (STE:05N)8
No Garnet 26.18 13 154 (KTD:90A)1
'''

from collections import Counter
import pprint

with open("gemstone.csv") as fin:
    gem_list = []
    for row in fin:
        gem_list.append(row.split()[1])

# eliminate header/title row
gem_list = gem_list[1:]

gem_cntr = Counter(gem_list).most_common()
pprint.pprint(gem_cntr)

print('-'*24)

# last element
print(gem_cntr[-1:])

'''
[('Garnet', 4),
 ('Diamond', 3),
 ('Zircon', 2),
 ('Pearl', 2),
 ('Amethyst', 2),
 ('Sapphire', 2),
 ('Emerald', 2),
 ('Opal', 1),
 ('Moonstone', 1),
 ('Ruby', 1)]
------------------------
[('Ruby', 1)]
'''
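
Note that `gem_cntr[-1:]` shows only one of the gemstones tied for last place, and which of the tied entries ends up last may differ between Python versions. A small sketch that lists every tied entry, assuming a `Counter` built the same way; the short `gem_list` here is made up for illustration.

```python
from collections import Counter

# Hypothetical short list standing in for the gemstone column.
gem_list = ["Garnet", "Opal", "Garnet", "Ruby", "Diamond", "Diamond", "Garnet"]

gem_cntr = Counter(gem_list).most_common()  # sorted from most to least common
min_count = gem_cntr[-1][1]                 # count of the last (rarest) entry
# Collect every name that shares the smallest count; sort for a stable order.
tied = sorted(name for name, count in gem_cntr if count == min_count)

print(min_count, tied)  # prints: 1 ['Opal', 'Ruby']
```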

I believe that function counts the characters in the word "gemstone", but that's not what the question was asking; gemstone is a field name.

wb = open('filename.csv','r')

linecount = 0
firstline = True
for line in wb:
    # Skip the first line so the header (the field names) is ignored
    # and only the data rows are processed
    if firstline:
        firstline = False
        continue
    # Split the row into fields by position and strip surrounding spaces
    linecount += 1
    field = line.split(',')

    gemstone = field[1].strip()

This is what I've done so far; I just need a function that counts the least common string in the field gemstone.

Thank you very much.

It's still not working, as I get:

[('11.44,51,141,(XKL:31L)4', 1),
 ('26.18,13,154,(KTD:90A)1', 1),
 ('79.68,23,60,(BYA:26Ix8', 1),
 ('26.06,20,218,(KJQ:10E)2', 1),
 ('102.26,11,224,(ZTG:03L)8', 1),
 ('103.13,70,181,(GYN:99F)3', 1),
 ('91.76,135,80,(STE:05N)8', 1),
 ('38.99,120,98,(VTK53Q)1', 1),
 ('36.94,35,71,(IUQ:42Z)3', 1),
 ('121.96,93,43,(KHP:59H)3', 1),
 ('79.41,11,67,(BWC:79L)4', 1),
 ('13.01,7,62,(ZBY:68T)9', 1),
 ('15.43,25,36,(DRX:25H)7', 1),
 ('119.71,30,219,(IWZ:72J)4', 1),
 ('109.69,17,215,(ECO:67B)5', 1),
 ('59.09,3,48,(UKI:10I)6', 1),
 ('5.45,79,28,(EHX:17T)8', 1),
 ('47.13,31,156,(LnQ:15E)1', 1),
 ('61.16,30,128,(GAZ:36A)6', 1),
 ('44.20,9,162,(ZPK:85T)8', 1)]
------------------------
[('44.20,9,162,(ZPK:85T)8', 1)]

Sneekula's answer should pretty well solve your problem. Are you sure you have read it?
Opal, Ruby and Moonstone appear once and Diamond appears 3 times.
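
The output in the previous post suggests the real file is comma-separated, so `row.split()` (which splits on whitespace) does not isolate the gemstone column there. One way around that is the standard `csv` module, selecting the column by its header name. A hedged sketch, assuming the header row is also comma-separated and contains a `gemstone` column; the sample is the first three rows of the data written in that assumed layout, and `io.StringIO` stands in for an open file.

```python
import csv
import io
from collections import Counter

# First three data rows, written in the assumed comma-separated layout.
sample = io.StringIO(
    "valid,gemstone,owing,mark,channels,code\n"
    "No,Diamond,26.06,20,218,(KJQ:10E)2\n"
    "Yes,Diamond,15.43,25,36,(DRX:25H)7\n"
    "No,Sapphire,11.44,51,141,(XKL:31L)4\n"
)

# skipinitialspace tolerates a blank after each comma, e.g. "No, Diamond".
reader = csv.DictReader(sample, skipinitialspace=True)
counts = Counter(row["gemstone"].strip() for row in reader)

min_count = min(counts.values())
rarest = sorted(g for g, n in counts.items() if n == min_count)

print(min_count, rarest)  # prints: 1 ['Sapphire']
```

With a real file, `open("gemstone.csv", newline="")` would take the place of the `io.StringIO` object.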
