How to sort word frequency (from a file) in decreasing order? I need help

ZZucker
Mar 14th, 2008
There are mistakes in your code; for example, your last line should be i = i + 1. Also, string functions have been built in since version 2.2, so module re is not needed.

Here is one way to do this with Python 2.5:
# count words in a text and show the first ten items
# by decreasing frequency

# sample text for testing
text = """\
My name is Fred Flintstone and I am a famous TV
star. I have as much authority as the Pope, I
just don't have as many people who believe it.
"""

word_freq = {}

word_list = text.split()

for word in word_list:
    # word all lower case
    word = word.lower()
    # strip any trailing period or comma
    word = word.rstrip('.,')
    # build the dictionary
    count = word_freq.get(word, 0)
    word_freq[word] = count + 1

# create a list of (freq, word) tuples
freq_list = [(freq, word) for word, freq in word_freq.items()]

# sort the list by the first element in each tuple (default)
freq_list.sort(reverse=True)

for n, tup in enumerate(freq_list):
    # print the first ten items
    if n < 10:
        freq, word = tup
        print freq, word
        # or
        #print word, freq

"""
my output -->
3 i
3 as
2 have
1 who
1 tv
1 the
1 star
1 pope
1 people
1 name
"""
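If you are on a newer Python, the same counting can probably be done with less code. This is only a rough sketch, assuming Python 3 and collections.Counter, which does not exist in version 2.5. The function name top_words is made up for illustration. Words with equal counts may come out in a different order than in the output above.

```python
# Sketch for Python 3 (assumption: collections.Counter is available;
# it is not part of Python 2.5)
from collections import Counter


def top_words(text, n=10):
    """Return up to n (word, count) pairs, most frequent first.

    top_words is a hypothetical helper name, not from the original post.
    """
    # lower-case each word and strip any trailing period or comma
    words = (w.lower().rstrip('.,') for w in text.split())
    return Counter(words).most_common(n)


# same sample text as in the post above
text = """\
My name is Fred Flintstone and I am a famous TV
star. I have as much authority as the Pope, I
just don't have as many people who believe it.
"""

for word, count in top_words(text):
    print(count, word)
```

To count words from a file instead, the text could be read first with something like open('words.txt').read(), where words.txt is just a placeholder file name.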
Last edited by ZZucker : Mar 14th, 2008 at 12:17 pm.