Nov 13th, 2007

Trying to compare the contents of two text files and save the difference

I have two text files containing multiple lines of text from a datalogger, and I need to compare the two files and save the difference into a third text file.

For example:

text1:
10/13/01, 21:34:23, 4324
10/14/01, 09:12:32, 3423
10/15/01, 04:45:54, 7834

text2:
10/12/01, 43:34:34, 6453
10/13/01, 21:34:23, 4324
10/14/01, 09:12:32, 3423
10/15/01, 04:45:54, 7834
10/16/01, 05:34:26, 8323

text3:
10/12/01, 43:34:34, 6453
10/16/01, 05:34:26, 8323

I can already do this with a bash script, but since the rest of my code is in Python I would rather stick to Python alone. Any advice would be great!

Thanks
the1last

Nov 13th, 2007

Re: Trying to compare the contents of two text files and save the difference

You can use lists or two sets. With sets you would want both set1.difference(set2) and set2.difference(set1), so that lines unique to either file are caught. You could also read both files in step, the way a merge sort would, but the set solution seems more Pythonic. It depends on how large the files are, though.
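The set idea above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not code from the thread: the function names `lines_only_in_one` and `save_difference` are made up for the example, and it assumes each file fits comfortably in memory.

```python
def lines_only_in_one(lines1, lines2):
    """Return the lines that occur in exactly one of the two sequences.

    Sets give fast membership tests; the list comprehensions keep the
    original file order, which a bare set difference would lose.
    """
    set1, set2 = set(lines1), set(lines2)
    only_in_2 = [line for line in lines2 if line not in set1]
    only_in_1 = [line for line in lines1 if line not in set2]
    return only_in_2 + only_in_1


def save_difference(path1, path2, out_path):
    """Read two text files and write the differing lines to a third."""
    with open(path1) as f1, open(path2) as f2:
        lines1 = f1.read().splitlines()
        lines2 = f2.read().splitlines()
    diff = lines_only_in_one(lines1, lines2)
    with open(out_path, "w") as out:
        out.writelines(line + "\n" for line in diff)
    return diff


# The sample data from the first post.
text1 = [
    "10/13/01, 21:34:23, 4324",
    "10/14/01, 09:12:32, 3423",
    "10/15/01, 04:45:54, 7834",
]
text2 = [
    "10/12/01, 43:34:34, 6453",
    "10/13/01, 21:34:23, 4324",
    "10/14/01, 09:12:32, 3423",
    "10/15/01, 04:45:54, 7834",
    "10/16/01, 05:34:26, 8323",
]

print(lines_only_in_one(text1, text2))
# ['10/12/01, 43:34:34, 6453', '10/16/01, 05:34:26, 8323']
```

With real files this would be called as something like save_difference("text1.txt", "text2.txt", "text3.txt"), where the file names are placeholders. Because sets only record whether a line is present, a line that is repeated in one file but appears once in the other is not reported.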
woooee

Nov 13th, 2007

Re: Trying to compare the contents of two text files and save the difference

Vegaseat left this example of the difflib module somewhere in the code snippets:
# find the difference between two texts
# tested with Python24 vegaseat 6/2/2005
# (print calls are written so the script also runs under Python 3)

import difflib
import sys

text1 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Spell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Fashion
Ralph Nader's List of Pleasures
"""

text2 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Sell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Passion
Ralph Nader's List of Pleasures
"""

# create a list of lines in text1, keeping the line endings
text1Lines = text1.splitlines(True)
print("Lines of text1:")
for line in text1Lines:
    sys.stdout.write(line)

print("")

# ditto for text2
text2Lines = text2.splitlines(True)
print("Lines of text2:")
for line in text2Lines:
    sys.stdout.write(line)

print("")

diffInstance = difflib.Differ()
diffList = list(diffInstance.compare(text1Lines, text2Lines))

print('-' * 50)
print("Lines different in text1 from text2:")
for line in diffList:
    # lines prefixed with '-' are present in text1 but not in text2
    if line[0] == '-':
        sys.stdout.write(line)
bumsfeld

Nov 14th, 2007

Re: Trying to compare the contents of two text files and save the difference

Thanks for the advice, guys. Using the difflib module, things are up and running nicely. My only question at this point is how the module would react to files with many entries (say > 2000). I haven't had a chance to set up a test run like this yet, but I plan to soon.
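One way to set up such a test run is to generate the data instead of waiting for a real log. The sketch below is for Python 3 and makes no timing claims; the `make_log` helper and the line format are invented for the illustration, and how difflib behaves on your real files is something to measure.

```python
import difflib


def make_log(start, count):
    """Build `count` made-up datalogger lines, numbered from `start`."""
    return ["entry %05d, reading %d\n" % (n, n * 7 % 1000)
            for n in range(start, start + count)]


old_log = make_log(1, 2500)   # entries 1 .. 2500
new_log = make_log(0, 2502)   # entries 0 .. 2501: one extra line at each end

# Differ marks lines found only in the second sequence with "+ ".
differ = difflib.Differ()
added = [line[2:] for line in differ.compare(old_log, new_log)
         if line.startswith("+ ")]

print(len(old_log), len(new_log), len(added))
print("".join(added))
```

The two lines collected in `added` are the first and last entries of `new_log`, which is what the comparison should report for this data.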
the1last

Nov 14th, 2007

Re: Trying to compare the contents of two text files and save the difference

With some additions to the data, note that the script reports "1. first different line" as a difference even though that line appears in both texts (at different positions), and it does not report "Another line that is different", which is only in text2. Sorting text1Lines and text2Lines should solve the first problem, since difflib seems to compare in file order. That may not matter for your data, since the file appears to be in ascending date order already. If there are lines in the second file that are not in the first, you will also have to add a
diffList = list(diffInstance.compare(text2Lines, text1Lines)) pass. In general, when comparing, you want to know how the comparison is being done.
#!/usr/bin/python

# find the difference between two texts
# tested with Python24 vegaseat 6/2/2005
# (print calls are written so the script also runs under Python 3)

import difflib
import sys

text1 = """The World's Shortest Books:
Human Rights Advances in China
Add some text lines that are not in either
1. first different line
2. line 2 added
3. also a third
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Spell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Fashion
Ralph Nader's List of Pleasures
"""

text2 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Sell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Passion
Ralph Nader's List of Pleasures
Another line that is different
1. first different line
"""

# create a list of lines in text1, keeping the line endings
text1Lines = text1.splitlines(True)
##text1Lines.sort() ## uncomment to sort
print("Lines of text1:")
for line in text1Lines:
    sys.stdout.write(line)
print("")

# ditto for text2
text2Lines = text2.splitlines(True)
##text2Lines.sort() ## uncomment to sort
print("Lines of text2:")
for line in text2Lines:
    sys.stdout.write(line)
print("")

diffInstance = difflib.Differ()
diffList = list(diffInstance.compare(text1Lines, text2Lines))

print('-' * 50)
print("Lines different in text1 from text2:")
for line in diffList:
    # only '-' lines (present in text1, missing from text2) are printed
    if line[0] == '-':
        sys.stdout.write(line)
print("")
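As an alternative to sorting or running the comparison a second time, the output of a single compare() call can be split by prefix: difflib.Differ marks lines unique to the first sequence with "- " and lines unique to the second with "+ ". The Python 3 sketch below uses a shortened version of the data above; the variable names and the final filtering step for moved lines are one possible approach chosen for the illustration, not something from the original posts.

```python
import difflib

text1_lines = [
    "Human Rights Advances in China\n",
    "1. first different line\n",
    "America's Most Popular Lawyers\n",
    "Spotted Owl Recipes by the EPA\n",
]
text2_lines = [
    "Human Rights Advances in China\n",
    "America's Most Popular Lawyers\n",
    "Spotted Owl Recipes by the EPA\n",
    "Another line that is different\n",
    "1. first different line\n",
]

only_in_1, only_in_2 = [], []
for line in difflib.Differ().compare(text1_lines, text2_lines):
    if line.startswith("- "):      # present in text1, missing at this spot in text2
        only_in_1.append(line[2:])
    elif line.startswith("+ "):    # present in text2, missing at this spot in text1
        only_in_2.append(line[2:])

# A line that merely moved shows up on both sides; drop those.
moved = set(only_in_1) & set(only_in_2)
really_only_in_1 = [line for line in only_in_1 if line not in moved]
really_only_in_2 = [line for line in only_in_2 if line not in moved]

print("moved:", sorted(moved))
print("only in text1:", really_only_in_1)
print("only in text2:", really_only_in_2)
```

Here "1. first different line" ends up in `moved` because the comparison is positional, while "Another line that is different" is the only line left in `really_only_in_2`.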
Last edited by woooee; Nov 14th, 2007 at 12:59 pm.
woooee

Jan 28th, 2010

File comparison

Hello,

I am very new to Python and am currently learning the language.
At present I have to compare the two files below and
generate a third file with the percentage difference between the
values.

Kindly help.

With my best regards,
Vani
File1
*ID4U.1 = 3.2516E-11
*ID4U.2 = 9.6499E-15
*ID4U.3 = 9.6499E-15
*ID4U.4 = 9.6499E-15
*ID4U.5 = 9.6499E-15
*ID4U.6 = 9.6499E-15
*ID4U.7 = 9.6499E-15
*ID4U.8 = 1.4720E-14
*ID4U.9 = 2.9930E-14
*ID4U.10 = 1.1154E-13
up to *ID4U.146

File2
id4u.1 = 7.4778456e-10
id4u.2 = 7.4778308e-10
id4u.3 = 7.4778228e-10
id4u.4 = 7.4778228e-10
id4u.5 = 7.4778228e-10
id4u.6 = 7.4778228e-10
id4u.7 = 7.4778228e-10
id4u.8 = 7.4778939e-10
id4u.9 = 7.4780360e-10
id4u.10 = 7.4788812e-10
up to id4u.146



Example:
((Value of *ID4U.1- value of id4u.1)/ (Value of *ID4U.1))*100 or

((3.2516E-11 - 7.4778456e-10)/3.2516E-11)*100
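The formula above could be applied to every matching pair along the following lines. This Python 3 sketch assumes each data line has the form "name = value" and that names match once the leading "*" is dropped and case is ignored; the helper names `parse_values` and `percent_differences` are invented for the example.

```python
def parse_values(lines):
    """Map a normalised name (lower case, no leading '*') to its float value."""
    values = {}
    for line in lines:
        name, sep, number = line.partition("=")
        if not sep:
            continue  # skip lines without "=", such as "upto *ID4U.146"
        values[name.strip().lstrip("*").lower()] = float(number)
    return values


def percent_differences(file1_lines, file2_lines):
    """Return ((value1 - value2) / value1) * 100 for names found in both files."""
    first = parse_values(file1_lines)
    second = parse_values(file2_lines)
    return {name: (first[name] - second[name]) / first[name] * 100
            for name in first if name in second}


file1 = ["*ID4U.1 = 3.2516E-11", "*ID4U.2 = 9.6499E-15"]
file2 = ["id4u.1 = 7.4778456e-10", "id4u.2 = 7.4778308e-10"]

for name, pct in percent_differences(file1, file2).items():
    print("%s = %.4f" % (name, pct))
```

A value of zero in the first file would raise ZeroDivisionError, so real data may need a guard for that case.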

Editor's note:
Please don't hijack older threads with your problems. Write your own thread, title it properly and state your problem and code you have tried.
Last edited by vegaseat; Jan 28th, 2010 at 5:12 pm. Reason: hijack
vani priya

Jul 26th, 2011
Re: Trying to compare the contents of two text files and save the difference
The example code is very simple and useful. Thanks for sharing it.
radk