Trying to compare the contents of two text files and save the difference
Join Date: Nov 2007
Posts: 2
Solved Threads: 0
I have two text files containing multiple lines of text from a datalogger, and I need to compare the two files and save the difference into a third text file.
For example:
text1:
10/13/01, 21:34:23, 4324
10/14/01, 09:12:32, 3423
10/15/01, 04:45:54, 7834
text2:
10/12/01, 43:34:34, 6453
10/13/01, 21:34:23, 4324
10/14/01, 09:12:32, 3423
10/15/01, 04:45:54, 7834
10/16/01, 05:34:26, 8323
text3:
10/12/01, 43:34:34, 6453
10/16/01, 05:34:26, 8323
I am able to accomplish this using a bash script, but since the rest of my code is in Python, I would rather stick to using just Python. Any advice would be great!
Thanks
Vegaseat left this example of the difflib module somewhere in the code snippets:
# find the difference between two texts
# tested with Python24   vegaseat   6/2/2005

import difflib
import sys

text1 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Spell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Fashion
Ralph Nader's List of Pleasures
"""

text2 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Sell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Passion
Ralph Nader's List of Pleasures
"""

# create a list of lines in text1, keeping the line endings
text1Lines = text1.splitlines(True)
print("Lines of text1:")
for line in text1Lines:
    sys.stdout.write(line)

# ditto for text2
text2Lines = text2.splitlines(True)
print("Lines of text2:")
for line in text2Lines:
    sys.stdout.write(line)

# compare the two lists line by line, in order
diffInstance = difflib.Differ()
diffList = list(diffInstance.compare(text1Lines, text2Lines))

print('-' * 50)
print("Lines different in text1 from text2:")
for line in diffList:
    # a leading '-' marks a line that is in text1 but not in text2
    if line[0] == '-':
        sys.stdout.write(line)
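For the datalogger files in the original question, the same difflib.Differ idea could be pointed at files rather than inline strings. The following is only a minimal sketch in Python 3 syntax: the helper names diff_lines and save_differences, and the file names in the usage note, are made up for illustration and do not come from the thread. It keeps both the lines Differ marks with "- " (only in the first file) and those marked "+ " (only in the second).

```python
import difflib


def diff_lines(lines_a, lines_b):
    """Return the lines that Differ reports as present in only one list."""
    changed = []
    for entry in difflib.Differ().compare(lines_a, lines_b):
        # Differ prefixes every entry with a two-character code:
        #   "- " line only in the first list
        #   "+ " line only in the second list
        #   "  " line common to both
        #   "? " hint line that is not part of either input
        if entry.startswith(("- ", "+ ")):
            changed.append(entry[2:])
    return changed


def save_differences(path_a, path_b, path_out):
    """Write the unmatched lines of two text files to a third file."""
    with open(path_a) as file_a, open(path_b) as file_b:
        lines_a = file_a.readlines()
        lines_b = file_b.readlines()
    changed = diff_lines(lines_a, lines_b)
    with open(path_out, "w") as out:
        out.writelines(changed)
    return changed
```

Assuming the two logs are saved as text1.txt and text2.txt (hypothetical names), save_differences("text1.txt", "text2.txt", "text3.txt") would write the two unmatched lines from the example data to text3.txt. Like the snippet above, this compares the lines in file order.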
Join Date: Dec 2006
Posts: 976
Solved Threads: 271
With some additions to the data, note that it reports "1. first different line" as a difference when it is not, and it doesn't find "Another line that is different". Sorting text1Lines and text2Lines should solve the first problem, since difflib compares the lines in file order, so a line that appears in both texts but at different positions is reported as changed. For your data this may not matter, because the files appear to be in ascending date order already. The second problem comes from printing only the lines in text1 that are missing from text2. If there are lines in the 2nd file that are not in the first, you will also have to add a second pass with the arguments swapped:
diffList = list(diffInstance.compare(text2Lines, text1Lines))
In general, when comparing, we want to know how the comparison is being done.
#!/usr/bin/python
# find the difference between two texts
# tested with Python24   vegaseat   6/2/2005

import difflib
import sys

text1 = """The World's Shortest Books:
Human Rights Advances in China
Add some text lines that are not in either
1. first different line
2. line 2 added
3. also a third
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Spell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Fashion
Ralph Nader's List of Pleasures
"""

text2 = """The World's Shortest Books:
Human Rights Advances in China
"My Plan to Find the Real Killers" by OJ Simpson
"Strom Thurmond: Intelligent Quotes"
America's Most Popular Lawyers
Career Opportunities for History Majors
Different Ways to Sell "Bob"
Dr. Kevorkian's Collection of Motivational Speeches
Spotted Owl Recipes by the EPA
The Engineer's Guide to Passion
Ralph Nader's List of Pleasures
Another line that is different
1. first different line
"""

# create a list of lines in text1, keeping the line endings
text1Lines = text1.splitlines(True)
## text1Lines.sort()   ## uncomment to sort
print("Lines of text1:")
for line in text1Lines:
    sys.stdout.write(line)

# ditto for text2
text2Lines = text2.splitlines(True)
## text2Lines.sort()   ## uncomment to sort
print("Lines of text2:")
for line in text2Lines:
    sys.stdout.write(line)

# compare the two lists line by line, in order
diffInstance = difflib.Differ()
diffList = list(diffInstance.compare(text1Lines, text2Lines))

print('-' * 50)
print("Lines different in text1 from text2:")
for line in diffList:
    # a leading '-' marks a line that is in text1 but not in text2
    if line[0] == '-':
        sys.stdout.write(line)
Last edited by woooee; Nov 14th, 2007 at 12:59 pm.
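If the position of a line in the file should not matter, one alternative to sorting is to compare by membership rather than by sequence. This is a sketch under that assumption, written for Python 3. It swaps difflib for plain sets, and unmatched_lines is a hypothetical helper name. Repeated lines count only once, so it would not notice a line that occurs twice in one file and once in the other.

```python
def unmatched_lines(lines_a, lines_b):
    """Return lines found in only one of the two lists, ignoring position."""
    in_a = set(lines_a)
    in_b = set(lines_b)
    # keep the original order within each list; lines from lines_a come first
    only_in_a = [line for line in lines_a if line not in in_b]
    only_in_b = [line for line in lines_b if line not in in_a]
    return only_in_a + only_in_b


first = ["Spotted Owl Recipes by the EPA\n",
         "1. first different line\n",
         "Ralph Nader's List of Pleasures\n"]
second = ["Spotted Owl Recipes by the EPA\n",
          "Ralph Nader's List of Pleasures\n",
          "Another line that is different\n",
          "1. first different line\n"]

for line in unmatched_lines(first, second):
    print(line, end="")
```

With these two short lists, only "Another line that is different" is printed, because "1. first different line" occurs in both lists even though it sits at a different position in each.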