Comparing two text files
Thread Solved |
Join Date: Nov 2008
Posts: 11
Reputation:
Solved Threads: 0
I'm a little stuck on a piece of code that I'm working on. I'm supposed to check the contents of one text file (original) against another text file (filter) to see whether any words match.
What I have now only half works: the code detects a match only when the last word in the original file is the same as the word in the filter file. What puzzles me is that the same word appears several times before the last word, but the code does not flag any of those as matching the filter.
Here's what I have now:
def open_file():
    f = open("c:/temp/test.txt","r")
    g = open("c:/temp/filter.txt","r")
    line = f.readlines()
    line2 = g.readlines()
    array_size = 0
    for loop in line:
        if line[array_size] == line2[0]:
            print 'OFFENSIVE'
            print line[array_size]
        if line[array_size] != line2[0]:
            print 'NOT OFFENSIVE'
            print line[array_size]
        array_size+=1
    g.close()
    f.close()

open_file()
If it helps, here's the original text:
filter
lol
filter
lol
lol
filter
lol
The text that is supposed to be filtered is: "filter".
Any help would be greatly appreciated.
You can try the built-in filter function. Here's what I tried in the interpreter:
>>> a = [ 'filter', 'lol', 'filter', 'lol', 'lol', 'filter', 'lol' ]
>>> b = ['filter']
>>> c = filter(lambda x: if x in b, a)
>>> c
['filter', 'filter', 'filter']
As you can see, it takes each item in the list passed to filter (in this case, "a") and returns a list of the items for which the function passed to it returned True.
In this case, the lambda function returns True if the current item (x) is in list "b". I hope that simplifies your code a lot.

Here are the Dive Into Python links concerning filter and lambda.
Last edited by shadwickman; Jun 29th, 2009 at 4:39 am.
"Two good old boys in a fire-apple red convertible. Stoned. Ripped. Twisted. Good people."
- Hunter S. Thompson
my photography
Join Date: Nov 2008
Posts: 11
Reputation:
Solved Threads: 0
Thanks for the help!
However, I'm really confused by filter and lambda. This is the first time that I've ever touched Python, so I'm quite new to everything it uses.
I tried entering this: c = filter(lambda x: if x in b, a) into the interpreter, but it reports invalid syntax at "if". Why's that?
Oh damn! I made a mistake there. That line should read:
c = filter(lambda x: x in b, a)
That's embarrassing haha... I typed that in wrong. The expression x in b just returns a boolean of whether or not value "x" is an item in list "b", basically "is x in b?". The "if" shouldn't be there because a lambda body has to be a single expression, and "if" starts to define a conditional statement. Sorry about that!
Anyways, lambda functions are just a way of declaring simple functions on-the-go without assigning a name to them.
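To make the "function without a name" idea concrete, here is a small sketch comparing the lambda with an ordinary def. It assumes Python 3 (the thread uses 2.6, where filter already returns a list, so the list() wrapper is only needed on 3.x), and is_in_b is a made-up name used purely for illustration.

```python
# Python 3 sketch; in Python 2.6 filter() already returns a list.
a = ['filter', 'lol', 'filter', 'lol', 'lol', 'filter', 'lol']
b = ['filter']

# Anonymous version: the lambda body is the single expression `x in b`.
c = list(filter(lambda x: x in b, a))

# Equivalent named version (is_in_b is a hypothetical name used only here).
def is_in_b(x):
    return x in b

d = list(filter(is_in_b, a))

print(c)       # ['filter', 'filter', 'filter']
print(c == d)  # True
```

Both calls do the same work; the lambda just saves writing a separate def for a one-line test.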
Last edited by shadwickman; Jun 29th, 2009 at 5:42 am.
Join Date: Nov 2008
Posts: 11
Reputation:
Solved Threads: 0
Ah, I've gotten it to work in the interpreter. However, I'm still rather confused about how to use it when I actually write it into my code.
I've tried several times (while reading the filter and lambda articles), and every attempt returns a single 'filter' result. From there, I've got absolutely no idea how to proceed...
What do you mean, a single "filter" result? filter returns a list... oh wait. Are you using Python 2.x, or are you using Python 3.0? If it's 3.0, then filter actually returns an iterator. Anyways, what did you mean by "single result"?
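For reference, the 2.x versus 3.x difference mentioned here can be checked with a few lines. This is only a sketch and assumes it is run under Python 3; under 2.x, filter would hand back a list straight away.

```python
a = ['filter', 'lol', 'filter']
b = ['filter']

result = filter(lambda x: x in b, a)

# In Python 3 this is a lazy filter object, not a list.
is_list = isinstance(result, list)

# Wrapping it in list() consumes the iterator and gives the list Python 2 returned.
matches = list(result)

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(result)

print(is_list, matches, leftover)
```

So on 3.x, wrap the call in list() once and keep that list if the result is needed more than once.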
Join Date: Nov 2008
Posts: 11
Reputation:
Solved Threads: 0
I'm using 2.6.2 at the moment.
I've attached a picture to go along. I'm not exactly good at explaining all of this, because I'm confused as it is...
Anyway, I've added some additional stuff to my code to allow me to debug it better. Here's the code:
def open_file():
    f = open("c:/temp/test.txt","r")
    g = open("c:/temp/filter.txt","r")
    line = f.readlines()
    line2 = g.readlines()
    # Added this to check what's in memory after
    # reading the files.
    print line
    print '-----------'
    print line2
    # End add
    array_size = 0
    for loop in line:
        # Added one print here for testing:
        print 'Checking: ', loop, ' for ', line2[0]
        # End add
        print cmp(line[array_size],line2[0])
        array_size+=1
    g.close()
    f.close()

open_file()
And I think I've located the problem. The output window shows these in the list that I'm supposed to look through:
['filter \n', 'lol \n', 'filter \n', 'lol \n', 'lol \n', 'filter \n', 'filter']
However, the filter.txt only has:
['filter ']
So, I think it's the \n that is affecting the comparisons. Is there a way to remove the \n in the list?
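A quick way to confirm that suspicion is to print each line with repr(), which makes the newline visible. The sketch below assumes Python 3 and uses io.StringIO objects as stand-ins for the two files; the contents are simplified, hypothetical versions of the ones in the thread (no trailing spaces).

```python
import io

# Stand-ins for the two files (hypothetical contents modeled on the thread):
# every line ends in "\n" except the last one.
original = io.StringIO("filter\nlol\nfilter\nlol\nlol\nfilter\nfilter")
filter_file = io.StringIO("filter")

lines = original.readlines()
banned = filter_file.readlines()[0]

# repr() shows the trailing newline that a plain print hides.
for line in lines:
    print(repr(line), line == banned)

matches = [line == banned for line in lines]
```

With this input only the final line compares equal, because it is the only one without a trailing newline. That would explain why only the last word was ever detected.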
Oh! That's a simple problem. Use the str object's strip() method. It removes all leading and trailing whitespace. Like this:
>>> a = " \t Hello World!\n "
>>> b = a.strip()
>>> b
'Hello World!'
As you can see, all the spaces, tabs, newlines, etc. at either end get removed. When you compare the items in the lists, strip each one first. Alternatively, you can strip each line with a list comprehension when you store the readlines() lists, like so:
line_list = [x.strip() for x in open("filename", "r").readlines()]
That way the lists get stored without any leading or trailing whitespace in their items.
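Putting the pieces together, one possible version of the whole comparison might look like the sketch below. It assumes Python 3, and read_words and classify are made-up helper names, not anything from the original code. Temporary files stand in for c:/temp/test.txt and c:/temp/filter.txt, with contents modeled on the debug output posted above.

```python
import os
import tempfile

def read_words(path):
    """Return the file's lines with surrounding whitespace removed, skipping blank lines."""
    with open(path, "r") as handle:
        return [line.strip() for line in handle if line.strip()]

def classify(original_path, filter_path):
    """Pair each word in the original file with True if it appears in the filter file."""
    banned = set(read_words(filter_path))
    return [(word, word in banned) for word in read_words(original_path)]

# Demo with temporary files standing in for the two files used in the thread.
with tempfile.TemporaryDirectory() as folder:
    original_path = os.path.join(folder, "test.txt")
    filter_path = os.path.join(folder, "filter.txt")
    with open(original_path, "w") as handle:
        handle.write("filter \nlol \nfilter \nlol \nlol \nfilter \nfilter")
    with open(filter_path, "w") as handle:
        handle.write("filter ")
    results = classify(original_path, filter_path)

for word, offensive in results:
    print("OFFENSIVE" if offensive else "NOT OFFENSIVE", word)
```

Using a set for the filter words means the filter file can hold more than one word, and the with blocks close the files without explicit close() calls.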





