Comparing two text files

Thread Solved

Nyaato (#1)
Jun 29th, 2009
I'm a little stuck on a piece of code I'm working on. I'm supposed to check the contents of one text file (the original) against another text file (the filter) to see whether any words match.

What I have now only half works: the code detects a match only on the last word in the original file. What puzzles me is that the same word appears several times before the last one, but the code doesn't flag those earlier occurrences as matching the filter.

Here's what I have now...

    def open_file():
        f = open("c:/temp/test.txt","r")
        g = open("c:/temp/filter.txt","r")
        line = f.readlines()
        line2 = g.readlines()
        array_size = 0
        for loop in line:
            if line[array_size] == line2[0]:
                print 'OFFENSIVE'
                print line[array_size]

            if line[array_size] != line2[0]:
                print 'NOT OFFENSIVE'
                print line[array_size]
            array_size+=1
        g.close()
        f.close()

    open_file()

If it helps, here's the original text:
    filter
    lol
    filter
    lol
    lol
    filter
    lol

The word that is supposed to be filtered out is: "filter".

Any help would be greatly appreciated.

shadwickman (#2)
Jun 29th, 2009
You can try the built-in filter function. Here's what I tried in the interpreter:
    >>> a = [
    ...     'filter',
    ...     'lol',
    ...     'filter',
    ...     'lol',
    ...     'lol',
    ...     'filter',
    ...     'lol'
    ... ]
    >>> b = ['filter']
    >>> c = filter(lambda x: if x in b, a)
    >>> c
    ['filter', 'filter', 'filter']

As you can see, it takes each item in the list passed to filter (in this case, "a") and returns a list of the items for which the function passed to it returned True.
In this case, the lambda function returns True if the current item (x) is in list "b". I hope that simplifies your code a lot.

Here are the Dive Into Python links concerning filter and lambda.
Last edited by shadwickman; Jun 29th, 2009 at 4:39 am.
"Two good old boys in a fire-apple red convertible. Stoned. Ripped. Twisted. Good people."
- Hunter S. Thompson

my photography

Nyaato (#3)
Jun 29th, 2009
Thanks for the help!

However, I'm really confused by filter and lambda. This is the first time that I've ever touched Python, so I'm quite new to everything Python uses.

I tried entering this: c = filter(lambda x: if x in b, a) into the interpreter, but it reports invalid syntax on "if". Why's that?

shadwickman (#4)
Jun 29th, 2009
Oh damn! I made a mistake there. That line should read:
    c = filter(lambda x: x in b, a)
That's embarrassing haha... I typed that in wrong. The expression x in b just evaluates to a boolean saying whether or not value "x" is an item in list "b", basically "is x in b?". The "if" shouldn't be there because a lambda body has to be a single expression, and "if" starts a conditional statement. Sorry about that!

Anyways, lambda functions are just a way of declaring simple functions on-the-go without assigning a name to them.
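
To make the "function without a name" point concrete, here is a minimal sketch that runs the same filter twice, once with the lambda and once with an ordinary def. The names words, banned and is_banned are made up for illustration, and the list() wrapper is an addition so the result is a plain list on both Python 2 and Python 3.

```python
# Sample data mirroring the lists used earlier in the thread.
words = ['filter', 'lol', 'filter', 'lol', 'lol', 'filter', 'lol']
banned = ['filter']


def is_banned(word):
    # Named equivalent of: lambda x: x in banned
    return word in banned


# list() turns the result into a list regardless of Python version.
with_lambda = list(filter(lambda x: x in banned, words))
with_def = list(filter(is_banned, words))

print(with_lambda)
print(with_def)
```

Both calls keep exactly the items for which the predicate returns True, so the two results are equal; the lambda just saves writing the def.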
Last edited by shadwickman; Jun 29th, 2009 at 5:42 am.
"Two good old boys in a fire-apple red convertible. Stoned. Ripped. Twisted. Good people."
- Hunter S. Thompson

my photography

Nyaato (#5)
Jun 29th, 2009
Ah, I've gotten it to work in the interpreter. However, I'm still rather confused about how to use it in my actual code.

I've tried several times (while reading the filter and lambda articles), and every attempt returns a single 'filter' result. From there, I've got absolutely no idea how to proceed...

shadwickman (#6)
Jun 29th, 2009
What do you mean, a single "filter" result? Filter returns a list... oh wait. Are you using Python 2.x, or are you using Python 3.0? If you're on 3.0, then filter actually returns an iterator instead of a list. Anyways, what did you mean by "single result"?
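
For anyone on Python 3, the iterator behaviour mentioned above can be checked with a small sketch like the following. It assumes a Python 3 interpreter (on Python 2 the first check would report a list), and the variable names are illustrative only.

```python
words = ['filter', 'lol', 'filter']
banned = ['filter']

result = filter(lambda x: x in banned, words)

# On Python 3 this is a lazy filter object rather than a list.
is_list = isinstance(result, list)

# Wrap it in list() to see, index, or reuse the matches.
matches = list(result)
print(is_list, matches)

# The iterator is used up after one pass, so a second pass yields nothing.
leftover = list(result)
print(leftover)
```

Calling list() once and keeping the resulting list avoids the surprise of an exhausted iterator.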
"Two good old boys in a fire-apple red convertible. Stoned. Ripped. Twisted. Good people."
- Hunter S. Thompson

my photography

Nyaato (#7)
Jun 29th, 2009
I'm using 2.6.2 at the moment.

I've attached a picture to go along with this. I'm not exactly good at explaining all of this, because I'm confused as it is...

Anyway, I've added some additional stuff to my code to allow me to debug it better. Here's the code:

    def open_file():
        f = open("c:/temp/test.txt","r")
        g = open("c:/temp/filter.txt","r")
        line = f.readlines()
        line2 = g.readlines()
        # Added this to check what's in memory after
        # reading the files.
        print line
        print '-----------'
        print line2
        # End add
        array_size = 0
        for loop in line:
            # Added one print here for testing:
            print 'Checking: ', loop, ' for ', line2[0]
            # End add
            print cmp(line[array_size],line2[0])
            array_size+=1
        g.close()
        f.close()

    open_file()

And I think I've located the problem. The output window shows these in the list that I'm supposed to look through:

    ['filter \n', 'lol \n', 'filter \n', 'lol \n', 'lol \n', 'filter \n', 'filter']

However, the filter.txt only has:
    ['filter ']

So, I think it's the \n that is affecting the comparisons. Is there a way to remove the \n in the list?
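
The mismatch can be reproduced without any files. The sketch below hard-codes a shortened version of the list printed above (an assumption made for brevity) and compares each entry to the filter entry; repr() is used so that trailing spaces and newlines are visible.

```python
# Shortened copy of the readlines() output shown above.
lines = ['filter \n', 'lol \n', 'filter']
target = 'filter '

for line in lines:
    # repr() shows the hidden whitespace that a plain print would hide.
    print(repr(line), line == target)

raw_matches = [line == target for line in lines]
```

None of the raw comparisons succeed: the first entry carries an extra newline, and the last one lacks the trailing space that the filter entry has.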

Attachment: filter.JPG

shadwickman (#8)
Jun 29th, 2009
Oh! That's a simple problem. Use the str object's strip() method. It removes all leading and trailing whitespace. Like this:
    >>> a = " \t Hello World!\n "
    >>> b = a.strip()
    >>> b
    'Hello World!'
As you can see, all leading and trailing spaces, tabs, newlines, etc. get removed (the space inside "Hello World!" stays). When you compare the items in the lists, strip each one first. Alternatively, you can strip each line with a list comprehension when you store the readlines() lists, like so:
    line_list = [x.strip() for x in open("filename", "r").readlines()]
That way the lists get stored without any leading or trailing whitespace in their items.
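
Putting the pieces together, a complete version of the comparison might look like the sketch below. It is written for Python 3 and is only one way to do it: the helper names load_words and classify are invented here, the demo writes throwaway files with tempfile instead of using c:/temp, and the filter words go into a set so the filter file can hold more than one word.

```python
import os
import tempfile


def load_words(path):
    # strip() drops surrounding whitespace, including the trailing newline.
    with open(path, "r") as handle:
        return [line.strip() for line in handle]


def classify(original_path, filter_path):
    # Pair every word in the original file with True if it is in the filter.
    banned = set(load_words(filter_path))
    return [(word, word in banned) for word in load_words(original_path)]


# Demo files whose contents mirror the lists printed earlier in the thread.
tmp_dir = tempfile.mkdtemp()
original_path = os.path.join(tmp_dir, "test.txt")
filter_path = os.path.join(tmp_dir, "filter.txt")
with open(original_path, "w") as handle:
    handle.write("filter \nlol \nfilter \nlol \nlol \nfilter \nfilter")
with open(filter_path, "w") as handle:
    handle.write("filter ")

results = classify(original_path, filter_path)
for word, offensive in results:
    print(word, "OFFENSIVE" if offensive else "NOT OFFENSIVE")
```

Stripping both files the same way is what makes the last line (which has no newline or trailing space) compare equal to the others.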
"Two good old boys in a fire-apple red convertible. Stoned. Ripped. Twisted. Good people."
- Hunter S. Thompson

my photography

Nyaato (#9)
Jun 29th, 2009
Awesome! Thanks a lot! Finally fixed the problem!

I'm kind of new to python, so thanks a lot for bearing with my questions!

This thread has been marked solved.

©2003 - 2009 DaniWeb® LLC