Jun 29th, 2009

Comparing two text files

I'm a little stuck on a piece of code that I'm working on. I'm supposed to check the contents of one text file (the original) and compare it with another text file (the filter) to see whether any words match.

What I have now only half works: the code detects a match only on the last word in the original file. What puzzles me is that the same word appears several times before the last one, but the code does not recognise those occurrences as matching the word in the filter.

Here's what I have now...

    def open_file():
        f = open("c:/temp/test.txt","r")
        g = open("c:/temp/filter.txt","r")
        line = f.readlines()
        line2 = g.readlines()
        array_size = 0
        for loop in line:
            if line[array_size] == line2[0]:
                print 'OFFENSIVE'
                print line[array_size]

            if line[array_size] != line2[0]:
                print 'NOT OFFENSIVE'
                print line[array_size]
            array_size+=1
        g.close()
        f.close()

    open_file()

If it helps, here's the original text:
    filter
    lol
    filter
    lol
    lol
    filter
    lol

The word that is supposed to be filtered out is "filter".

Any help would be greatly appreciated.
Nyaato

Jun 29th, 2009

Re: Comparing two text files

You can try the built-in filter function. Here's what I tried in the interpreter:
    >>> a = ['filter', 'lol', 'filter', 'lol', 'lol', 'filter', 'lol']
    >>> b = ['filter']
    >>> c = filter(lambda x: if x in b, a)
    >>> c
    ['filter', 'filter', 'filter']

As you can see, filter takes each item in the list passed to it (in this case, "a") and returns a list of the items for which the function passed to it returned True.
In this case, the lambda function returns True if the current item (x) is in list "b". I hope that simplifies your code a lot.

Dive Into Python has sections covering both filter and lambda.
Last edited by shadwickman; Jun 29th, 2009 at 4:39 am.
shadwickman

Jun 29th, 2009

Re: Comparing two text files

Thanks for the help!

However, I'm really confused by filter and lambda. This is the first time I've ever touched Python, so I'm quite new to everything Python uses.

I tried entering c = filter(lambda x: if x in b, a) into the interpreter, but it reports invalid syntax at "if". Why's that?
Nyaato

Jun 29th, 2009

Re: Comparing two text files

Oh damn! I made a mistake there. That line should read:
    c = filter(lambda x: x in b, a)
That's embarrassing haha... I typed that in wrong. The expression x in b just returns a boolean saying whether or not value "x" is an item in list "b", basically "is x in b?". The "if" shouldn't be there, because "if" starts a conditional statement and a lambda body can only be a single expression. Sorry about that!

Anyways, lambda functions are just a way of declaring simple functions on-the-go without assigning a name to them.
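
To make that concrete, here's a rough sketch showing that the lambda does the same job as a small named function. The names is_in_b and words are made up for this example, and list() is wrapped around filter only so the result is a plain list on Python 3 as well as 2.x.

```python
b = ['filter']
words = ['filter', 'lol', 'filter']

# A named function that does the same job as `lambda x: x in b`.
def is_in_b(x):
    return x in b

# list() gives a plain list on both Python 2.x and Python 3.
with_lambda = list(filter(lambda x: x in b, words))
with_def = list(filter(is_in_b, words))

# Both calls select exactly the same items.
print(with_lambda == with_def)
```

The lambda is handy when the test is a one-liner used in a single place; a def is usually easier to read once the test grows.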
Last edited by shadwickman; Jun 29th, 2009 at 5:42 am.
shadwickman

Jun 29th, 2009

Re: Comparing two text files

Ah, I've gotten it to work in the interpreter. However, I'm still rather confused about how to use it in my actual script.

I've tried several times (while reading the filter and lambda articles), and every attempt returns a single 'filter' result. From there, I've got absolutely no idea how to proceed...
Nyaato

Jun 29th, 2009

Re: Comparing two text files

What do you mean, a single "filter" result? Filter returns a list... oh wait. Are you using Python 2.x or Python 3.0? If you're on 3.0, filter actually returns an iterator rather than a list. Anyways, what did you mean by "single result"?
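
For reference, a small sketch of that difference, using the same lists as before. The variable names matches and matches_lc are just for illustration.

```python
a = ['filter', 'lol', 'filter', 'lol', 'lol', 'filter', 'lol']
b = ['filter']

# On Python 3, filter() returns a lazy iterator; list() turns it into a list.
# On Python 2.x the extra list() call is harmless, since filter() on a list
# already returns a list there.
matches = list(filter(lambda x: x in b, a))

# An equivalent list comprehension, which returns a list on both versions.
matches_lc = [x for x in a if x in b]

print(matches)
print(matches_lc)
```

Either form should give the same three matches; the list comprehension avoids the version difference altogether.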
shadwickman

Jun 29th, 2009

Re: Comparing two text files

I'm using 2.6.2 at the moment.

I've attached a picture to go along. I'm not exactly good at explaining all of this, because I'm confused as it is...

Anyway, I've added some additional stuff to my code to allow me to debug it better. Here's the code:

    def open_file():
        f = open("c:/temp/test.txt","r")
        g = open("c:/temp/filter.txt","r")
        line = f.readlines()
        line2 = g.readlines()
        # Added this to check what's in memory after
        # reading the files.
        print line
        print '-----------'
        print line2
        # End add
        array_size = 0
        for loop in line:
            # Added one print here for testing:
            print 'Checking: ', loop, ' for ', line2[0]
            # End add
            print cmp(line[array_size],line2[0])
            array_size+=1
        g.close()
        f.close()

    open_file()

And I think I've located the problem. The output window shows these in the list that I'm supposed to look through:

    ['filter \n', 'lol \n', 'filter \n', 'lol \n', 'lol \n', 'filter \n', 'filter']

However, the filter.txt only has:
    ['filter ']

So, I think it's the \n that is affecting the comparisons. Is there a way to remove the \n in the list?
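
To illustrate the mismatch, here's a minimal sketch that compares the raw strings and prints them with %r so the invisible characters show up. The values are copied from the two printed lists above rather than read from the files, and word and exact_matches are names made up for the example.

```python
# Values copied from the printed lists above (no file access needed).
lines = ['filter \n', 'lol \n', 'filter \n', 'lol \n', 'lol \n', 'filter \n', 'filter']
word = 'filter '

for line in lines:
    # %r shows the trailing space and the newline that a plain print hides.
    print('%r == %r -> %s' % (line, word, line == word))

# Collect the lines that are exactly equal to the filter word.
exact_matches = [line for line in lines if line == word]
print(len(exact_matches))
```

With these particular values nothing compares equal: most lines end in a newline that the filter word lacks, and the final 'filter' has no trailing space while the filter word does.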
Nyaato

Jun 29th, 2009

Re: Comparing two text files

Oh! That's a simple problem. Use the str object's strip() method. It removes all leading and trailing whitespace. Like this:
    >>> a = " \t Hello World!\n "
    >>> b = a.strip()
    >>> b
    'Hello World!'
As you can see, all leading and trailing spaces, tabs, newlines, etc. get removed. When you compare the items in the list, strip each one first. Alternatively, you can strip each line with a list comprehension when you store the readlines() lists, like so:
    line_list = [x.strip() for x in open("filename", "r").readlines()]
That way the lists get stored without any leading or trailing whitespace in their items.
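
Putting the pieces together, here's one possible sketch of the whole check. The function name find_offensive and its parameters are made up for illustration; it takes the lines as lists so it can be tried without the files, and the sample data is copied from the lists printed earlier in the thread.

```python
def find_offensive(lines, filter_lines):
    """Return (word, is_offensive) pairs for each non-empty line."""
    # Strip spaces and newlines from the filter words; a set makes lookups quick.
    banned = set(w.strip() for w in filter_lines if w.strip())
    results = []
    for line in lines:
        word = line.strip()
        if word:
            results.append((word, word in banned))
    return results

# Sample data copied from the lists printed earlier in the thread.
lines = ['filter \n', 'lol \n', 'filter \n', 'lol \n', 'lol \n', 'filter \n', 'filter']
filter_lines = ['filter ']

results = find_offensive(lines, filter_lines)
for word, offensive in results:
    print('%s: %s' % (word, 'OFFENSIVE' if offensive else 'NOT OFFENSIVE'))

# With the real files, the lists could be loaded like this (paths from the
# original post):
# with open("c:/temp/test.txt") as f:
#     lines = f.readlines()
# with open("c:/temp/filter.txt") as g:
#     filter_lines = g.readlines()
```

Stripping both sides means a stray space or newline in either file no longer matters, and checking against every word in the filter file (not just the first line) lets the filter grow later.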
shadwickman

Jun 29th, 2009

Re: Comparing two text files

Awesome! Thanks a lot! Finally fixed the problem!

I'm kind of new to python, so thanks a lot for bearing with my questions!
Nyaato