Hey all. I'm new to this forum and Python.

I need a program that searches for an inputted keyword in all the files that end with the ".txt" extension inside a given folder. The output should include the names of the files that contain the keyword and the sentences that contain it (matching the exact word, not case-sensitive).
Lastly, the output should be saved in a file named "Search Result.txt".

For example:

I input:

like

The output should look similar to:

FirstFile.txt:
...the word like has a very flexible range of uses...
...With LIKE you can use the following two wildcard characters in the pattern...
...Official site for People Like Us and Vicki Bennett...
...Celebs and Famous People who looks like other, um, things and stuff...

AnotherFile.txt:
...like Folke Rabe and his kid brother...
...Does your cat look like Adolf Hitler...

THANKS. :cool:

Member Avatar for sravan953

Edward, you have to show some effort on your side, however much Python you know.

Try to make a flowchart and go according to that:

--> Get a list of all .txt files
--> Open first file in list
--> Save the contents of the file as another list using readlines()
--> Go through each line and check for 'like', if the line contains the word, output that line to another text file
--> Repeat the whole process
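The steps above could be sketched roughly like this. It is only a minimal illustration, written for Python 3; the function name search_folder and its parameters are made up for the example, and it uses a plain case-insensitive substring test, so a search for "like" would also match "likely".

```python
import glob
import os


def search_folder(folder, keyword, out_name="Search Result.txt"):
    """Copy every line containing keyword (any case) to out_name; return the hit count."""
    hits = 0
    out_path = os.path.join(folder, out_name)
    # Get a list of all .txt files, skipping a result file left by an earlier run.
    txt_files = [p for p in sorted(glob.glob(os.path.join(folder, "*.txt")))
                 if os.path.abspath(p) != os.path.abspath(out_path)]
    with open(out_path, "w") as out:
        for path in txt_files:
            # Save the contents of the file as a list of lines.
            with open(path) as f:
                lines = f.readlines()
            # Go through each line and output the ones that contain the keyword.
            for line in lines:
                if keyword.lower() in line.lower():
                    out.write(line.rstrip("\n") + "\n")
                    hits += 1
    return hits
```

Calling search_folder(some_folder, "like") would then write the matching lines to "Search Result.txt" inside that folder.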

Have a look at the "grep" tool that is available for your OS.
Case-insensitive searching is done with the -i option.

There is an interesting Python alternative to grep, called grin. Have a look here: http://pypi.python.org/pypi/grin

The Python module re is your friend:

# finding a search word in a text file using regex module re

import re

search = 'like'
# also matches 'like!', but not 'likely'
# re.escape() keeps any regex metacharacters in the search word literal
pattern = r"\b%s\b" % re.escape(search)
#print(pattern)  # for testing only --> \blike\b
rc = re.compile(pattern, re.IGNORECASE)

test_text = """\
the word like has a very flexible range of uses
With LIKE you can use two likely wildcard characters in the pattern
Official site for People Like Us and Vicki Bennett
Celebs and Famous People who looks like other, um, things and stuff
There is most likely a sentence without the search word
This is what the test file looks like!
"""

fname = "SearchFile.txt"
# save the test text to a file
with open(fname, "w") as fout:
    fout.write(test_text)

# read the file back line by line
with open(fname) as fin:
    for line in fin:
        match = rc.search(line)
        if match:
            print(match.group(0))  # for test
            print(line)

"""my output -->
like
the word like has a very flexible range of uses

LIKE
With LIKE you can use two likely wildcard characters in the pattern

Like
Official site for People Like Us and Vicki Bennett

like
Celebs and Famous People who looks like other, um, things and stuff

like
This is what the test file looks like!
"""

DAMN. There's another rule. If the input is "like" and the line is:

I'm pretty sure. Yes, that this is like the time when I went out.

instead of printing the entire line with "..." on the sides, this must be printed:

... that this is like the time when I went out ...

No punctuation marks apparently. :(
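One possible way to get that fragment is to split the line at every punctuation mark and keep the piece that contains the keyword as a whole word. The sketch below is written for Python 3; the name keyword_clause is invented for the example, and it assumes that "punctuation" means any character that is neither a letter, digit, underscore, nor whitespace, so an apostrophe also ends a fragment.

```python
import re


def keyword_clause(line, keyword):
    """Return the punctuation-free fragment of line that contains keyword, or None."""
    # Whole-word, case-insensitive match, as in the re example earlier in the thread.
    word = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    # Split wherever one or more characters are neither word characters nor whitespace.
    for fragment in re.split(r"[^\w\s]+", line):
        if word.search(fragment):
            return "... %s ..." % fragment.strip()
    return None


print(keyword_clause(
    "I'm pretty sure. Yes, that this is like the time when I went out.", "like"))
# prints: ... that this is like the time when I went out ...
```

Only the first matching fragment of a line is returned; a line with the keyword in two separate fragments would need a loop that collects all of them.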

This is what I have so far. 2 problems:

1.) I don't really know how to search folders and files in folders.

2.) When I run the program, Python displays:

Traceback (most recent call last):
File "<pyshell#88>", line 1, in <module>
main()
File "<pyshell#87>", line 16, in main
for file in open(a):
TypeError: coercing to Unicode: need string or buffer, list found

def main():
	d = ""
	print "This program allows you to input a keyword..."
	print
	print "...to be searched through files in the folder Python26 of drive C."
	x = "1"
	while x == "1":
		print
		a = os.listdir("C:\Python26")
		print
		b = raw_input("What would you like to search for?: ")
		print
		print "Keyword: ", b,
		print
		print
		for file in open(a):
			for line in file:
				if b in line:
					print file, ":"
					print
					print "\t", "...", line, "...",
					print
					d = d + a + ":" + "\n" + "\t" + "..." + line + "..." + "\n"
		c = open(r"C:\Python26\SearchResult.txt", "w")
		c.write(d)
		c.close()
		print
		x = raw_input("Would you like to use this program again? (Typing anything other than 1 ends the program.): ")
	print
	print "Thank you for using this program."

Do us all a favor and don't use tabs for indentations!

for file in open(a):
    for line in file:
        if b in line:

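One problem is here. First, do not use "file" as a variable name, because it shadows Python's built-in file type. Second, open() needs a single file name, not the whole list, so the loop should be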

for file_name in a:
    for rec in open(file_name):
        if b in rec:

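Third, "a" can contain directories as well as files, so test each entry with os.path.isfile() before opening it. os.listdir() returns bare names, so join each one to the folder path first (note that os.path.join("C:", "Python26") gives the drive-relative path C:Python26, not C:\Python26). Try this:

print a
for fname in a:
    full_name = os.path.join(r"C:\Python26", fname)
    if os.path.isfile(full_name):
        print "checking", full_name
        for rec in open(full_name):
            if b in rec:
                print "\t...", rec.strip(), "..."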
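The same check can be wrapped in a small helper that also keeps only the ".txt" files the original question asks about. This is a sketch for Python 3, and the name list_text_files is hypothetical:

```python
import os


def list_text_files(folder):
    """Return full paths of the regular files in folder whose names end in .txt."""
    paths = []
    for name in sorted(os.listdir(folder)):
        # os.listdir() gives bare names, so build the full path before testing it.
        full_name = os.path.join(folder, name)
        # Skip sub-directories and anything that is not a .txt file.
        if os.path.isfile(full_name) and name.lower().endswith(".txt"):
            paths.append(full_name)
    return paths
```

On Windows it could be called as list_text_files(r"C:\Python26"), using a raw string so the backslash is never read as an escape.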

THANKS SO MUCH EVERYONE. Just one last detail. When printing the line with the given keyword, only the part between the surrounding punctuation marks is supposed to be shown :(

For example, if I input "TAIL", instead of printing:

... cat's tail is white. Then I said ...

I oughta print:

... s tail is white ...

please help :)

This is what I have so far:

from os import listdir
from os.path import join, isfile

def main():
	print "This program allows you to input a keyword..."
	print
	print "...to be searched through files in the folder Python26 of drive C."
	x = "1"
	while x == "1":
		path = r"C:\Python26"
		entries = [join(path, entry) for entry in listdir(path)]
		files = filter(isfile, entries)
		c = open(join("C:\Python26", "SearchResult.txt"), "w")
		print
		print
		b = raw_input("What would you like to search for?: ")
		print
		print "Keyword: ", b,
		print
		print
		for file in files:
			f = open(file, 'r')
			for line in f.readlines():
				if b in line:
					print file, ":"
					print
					print "\t", "...", line, "...",
					print
					c.write(file + ":" + "\n" + "\t" + "..." + line + "..." + "\n")
			f.close()
			print
		c.close()
		print
		x = raw_input("Type 1 then press enter to run the program again.")
	print "Thank you for using this program."

main()
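The program above writes the file name before every matching line, whereas the example output in the first post lists each file name once, followed by its matches. One way to get that layout is to collect the matches per file first and format them afterwards. The sketch below is for Python 3 and uses four-space indentation; format_results and save_results are made-up names, and the input is assumed to be a list of (file_name, fragments) pairs built by the search loop.

```python
def format_results(results):
    """Build the report text from (file_name, fragments) pairs."""
    blocks = []
    for file_name, fragments in results:
        if not fragments:
            continue  # files without a match are left out of the report
        lines = [file_name + ":"]
        lines.extend("...%s..." % frag for frag in fragments)
        blocks.append("\n".join(lines))
    # One blank line between files, a single newline at the end.
    return "\n\n".join(blocks) + "\n" if blocks else ""


def save_results(results, out_path="Search Result.txt"):
    """Write the formatted report to out_path."""
    with open(out_path, "w") as out:
        out.write(format_results(results))
```

Keeping the formatting separate from the file writing makes it easy to print the same text to the screen as well.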

Using tabs does make for ugly looking code. Please use four spaces for indentations like every sane Python person. My programming editor balks at tabs in Python code, so I can't help.
