Hello all, I am completely new to Python, but unfortunately a professor I am working for needs some code done in it. Basically, I need to look through a file for the word "DATA" and then perform some manipulations (simple algebra) on a long list of numbers (3900 to be exact) after that. I think I can figure out the algebraic stuff with some for loops, but I can't seem to figure out how to make a program search for a word and start reading after that. Any help would be greatly appreciated.


This will look through your file for the word "DATA" and print the index (location) of the line that contains it. Now you need to write code to work with your file starting after that index. If, for example, this prints 38, then you will need to iterate through the rest of the file using in_list[39:].

in_file = open("path_to_file", "r")
in_list = in_file.readlines()  # read the whole file into a list of lines
in_file.close()
word_to_find = "DATA"
for index, line in enumerate(in_list):
    # strip the newline so "DATA\n" compares equal to "DATA"
    if line.strip() == word_to_find:
        print(index)  # See footnote

footnote:
readlines() keeps the newline character at the end of each line, so the list contains "DATA\n" rather than "DATA". Calling in_list.index(word_to_find) would therefore raise a ValueError. Using enumerate() gives the index directly and avoids that second lookup.
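
To go from the index to the numbers themselves, something like the following might work. This is only a sketch: the helper name lines_after_marker and the sample lines are made up for illustration, and it assumes the marker sits alone on its own line.

```python
# Hypothetical sample standing in for in_file.readlines()
in_list = ["header\n", "more header\n", "DATA\n", "1.5\n", "2.5\n", "4\n"]


def lines_after_marker(lines, marker="DATA"):
    """Return the lines that follow the first line equal to marker."""
    for index, line in enumerate(lines):
        # strip() removes the trailing newline before comparing
        if line.strip() == marker:
            return lines[index + 1:]
    return []  # marker not found


# Convert the remaining lines to floats, skipping any blank lines
numbers = [float(line) for line in lines_after_marker(in_list) if line.strip()]
print(numbers)  # [1.5, 2.5, 4.0]
```

Returning an empty list when the marker is missing is just one choice; raising an error may be safer if every file is expected to contain "DATA".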

Since we don't have your data file, we can simply create a sample test file and use that to show you how you can extract the numeric data after the marker ...

# create a test data string
raw_data = """\
a
b
c
DATA
11
22
17.5
45
19.5
66.5
"""

# write the test data file
fname = "mydata.txt"
fout = open(fname, "w")
fout.write(raw_data)
fout.close()

# read the test data file line by line and create a list
# of all the numeric items after the DATA marker
data_list = []
data_flag = False
with open(fname) as fin:
    for line in fin:
        line = line.rstrip()
        if data_flag:
            data_list.append(float(line))
        if line == "DATA":
            data_flag = True

# test print the result ...
print(data_list)  # [11.0, 22.0, 17.5, 45.0, 19.5, 66.5]

Now you can access the numbers in your data_list by index, or sequentially to process them.
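
As a rough illustration of both kinds of access, here is a small sketch. The scale and offset values are invented placeholders, since the thread does not say what algebra is actually needed.

```python
# Values from the sample file above
data_list = [11.0, 22.0, 17.5, 45.0, 19.5, 66.5]

# Access by index
first = data_list[0]
print(first)  # 11.0

# Sequential processing: a hypothetical "scale and shift" on every value
scale = 2.0   # assumed for illustration
offset = 1.0  # assumed for illustration
result = [scale * x + offset for x in data_list]
print(result)  # [23.0, 45.0, 36.0, 91.0, 40.0, 134.0]

# A plain for loop works too, e.g. to accumulate a total
total = 0.0
for x in data_list:
    total += x
print(total)  # 181.5
```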

Thanks all, this was a great help. What I ended up doing (actually before I saw any of these posts) was, since all my data files were the same up until the word "DATA", to read through the first 18 lines with a for loop and then process all the important data after that with another for loop. I think I will modify the code to make use of vegaseat's suggestion. Thanks again.
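
For comparison, the fixed-header approach described above could be sketched roughly like this. The in-memory sample and the HEADER_LINES value of 3 are stand-ins (the real files reportedly have 18 header lines), and the sketch assumes the header count includes the "DATA" line itself.

```python
import io
from itertools import islice

HEADER_LINES = 3  # hypothetical; would be 18 for the files described above

# Stand-in for an open data file
sample = io.StringIO("a\nb\nDATA\n11\n22\n17.5\n")

# islice skips the first HEADER_LINES lines and yields the rest
values = [float(line) for line in islice(sample, HEADER_LINES, None) if line.strip()]
print(values)  # [11.0, 22.0, 17.5]
```

This is shorter, but it breaks silently if a file ever has a different number of header lines, which is why searching for the marker is probably the more robust choice.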
