0

Hello all, I am completely new to Python, but unfortunately a professor I am working for needs some code done in it. Basically, I need to look through a file for the word "DATA" and then perform some manipulations (simple algebra) on the long list of numbers (3900 to be exact) that follows it. I think I can figure out the algebraic stuff with some for loops, but I can't seem to figure out how to make a program search for a word and start reading after that. Any help would be greatly appreciated.

Last Post by sniperx99
0

This will look through your file for the word "DATA" and print the index (location) of the line that contains it. Now you need to write code to work with your file starting after that index. If, for example, this prints 38, then you will need to iterate through the rest of the file using in_list[39:].

in_file = open("path_to_file", "r")
in_list = in_file.readlines()  # read the whole file into a list of lines
in_file.close()
word_to_find = "DATA"
for index, line in enumerate(in_list):
    # strip() removes the trailing newline before comparing
    if line.strip() == word_to_find:
        print(index)  # see footnote
        break  # stop at the first match

footnote:
readlines() keeps the newline at the end of each line, so the list contains "DATA\n" rather than "DATA". That is why the loop compares the stripped line and takes the index from enumerate() instead of calling in_list.index(word_to_find), which would raise a ValueError.
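
Once the index is known, reading the numbers that follow the marker can be sketched roughly as below. This is only an illustration: the sample lines are made up, marker_index and numbers are names chosen for the example, and it assumes every non-blank line after the marker holds a single number.

```python
# Hypothetical sample standing in for the list returned by readlines()
in_list = ["header\n", "DATA\n", "1.5\n", "2\n", "\n", "4.25\n"]
word_to_find = "DATA"

# Position of the first line that equals the marker once whitespace is stripped
marker_index = next(
    (i for i, line in enumerate(in_list) if line.strip() == word_to_find),
    None,
)

if marker_index is None:
    numbers = []  # marker not found, nothing to read
else:
    # Everything after the marker, skipping blank lines
    numbers = [float(line) for line in in_list[marker_index + 1:] if line.strip()]

print(numbers)
```

float() ignores surrounding whitespace, so the trailing newlines do not need to be stripped before conversion.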

Edited by txwooley: need to add footnote

1

Since we don't have your data file, we can simply create a sample test file and use that to show you how you can extract the numeric data after the marker ...

# create a test data string
raw_data = """\
a
b
c
DATA
11
22
17.5
45
19.5
66.5
"""

# write the test data file
fname = "mydata.txt"
fout = open(fname, "w")
fout.write(raw_data)
fout.close()

# read the test data file line by line and create a list
# of all the numeric items after the DATA marker
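data_list = []
data_flag = False
# the with block closes the file once the loop is done
with open(fname) as fin:
    for line in fin:
        line = line.strip()
        # after the marker, convert every non-blank line to a number
        if data_flag and line:
            data_list.append(float(line))
        elif line == "DATA":
            data_flag = True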

# test print the result ...
print(data_list)  # [11.0, 22.0, 17.5, 45.0, 19.5, 66.5]

Now you can access the numbers in your data_list by index, or sequentially to process them.
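
As a rough sketch of the "simple algebra" step, the list can be processed either all at once or by index. The scale and offset values here are placeholders, not anything from the original problem.

```python
# Same values as the test file above; in practice this comes from the reading loop
data_list = [11.0, 22.0, 17.5, 45.0, 19.5, 66.5]

scale = 2.0    # placeholder factor
offset = 1.0   # placeholder constant

# Sequential processing: apply y = scale * x + offset to every number
converted = [scale * x + offset for x in data_list]

# Access by index: difference between each number and the one before it
differences = [data_list[i] - data_list[i - 1] for i in range(1, len(data_list))]

print(converted)    # [23.0, 45.0, 36.0, 91.0, 40.0, 134.0]
print(differences)  # [11.0, -4.5, 27.5, -25.5, 47.0]
```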

Edited by vegaseat: num

0

Thanks all, this was a great help. Since all my data files are the same up until the word "Data", what I ended up doing (actually before I saw any of these posts) was to read through the first 18 lines with a for loop and then process all the important data after that with another for loop. I think I will modify the code to make use of vegaseat's suggestion. Thanks again.
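
For files whose header always has the same length, the skip-then-read approach described here might look like the sketch below. The 18-line header comes from the post; the file name and its contents are invented for the demonstration, the marker is assumed to sit on line 18, and itertools.islice is used in place of a separate counting loop.

```python
from itertools import islice

HEADER_LINES = 18  # number of lines to skip, per the post above

# Build a throwaway file: 17 filler lines, the marker line, then three numbers
fname = "fixed_header_demo.txt"  # hypothetical file name
with open(fname, "w") as fout:
    fout.write("header\n" * 17 + "Data\n" + "1.0\n2.5\n4.0\n")

# islice(fin, HEADER_LINES, None) yields lines starting at the 19th one
with open(fname) as fin:
    values = [float(line) for line in islice(fin, HEADER_LINES, None) if line.strip()]

print(values)  # [1.0, 2.5, 4.0]
```

This only works while every file has exactly the same header length, which is why searching for the marker is the more robust choice.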
