I have no problem reading one big array of numbers from a file, but I can't figure out how to read in multiple arrays that are delimited by certain characters. For example, my dataset has the following format:
#BEGIN 1
1 .1 .2
2 .8 .3
3 .9 .5
#END

#BEGIN 2
1 .1 .2
2 .8 .3
3 .9 .5
#END

I want to be able to read these in as two separate arrays so that I can manipulate them independently. Thanks!

All 4 Replies

One way to do this ...

def extract(text, sub1, sub2):
    """
    extract a substring from text between first
    occurrences of substrings sub1 and sub2
    """
    return text.split(sub1, 1)[-1].split(sub2, 1)[0]


# read the data from the file as one string
data = """\
#BEGIN 1
1 .1 .2
2 .8 .3
3 .9 .5
#END

#BEGIN 2
1 .1 .2
2 .7 .4
3 .8 .6
#END"""

ar1 = extract(data, "#BEGIN 1", "#END").split()
ar2 = extract(data, "#BEGIN 2", "#END").split()

print(ar1)  # ['1', '.1', '.2', '2', '.8', '.3', '3', '.9', '.5']
print(ar2)  # ['1', '.1', '.2', '2', '.7', '.4', '3', '.8', '.6']
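
The lists printed above are flat and hold strings. If the numbers are needed row by row, something like the following sketch could regroup them. The helper name `to_rows` and the fixed column count of 3 are assumptions made for illustration, based on the sample data in the question.

```python
def to_rows(flat, ncols):
    """Convert a flat list of numeric strings into rows of floats."""
    if len(flat) % ncols:
        raise ValueError("length of flat list is not a multiple of ncols")
    values = [float(s) for s in flat]
    # slice the flat list into consecutive chunks of ncols values
    return [values[i:i + ncols] for i in range(0, len(values), ncols)]


flat = ['1', '.1', '.2', '2', '.8', '.3', '3', '.9', '.5']
rows = to_rows(flat, 3)
print(rows)  # [[1.0, 0.1, 0.2], [2.0, 0.8, 0.3], [3.0, 0.9, 0.5]]
```

This only works when every row has the same number of columns; otherwise the text has to be split line by line instead.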

Thanks,

Maybe this is the wrong community to ask, since the function I'm interested in is included with SciPy.

from scipy import *
data=io.array_import.read_array('datafile')


With the above routine, the column/row structure is retained when the text file is converted to an array, which is very helpful when performing operations on the data. I would like to maintain this column structure as the file is split up.

Additionally, suppose that the number of "#BEGIN/#END" blocks is arbitrary and not known beforehand.

With Python you would use a list of lists, one inner list for each group of records. Hopefully there is a similar method in scipy. (If you were very, very lucky and each group had the same number of records, you could use a counter to split them.)

data_list=[
"#BEGIN 1",
"1 .1 .2",
"2 .8 .3",
"3 .9 .5",
"#END",
"",
"#BEGIN 2",
"1 .1 .3",
"2 .8 .4",
"3 .9 .6",
"#END",
"",
"#BEGIN 3",
"1 .1 .4",
"2 .8 .5",
"3 .9 .7",
"#END"]

group_list = []
junk_list = []   # records of the group currently being read
for rec in data_list:
    rec = rec.strip()
    if rec.startswith("#BEGIN") and junk_list:
        # a new group starts, so store the previous one
        group_list.append(junk_list)
        junk_list = []
    elif rec and not rec.startswith("#"):
        # data line: blank lines and #BEGIN/#END markers are skipped
        junk_list.append(rec)
if junk_list:   # final group
    group_list.append(junk_list)

for group in group_list:
    print(group)
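
To keep the row/column structure for an arbitrary number of blocks, the same idea can be taken one step further so that each record becomes a list of floats. This is only a sketch: the function name `read_blocks` is made up for illustration, and it assumes every block is opened by a "#BEGIN label" line and closed by "#END", as in the sample data.

```python
def read_blocks(lines):
    """Return a list of (label, rows) pairs, one per #BEGIN ... #END block."""
    blocks = []
    current = None   # rows of the block being read, None outside a block
    for line in lines:
        line = line.strip()
        if line.startswith("#BEGIN"):
            label = line[len("#BEGIN"):].strip()
            current = []
            blocks.append((label, current))
        elif line.startswith("#END"):
            current = None
        elif line and not line.startswith("#") and current is not None:
            # one record: split on whitespace and convert each field
            current.append([float(x) for x in line.split()])
    return blocks


text = """\
#BEGIN 1
1 .1 .2
2 .8 .3
3 .9 .5
#END

#BEGIN 2
1 .1 .2
2 .7 .4
3 .8 .6
#END"""

blocks = read_blocks(text.splitlines())
for label, rows in blocks:
    print(label)
    print(rows)
```

An open file object can be passed instead of `text.splitlines()`, since iterating over a file also yields lines. Each `rows` list is a plain nested list, which `numpy.array(rows)` would accept if a real array is wanted.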

In other words you are talking about a matrix.
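
As a rough illustration of the matrix view, one group of records can be turned into numeric rows and then transposed to get at its columns. This is a plain-Python sketch with names chosen for the example; a numeric library would offer the same thing with less code.

```python
group = ["1 .1 .2", "2 .8 .3", "3 .9 .5"]

# rows: one list of floats per record
matrix = [[float(x) for x in rec.split()] for rec in group]

# columns: zip(*matrix) pairs up the i-th item of every row
columns = [list(col) for col in zip(*matrix)]

print(matrix[1][2])  # 0.3
print(columns[1])    # [0.1, 0.8, 0.9]
```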
