I want to read the data from a .dat file that contains millions of records (mostly strings), each with 52 fields.
I am trying to store the data in 52 lists, one per field, but it is very slow.
Here is the code:
import sys

try:
    file = open("test.dat", "r")
except IOError:
    print >> sys.stderr, "File could not be opened"
    sys.exit(1)
a=[]
b=[]
c=[]
.
.
for record in file:
    a.append(record.split()[0])
    b.append(record.split()[1])
    c.append(record.split()[2])
    .
    .
    z.append(record.split()[51])
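For comparison, the loop above calls record.split() once per field, so every record is split 52 times. Below is a minimal sketch that splits each record only once and keeps the columns in a list of lists instead of 52 separate variables. It assumes the fields are whitespace-separated; the helper name read_columns and the decision to skip records with the wrong number of fields are assumptions for illustration, not part of the original code.

```python
import io


def read_columns(lines, n_fields=52):
    """Split each record once and append field i to columns[i]."""
    columns = [[] for _ in range(n_fields)]
    for record in lines:
        fields = record.split()  # one split per record, not one per field
        if len(fields) != n_fields:
            continue  # assumption: silently skip malformed records
        for column, value in zip(columns, fields):
            column.append(value)
    return columns


# Tiny in-memory stand-in for test.dat, using 3 fields instead of 52.
sample = io.StringIO(u"a1 b1 c1\na2 b2 c2\nbroken line with too many fields\n")
columns = read_columns(sample, n_fields=3)
```

With the real file, the same function could be called as read_columns(open("test.dat"), 52); columns[0] would then play the role of the list a, columns[1] of b, and so on.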
I also want to do some calculations on specific fields.
For example, one field is called starttime and another endtime, both in the format "yymmddhhmmss". I want to calculate the time elapsed between these two fields.
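To make the calculation concrete, here is a rough sketch of what I have in mind, using datetime.strptime from the standard library. It assumes both fields are always exactly 12 digits and that the two-digit year should be read the way strptime's %y reads it; the function name elapsed_seconds and the sample values are made up for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%y%m%d%H%M%S"  # yymmddhhmmss


def elapsed_seconds(starttime, endtime):
    """Return the number of seconds between two yymmddhhmmss strings."""
    start = datetime.strptime(starttime, TIME_FORMAT)
    end = datetime.strptime(endtime, TIME_FORMAT)
    return (end - start).total_seconds()


# Hypothetical values: 2012-01-31 23:59:00 to 2012-02-01 00:01:30.
elapsed = elapsed_seconds("120131235900", "120201000130")
```

Subtracting two datetime objects gives a timedelta, so the month boundary in the sample values is handled without any manual arithmetic.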
I would also like to do some data processing on these records, similar to what one would do with a database.
Is there a better way to approach this problem?
Thank you.