Hi everyone,

I have a fairly simple problem, but not having used Python in a while, I just can't seem to get things working.
Basically, I have a text file with a number of comma-separated fields (attached).
What I want to do is split each line and extract the "File" item from it, then write those items to a new file. (I also want to skip the first line.)

So my desired output file would just have:

If anyone out there could help me with this, I'd be very grateful!

This is quite simple with the code snippet I posted today; you only need to add removal of the quotes.

# text based data input with data accessible
# with named fields or indexing
from __future__ import print_function ## Python 3 style printing
from collections import namedtuple
import string

filein = open("cb2.txt")
quotes = '\'\"'
datadict = {}

headerline = filein.readline().lower() ## lowercase field names Python style
## first non-letter and non-number is taken to be the separator
separator = headerline.strip(string.ascii_lowercase + string.digits + quotes)[0]
print("Separator is '%s'" % separator)

headerline = [field.strip(string.whitespace + quotes) for field in headerline.split(separator)]
Dataline = namedtuple('Dataline',headerline)
print ('Fields are:',Dataline._fields,'\n')

for data in filein:
    data = [f.strip(string.whitespace + quotes) for f in data.split(separator)]
    d = Dataline(*data)
    datadict[d.id] = d ## do hash of id values for fast lookup (key field)

for id in  datadict.keys():

input('Ready') ## let the output be seen when run directly
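
For comparison, here is a minimal Python 3 sketch of the same named-field idea. The helper name parse_records and the sample lines are made up for illustration; the field names "ID" and "File" are assumptions about what the header of cb2.txt looks like, and the separator is passed in rather than detected.

```python
from collections import namedtuple

def parse_records(lines, separator=","):
    """Parse delimited lines into namedtuples keyed by lower-cased header names."""
    rows = iter(lines)
    strip_chars = " \t\r\n'\""  # whitespace plus both quote characters
    header = [f.strip(strip_chars).lower() for f in next(rows).split(separator)]
    Record = namedtuple("Record", header)
    return [Record(*[f.strip(strip_chars) for f in line.split(separator)])
            for line in rows if line.strip()]

# Made-up sample; the real layout of cb2.txt is an assumption here.
sample = ['"ID","File"\n', '"1","2008308_017_079.tif"\n']
for record in parse_records(sample):
    print(record.file)
```

With a real file, the open file object can be passed directly, since it iterates line by line.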

Here is one solution with a regular expression. It is not hard to write a regex for this; it only takes a couple of minutes.

import re

text = '''\

test_match = re.findall(r'\d{7}\_\d{3}\_\d{3}\.\btif\b',text)
print test_match #Give us a list

#Looping over item in list
for item in test_match:
    print item

['2008308_017_079.tif', '2008308_017_080.tif', '2008308_017_081.tif', '2008308_017_082.tif', '2008308_017_083.tif']
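
The same regex idea in Python 3 syntax, as a rough sketch. The sample text is made up for illustration (only the .tif names are taken from the output above), and the function name extract_tif_names is hypothetical.

```python
import re

# Seven digits, underscore, three digits, underscore, three digits, ".tif"
TIF_NAME = re.compile(r'\d{7}_\d{3}_\d{3}\.tif\b')

def extract_tif_names(text):
    """Return every file name in text that matches the pattern, in order."""
    return TIF_NAME.findall(text)

# Made-up sample lines; the real cb2.txt layout is an assumption here.
sample = '''\
"ID","File"
"1","2008308_017_079.tif"
"2","2008308_017_080.tif"
'''

for name in extract_tif_names(sample):
    print(name)
```

The underscores need no backslash, and compiling the pattern once is convenient if it is applied to many lines.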

For a simple, inflexible solution you can just do:

filein = open("cb2.txt")
filein.readline() # drop first line
for line in filein:
    print line.split(',')[1]
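
If the fields turn out to be quoted or to contain commas themselves, a plain split(',') gets fragile. Below is a minimal Python 3 sketch using the standard csv module, assuming the "File" value is the second column as in the snippet above. The helper name read_file_column and the sample data are hypothetical.

```python
import csv
import io

def read_file_column(lines, column=1):
    """Return the given column from each data row, skipping the header row."""
    reader = csv.reader(lines)
    next(reader, None)  # drop the header line
    return [row[column] for row in reader if len(row) > column]

# Made-up sample standing in for cb2.txt; the real layout is assumed.
sample = io.StringIO('"ID","File"\n"1","2008308_017_079.tif"\n')
print(read_file_column(sample))
```

With a real file, something like open("cb2.txt", newline="") can be passed in place of the StringIO object; csv.reader strips the surrounding double quotes itself.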

Thanks tonyjv and Snippsat,

Your suggestions helped me get back on track.

filein = open("cb2.txt")

for line in filein:
    namedata = []
    namedata = line.split(",")[1]
    print namedata + "\n"
    fileout = open("copyimg.txt" , "a")
    fileout.write(namedata + "\n")

It is better, though, to move the open() call for the output file out of the loop, to just before the for statement with one less indent. Then mode 'w' is fine instead of 'a'. Of course, the file must be closed after the loop, not inside it (one indent less).
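
Put together, that advice might look like the following Python 3 sketch. The function name copy_file_column is made up, and the with blocks (which close both files automatically when the block ends) are a substitution for an explicit close() after the loop. Skipping the header line is included because the original question asked for it.

```python
def copy_file_column(src_path, dst_path):
    """Write the second comma-separated field of each data line to dst_path."""
    # Both files are opened once, before the loop, and closed when the block ends.
    with open(src_path) as filein, open(dst_path, "w") as fileout:
        next(filein, None)  # skip the header line
        for line in filein:
            fields = line.rstrip("\n").split(",")
            if len(fields) > 1:
                fileout.write(fields[1] + "\n")
```

It would be called as copy_file_column("cb2.txt", "copyimg.txt"). Because the output file is opened only once, mode "w" does not overwrite lines written earlier in the loop.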

Thanks again tonyjv!
I'll make those changes.

Also, print adds the newline automatically. If you prefer, you can use it to write to the file as well, like this:

filein = open("cb2.txt")
fileout = open("copyimg.txt" , "w")

for line in filein:
    namedata = []
    namedata = line.split(",")[1]
    print namedata
    print >>fileout,namedata
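
Note that print >>fileout is Python 2 syntax. In Python 3 (or in Python 2 with from __future__ import print_function) the equivalent is the file= keyword argument. A small sketch, with io.StringIO standing in for the real output file and write_names as a made-up helper name:

```python
import io

def write_names(names, fileout):
    """Print each name to fileout; print() adds the newline itself."""
    for name in names:
        print(name, file=fileout)

buffer = io.StringIO()  # stand-in for open("copyimg.txt", "w")
write_names(["2008308_017_079.tif", "2008308_017_080.tif"], buffer)
print(buffer.getvalue(), end="")
```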


Oh, terrific!
I didn't know that was an option with "print". I know I'll use that method again in the future.
Many thanks for the great help and advice :)
