Hi all. My new job involves writing scripts for people in other departments. I'm pretty much on my own with this, and I'm still a beginner with Python (I think my brain is still in PHP mode, and I'm still struggling with the object-oriented approach).
Here is what I have:
Input file, call it bob.txt for argument's sake:
<tag>
....stuff1
....more stuff1
</tag>
<tag>
....stuff2
....more stuff2
</tag>
<tag>
....stuff3
....more stuff3
</tag>
#!/usr/local/bin/python2.6
import sys

# Four arguments are required, so sys.argv needs at least five entries.
if (len(sys.argv) < 5):
    print "Usage: splitfilebytag option1 option2 option3 option4"
    print "Run this application from the input file directory"
    print "Option 1: input filename"
    print "Option 2: output filename"
    print "Option 3: output file extension"
    print "Option 4: tag that indicates split. eg: \"</tag>\". Use inverted commas"
    print "Option 5(optional): Start file number. eg: 172"
    print "Example usage: splitfilebytag test.xml out txt \"</tag>\" 12"
    exit()

readfile_ = sys.argv[1]
outputfilename_ = sys.argv[2]
extension_ = sys.argv[3]
tag_ = sys.argv[4]

# The start number is optional; fall back to 0 if it is missing or not a number.
try:
    num_ = int(sys.argv[5])
except (IndexError, ValueError):
    num_ = 0

def split_(readfile_, outputfilename_, extension_, tag_, num_):
    thelist_ = []
    with open(readfile_, 'r') as thefile_:
        for line_ in thefile_:
            if tag_ in line_:
                # The split tag closes the current chunk: write it out and start a new one.
                thelist_.append(line_)
                outfilename_ = '%s%03d.%s' % (outputfilename_, num_, extension_)
                num_ += 1
                outfile_ = open(outfilename_, 'w')
                for item_ in thelist_:
                    outfile_.write(item_)
                thelist_ = []
                outfile_.close()
            else:
                thelist_.append(line_)

if __name__ == "__main__":
    split_(readfile_, outputfilename_, extension_, tag_, num_)
So in this case, if my "users" run ./splitfilebytag.py bob.txt out txt "</tag>" 12
They will end up with 3 separate files (out012.txt, out013.txt and out014.txt) that look like this:
<tag>
....stuff1
....more stuff1
</tag>

<tag>
....stuff2
....more stuff2
</tag>

<tag>
....stuff3
....more stuff3
</tag>
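To make the intended behaviour easier to discuss, here is a rough sketch of just the splitting step, kept separate from the file handling. The name split_chunks and the sample list are made up for illustration and are not part of my script. One deliberate difference: this version also returns any lines left over after the last tag, whereas my script above never writes those out.

```python
def split_chunks(lines, tag):
    """Yield lists of lines, each ending with a line that contains `tag`."""
    chunk = []
    for line in lines:
        chunk.append(line)
        if tag in line:
            # The tag line closes the current chunk.
            yield chunk
            chunk = []
    # Lines after the last tag are yielded too, so nothing is silently dropped.
    if chunk:
        yield chunk


# Hypothetical sample input, shaped like bob.txt
sample = ["<tag>\n", "....stuff1\n", "</tag>\n",
          "<tag>\n", "....stuff2\n", "</tag>\n"]
chunks = list(split_chunks(sample, "</tag>"))
```

Each chunk could then be written to its own numbered file, which would keep the parsing testable without touching the disk.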
If anyone could suggest a better approach or improvements to this, I would appreciate it. I think this looks OK (to me at least), and the only thing I might change is to make the number of leading zeros another option.
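For the leading-zeros option, one possible sketch is to let the format string take its width from an argument using `%0*d`. The helper name make_filename and the width parameter are assumptions for illustration; they do not exist in the script yet.

```python
def make_filename(base, num, ext, width=3):
    # '%0*d' reads the field width from the argument list,
    # so the amount of zero padding can be chosen at run time.
    return '%s%0*d.%s' % (base, width, num, ext)


default_name = make_filename('out', 12, 'txt')
wide_name = make_filename('out', 12, 'txt', width=5)
```

With the default width of 3 this should produce the same names as the current '%s%03d.%s' format.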