First let me say thanks to everyone who has been responding to my posts and providing valuable insight. I have been trying to add rep whenever possible, and appreciate your help.

My assignment was to write a Python program that reads data from an infile, has the user specify a column of that file, and then writes only the information from that column to an outfile. Thanks to help from you guys, I've also been able to include code that skips over blank lines instead of crashing. I'd like to optimize or change the code based on your suggestions. I suspect I've included too many operations and am being redundant in places. Here is my code, with comments that try to convey my level of understanding. Please add suggestions and point out where my comments are wrong.

def columns(infile, outfile):
	f = open(infile,'r')
	o = open(outfile,'w')

	col = int(raw_input('Please select a column (starting at 0) from your infile, %s:' % (infile)))

	temp = []   #Store an empty list
 
	for line in f:
		if not line.strip():   #If empty line, strip the new line character and remaining space 
			continue
		else:
			line = line.split() #Split lines so that they can be operated on via list operations
			temp.append(line[col])	 #Write the user-specified column into the empty list, "temp"
		       
	o.write('\n'.join(temp))   
					

	print "See %s" % (outfile)  
	
	f.close()
	o.close()

All 6 Replies

This isn't that big of a deal, but you could add col -= 1 after this:

col = int(raw_input('Please select a column (starting at 0) from your infile, %s:' % (infile)))

This way, you can drop the "starting at 0" part of the prompt, and users will enter "1" for the first column.
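
One thing to watch with the col -= 1 approach: if the user types 0, the index becomes -1, and Python will quietly pick the last column instead of complaining. A minimal sketch of a range check, assuming Python 3 and assuming the number of columns is known; to_index and num_columns are hypothetical names, not part of the original code:

```python
def to_index(user_text, num_columns):
    """Convert a 1-based column number typed by the user to a 0-based index."""
    number = int(user_text)  # raises ValueError for non-numeric input
    if not 1 <= number <= num_columns:
        # Reject 0 and negatives so they cannot wrap around to the last columns.
        raise ValueError("column must be between 1 and %d" % num_columns)
    return number - 1
```

For example, to_index("1", 3) gives index 0 for the first column, while to_index("0", 3) raises ValueError instead of silently selecting the last column.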

Some of your comments may be overkill. Here's how I'd revise your function:

* Added Billy's suggestion from above
* I like comments on separate lines unless they're extremely short
* Tabs are an absolute no-no in my book
* No need to say if something: continue ... else: when there's no other action in the loop; simply test for the opposite condition and put the work inside that if

def columns(infile, outfile):
    f = open(infile,'r')
    o = open(outfile,'w')

    col = raw_input('Please select a column from your input file, %s:' % (infile))
    col = int(col) - 1

    temp = []
    for line in f:
        line = line.strip()
        if line:
            # Split each line into columns
            line = line.split()
            # Write contents of user-specified columns to temp
            temp.append(line[col])
    o.write('\n'.join(temp))

    print "See %s" % (outfile)
    f.close()
    o.close()

You're using the list unnecessarily. You can write each value to the output file as you read it:

def columns(infile, outfile):
    f = open(infile,'r')
    o = open(outfile,'w')

    col = raw_input('Please select a column from your input file, %s:' % (infile))
    col = int(col) - 1
    for line in f:
        line = line.strip()
        if line:
            # Split each line into columns
            line = line.split()
            # Write the user-specified column straight to the output file
            o.write(line[col])
            o.write("\n")
    f.close()
    o.close() # this is the actual flushing
    print "See %s" % (outfile)
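
For comparison, here is a rough sketch of the same streaming idea, assuming Python 3 and assuming the column index is passed in as a 0-based argument rather than read from a prompt. The name write_column is hypothetical. The with statement closes both files even if a line raises an error, and split() with no arguments returns an empty list for blank or whitespace-only lines, so no separate strip() is needed:

```python
def write_column(infile, outfile, col):
    """Copy one whitespace-separated column (0-based) from infile to outfile.

    Returns the number of values written.
    """
    written = 0
    with open(infile, "r") as src, open(outfile, "w") as dst:
        for line in src:
            fields = line.split()  # [] for blank or whitespace-only lines
            if fields:
                dst.write(fields[col] + "\n")
                written += 1
    return written
```

Like the versions above, this still raises IndexError if a non-blank line has fewer columns than the one requested.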

If you're really looking to get fancy, use a list comprehension:

def columns(infile, outfile):
    f = open(infile,'r')
    o = open(outfile,'w')
    #
    col = raw_input('Please select a column from your input file, %s:' % (infile))
    col = int(col) - 1
    o.write('\n'.join([line.split()[col] for line in f if line.strip()]))
    #
    f.close()
    o.close() # this is the actual flushing
    print "See %s" % (outfile)
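
A comprehension like this is easier to check if it is pulled out into a small function that works on any iterable of lines. The following is only a sketch, assuming Python 3; extract_column is a hypothetical name, and skipping lines that have too few columns is an assumption on my part (you may prefer to raise an error instead):

```python
def extract_column(lines, col):
    """Return the col-th (0-based) whitespace-separated field of each line.

    Blank lines and lines with too few fields are skipped (an assumption).
    """
    rows = (line.split() for line in lines)  # generator: one field list per line
    return [fields[col] for fields in rows if len(fields) > col]


sample = ["a b c\n", "\n", "   \n", "d e f\n", "g\n"]
print(extract_column(sample, 1))  # prints ['b', 'e']
```

Since an open file is an iterable of lines, the result can be joined with '\n' and written out exactly as in the one-liner above.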