954,541 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

CSV Search Question

I have a problem that I've been puzzling with so I thought it was time to ask the experts.

I have two csv files that I am trying to search between. One csv has a Title column, and a URL column. The second has a variety of columns but a Title column that matches the Title column of the first CSV. I want to search the larger CSV to see if a record exists for a particular title and if so append the corresponding URL into a column at the end of the sheet.

I have been browsing the various csv reader properties but I can't seem to find too many examples. Would I use the find function along with csv reader to accomplish this?

betatype
Newbie Poster
8 posts since Jan 2009
Reputation Points: 10
Solved Threads: 0
 

I would read in the first csv into a dictionary with title- url key-value pairs. Then go through the second large one line by line, and append the looked up values, and write it to a new file.
The code goes something like this (not tested):

import csv
urlReader = csv.reader(open('url.csv'), delimiter=' ', quotechar='|')#FIXME filename delimiter quote etc...
url=dict()
for row in urlReader:
    url[row[0]]=row[1]
largeReader=csv.reader(open('large.csv'), delimiter=' ', quotechar='|')#FIXME filename delimiter quote etc...
largeWriter=csv.writer(open('large_out.csv',"w"), 
delimiter=' ', quotechar='|')#FIXME filename delimiter quote etc...
for row in largeReader:
    largeWriter.writerow(row+[url.get(row[???]),""]#FIXME ???=title column index
slate
Posting Whiz in Training
252 posts since Jun 2008
Reputation Points: 72
Solved Threads: 66
 

The dictionary suggestion was exactly what I needed. I've included the script that I got to work for me. This takes two semicolon delimited files in the format discussed above. It then searches a value from the larger csv against a smaller file with just a "Title" column and a "URL" column.

import csv
smallFile = csv.reader(open('/home/user/Desktop/smallfile.csv'), delimiter=';')
bigFile = csv.reader(open('/home/user/Desktop/bigfile.csv'), delimiter=';')
url=dict()
for row in smallFile:
	url[row[0]]=row[1]
for row in bigFile:
	i=0
	result=row[i]
	if result in url:
		print result+";"+url[result]
	else:
		print result+";"+"no corresponding url"
	i+=1
betatype
Newbie Poster
8 posts since Jan 2009
Reputation Points: 10
Solved Threads: 0
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You