I modified the script to batch save .xlsx files to .csv. The original was a tremendous help, so I thought I'd share my code here.
You'll need Python for Windows (I used 2.6) and the pywin32 extensions, both easily downloadable off the internet. Install them on the Windows PC that has Excel 2007 on it. Save the following script as "xls2csv.py" in the directory that contains all your .XLSX files. This script will not delete the original .XLSX files.
# You must have Python for Windows (I used 2.6) and pywin32 extensions
# installed and Excel 2007 on a Windows PC
# Put this script into the dir where all the .XLSX files are and then cd to that dir
# Usage: c:\python26\python.exe xls2csv.py
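import glob
import os
import time
import win32com.client

XL_CSV = 6  # numeric value of Excel's xlCSV file-format constant

xlsx_files = glob.glob('*.xlsx')
if len(xlsx_files) == 0:
    raise RuntimeError('No XLSX files to convert.')
xlApp = win32com.client.Dispatch('Excel.Application')
for file in xlsx_files:
    xlWb = xlApp.Workbooks.Open(os.path.join(os.getcwd(), file))
    # Save next to the original, swapping the .xlsx extension for .csv
    xlWb.SaveAs(os.path.join(os.getcwd(), file.split('.xlsx')[0] + '.csv'),
                FileFormat=XL_CSV)
    xlWb.Close(SaveChanges=False)  # close without prompting to save again
xlApp.Quit()
time.sleep(2)  # give Excel time to quit, otherwise files may be locked
# Uncomment the two lines below if you want the script to remove
# the orig .xlsx files when done
#for file in xlsx_files:
#    os.remove(file)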
To the Linux users who don't have access to those Windows modules: .xlsx files are basically zip archives containing XML files that hold the data. So you could write a program that unzips the .xlsx file and passes the XML files in the "xl/worksheets" folder to your preferred XML parser. I used the xml.dom.minidom module, and I've also tested it with the etree module.
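
Here is a rough sketch of that unzip-and-parse idea using only the standard library (zipfile plus xml.etree.ElementTree). It is not the code I described above, just one way it might look, and it assumes Python 3. The function names (read_shared_strings, read_sheet_rows, xlsx_to_csv) are made up for this example, and the default sheet path "xl/worksheets/sheet1.xml" is an assumption about how the workbook names its first sheet. One thing to know: text cells usually don't hold the text itself but an index into "xl/sharedStrings.xml", so the sketch looks that up as well.

```python
import csv
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by the worksheet and shared-string parts of an .xlsx file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_shared_strings(archive):
    """Return the workbook's shared-string table as a list (empty if absent)."""
    if "xl/sharedStrings.xml" not in archive.namelist():
        return []
    root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    strings = []
    for si in root.findall("m:si", NS):
        # Plain entries use <si><t>; rich-text entries split it over <r><t> runs.
        parts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        strings.append("".join(t.text or "" for t in parts))
    return strings


def read_sheet_rows(xlsx, sheet="xl/worksheets/sheet1.xml"):
    """Return one worksheet's cell values as a list of rows of strings.

    `xlsx` may be a file name or a file-like object. The default `sheet`
    path is an assumption; list the archive to see the real sheet names.
    """
    with zipfile.ZipFile(xlsx) as archive:
        strings = read_shared_strings(archive)
        root = ET.fromstring(archive.read(sheet))
    rows = []
    for row in root.iterfind("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            value = cell.findtext("m:v", default="", namespaces=NS)
            if cell.get("t") == "s" and value:
                value = strings[int(value)]  # "s" cells store an index
            values.append(value)
        rows.append(values)
    return rows


def xlsx_to_csv(xlsx, csv_path, sheet="xl/worksheets/sheet1.xml"):
    """Write the rows of one worksheet to a CSV file."""
    with open(csv_path, "w", newline="") as handle:
        csv.writer(handle).writerows(read_sheet_rows(xlsx, sheet))
```

Calling xlsx_to_csv('report.xlsx', 'report.csv') (file names made up) would then do the conversion without Excel. It is deliberately simplified: empty cells are left out of the sheet XML, so columns can shift left in rows with gaps, and dates come through as Excel's serial numbers rather than formatted text.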