I am looking for a good example of the tarfile module, writing to and reading from an archive. I am particularly interested in the highly compressed filename.tar.bz2 format.
Here is an example:
import tarfile

# uncompressed            = "w"      use extension .tar
# gzip compressed         = "w:gz"   ditto .tar.gz
# bzip2 super compressed  = "w:bz2"  ditto .tar.bz2
tar = tarfile.open("sample.tar.bz2", "w:bz2")
# turn three regular files into a tar file archive
for name in ["test1.py", "test2.py", "test3.py"]:
    tar.add(name)
tar.close()

# read the tarfile
tar = tarfile.open("sample.tar.bz2", "r:bz2")
file_list = []
for file in tar:
    print file.name, file.size
    file_list.append(file.name)

# another way to get the file list
file_list2 = tar.getnames()

print file_list
print file_list2

# pick one of the three files in the tarball/tar-archive (here the first one)
filename = file_list[0]
# decompress the particular file
data = tar.extractfile(filename).read()
print "Content of file %s in the tar-archive:" % filename
print data

tar.close()
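For anyone on Python 3, where print is a function, the same write-then-read round trip might look roughly like the sketch below. The function name roundtrip_demo, the temporary working directory, and the generated file contents are my own assumptions for illustration; the tarfile calls and the "w:bz2" / "r:bz2" modes are the same ones used above.

```python
import os
import tarfile
import tempfile


def roundtrip_demo(workdir):
    """Create three small files, pack them into a .tar.bz2, and read them back."""
    names = ["test1.py", "test2.py", "test3.py"]
    for name in names:
        # newline="\n" keeps the written bytes identical on every platform
        with open(os.path.join(workdir, name), "w", newline="\n") as f:
            f.write("print('hello from %s')\n" % name)

    archive = os.path.join(workdir, "sample.tar.bz2")

    # "w:bz2" writes a bzip2-compressed archive; arcname stores the bare
    # file name instead of the full temporary path.
    with tarfile.open(archive, "w:bz2") as tar:
        for name in names:
            tar.add(os.path.join(workdir, name), arcname=name)

    with tarfile.open(archive, "r:bz2") as tar:
        listed = tar.getnames()
        # extractfile() returns a file-like object for a regular file
        first = tar.extractfile(listed[0]).read().decode("utf-8")
    return listed, first


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        listed, first = roundtrip_demo(tmp)
        print(listed)
        print(first)
```

The with blocks close the archive even if something fails partway through, which replaces the explicit tar.close() calls.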
Works like a charm, got to keep playing with it.
Thanks for the example, VE. I have a .tar.bz2 file I need to read within Python and was able to take your code and use it. I've put together a little .tar.bz2 extraction example in case others follow the same path and wind up here. This is for Python 2.5 on WinXP Pro; it should mostly work under Linux too, except for the WindowsError handling, which is Windows-only.
import os
import tarfile

tar = tarfile.open("MyTarFile.tar.bz2", "r:bz2")  # Replace MyTarFile with the right name
file_list = tar.getnames()
for fn in file_list:  # Filenames
    xfile = tar.extractfile(fn)
    if xfile:  # File object if data file, None if directory (apparently)
        data = xfile.read()
        fo = open(fn, "wb")
        if fo:
            print fn
            fo.write(data)
            fo.close()
        else:
            print "Error opening output file %s" % fn
    else:  # ASSuME xfile None because filename is a directory
        try:
            os.mkdir(fn)  # Also ASSuME higher directories show up first
        except WindowsError, e:
            # Compare the Windows error code, not the exception object itself
            if e.winerror == 183:  # This happens when you try to re-make an existing directory
                continue  # Ignore duplicate directory
            else:
                print repr(e)
                raise  # Re-raise the original exception
tar.close()
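A portable take on the same extraction loop, sketched for Python 3, could look something like this. The function name extract_tar_bz2 and the dest_dir parameter are hypothetical additions of mine. It asks each member whether it is a directory or a regular file instead of guessing from a None result, and uses os.makedirs(..., exist_ok=True) so an existing directory is not an error on any platform. It also skips entries whose names would land outside the destination directory, which is an extra precaution that is not in the code above.

```python
import os
import tarfile


def extract_tar_bz2(archive_path, dest_dir):
    """Extract regular files and directories from a .tar.bz2 into dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    written = []
    with tarfile.open(archive_path, "r:bz2") as tar:
        for member in tar:
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Skip entries that would end up outside dest_dir (e.g. "../x")
            if target != dest_root and not target.startswith(dest_root + os.sep):
                continue
            if member.isdir():
                os.makedirs(target, exist_ok=True)
            elif member.isfile():
                # Create parent directories even if the archive lists the
                # file before its directory
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
                written.append(member.name)
            # Links and other special members are ignored in this sketch
    return written
```

Calling extract_tar_bz2("MyTarFile.tar.bz2", "out") would return the names of the files it wrote. If you do not need per-file control, tar.extractall() does the whole job in one call.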
OT mini-rant: I'd been googling in vain for an example using the bz2 module to decompress a .tar.bz2 file, and not had any luck. The bz2 documentation is lacking (in my opinion) as it doesn't precisely describe where the bz2.decompress() input data comes from. I tried all the obvious alternatives but none of them worked, hence the tarfile module instead. I only mention this for the benefit of the next person with the same problem.
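For the next person with the same problem, here is a minimal Python 3 sketch of how the bz2 module fits in, as far as I understand it. bz2.decompress() expects the raw compressed bytes of the whole .tar.bz2 file, and what comes back is an uncompressed tar stream rather than the contents of any one file, so tarfile is still needed to split it into members. The function name list_tar_bz2_via_bz2 is made up for this example.

```python
import bz2
import io
import tarfile


def list_tar_bz2_via_bz2(archive_path):
    """Decompress a .tar.bz2 with bz2, then parse the result with tarfile."""
    # The input to bz2.decompress() is simply the file's bytes, read in
    # binary mode
    with open(archive_path, "rb") as f:
        compressed = f.read()
    tar_bytes = bz2.decompress(compressed)

    # tar_bytes is a plain tar archive held in memory; mode "r:" tells
    # tarfile to read it without any decompression of its own
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return {m.name: m.size for m in tar if m.isfile()}
```

This reads the whole archive into memory, so for large files tarfile.open(name, "r:bz2") is likely the better choice, since it handles the decompression step itself.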