So, I wrote a little script to convert TIFs into JPEGs. The TIFs are about 600 MB each; I shrink them and then convert them. Here is the code in its entirety:

## First we import the image library, Image object, and garbage
## collector, as well as os
from PIL import Image
import gc, os
## Get the path for the directory to be converted, then change to that path
print "Please enter the path to the files that you wish to convert."
path = raw_input()
os.chdir(path)
## For each file in every subdirectory, see if it's a tif file
for root, dir, files in os.walk(path):
    for name in files:
        if name[-4:] == ".tif":
            print 'Opening ' + name
            os.chdir(path)
            im = Image.open(root + '/' + name)
            x, y = im.size
            ## Resize the tiff tile at a 2/3 scale.  Make a new directory to
            ## mimic the file hierarchy of the original file, only on the C
            ## drive instead of wherever it was to begin with.
            print 'Resizing'
            im2 = im.resize((int(x*.66), int(y*.66)), Image.ANTIALIAS)
            n = 'c' + root[1:]
            savedfile = n +"/jpegs/"
            try:
                os.makedirs(savedfile)
                os.chdir(savedfile)
            except WindowsError:
                os.chdir(savedfile)
            savedfile = name[:-4] + ".jpg"
            ## Save the file as a jpg, with a high quality
            print 'Saving'
            im2.save(savedfile, quality=85)
            del im
            del im2
            ## Force a memory dump.  Otherwise memory will get cluttered
            ## up, very quickly too.
            gc.collect()
            print 'Memory Wiped'

Due to space limitations in the workplace, I need it to copy the directory structure onto the C drive of the computer the script is running on, hence the funky directory dance. I'm still kinda new to all of this, and I thought the garbage collector would help me out. It does free memory at first, but usage still creeps up over time. After converting about 30-40 images (and I need to convert literally tens of thousands), the program crashed with a MemoryError. Page file usage was at about 2 GB. This PC has about 3 GB of RAM and is running XP. Any tips or advice?
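The directory dance described above could also be done by computing the mirrored folder up front, without any os.chdir calls. The following is only a sketch: the helper name mirrored_jpeg_dir and its target_drive parameter are made up for illustration, and it assumes root is an absolute path that starts with a drive letter. It uses ntpath (the module that os.path refers to on Windows) so the result is the same wherever it runs.

```python
import ntpath


def mirrored_jpeg_dir(root, target_drive="C:"):
    """Map a source directory to the same path on target_drive, plus a jpegs subfolder.

    Hypothetical helper; assumes root is an absolute drive-letter path.
    """
    # Split off whatever drive the source lives on and keep the rest.
    _drive, tail = ntpath.splitdrive(root)
    # Re-root the path on the target drive and tidy the separators.
    return ntpath.normpath(ntpath.join(target_drive + tail, "jpegs"))
```

The result can be handed to os.makedirs, and the JPEG can then be saved with a full path built by os.path.join, so the current directory never has to change.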

All 5 Replies

Well, a quick tangential note: your method for extracting the extension

name[-4:]

is buggy. What if the file extension is ".jpeg"?

Better:

filename, ext = os.path.splitext(name)
if ext == ".tif":
    # do stuff ...
    savedfile = filename + ".jpg"
    # do more stuff ...
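As a runnable version of the same idea, here is a minimal sketch. The function name jpeg_name_for is hypothetical, and accepting ".tiff" and upper-case extensions is an assumption that goes beyond the original script, which only matches a lower-case ".tif".

```python
import os


def jpeg_name_for(name):
    """Return the .jpg name for a TIFF file name, or None for any other file."""
    stem, ext = os.path.splitext(name)
    # Compare case-insensitively so "SCAN.TIF" is treated like "scan.tif".
    if ext.lower() in (".tif", ".tiff"):
        return stem + ".jpg"
    return None
```

Anything that is not a TIFF comes back as None, so the caller can simply skip it.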

WRT the main problem, the memory error: I agree that there's something wrong, and it doesn't appear to be your code.

You might try e-mailing the PIL support at

image-sig@python.org

Jeff

If it's anything but a tif, it'll be ignored, as I don't need to do anything with it. However, I think that the method you showed me is much more practical than what I've got (I wasn't aware of that tool to grab the extension name). So, I'll be modifying it. Thanks for the tip. And I'll shoot them an email. Thanks for the help, really. Happy to know that I at least had the code right.

Hi Edwards8,

It turns out that PIL will let you read pieces of an image file in sequence without loading the entire thing into memory. Fredrik Lundh has a post about it on a Python mailing list here. I don't know if this will work for TIFs, but you can always give it a shot.

It sounds like some piece of the file is being kept resident in memory, despite your deleting and garbage collection - and from the sizes you quoted, it looks to be about a tenth of each file. Maybe PIL is reading the files in pieces, and hanging onto the last piece of each? Or maybe the file is being kept open in some way, even though you delete the image object? One thing you could try is to create a file handle on the image file, then use Image.open() on that, instead of the file name. Then, when you are finished, close the original file handle. This may convince PIL to let go of whatever data it's holding onto.
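The file-handle idea might look something like the sketch below. It is untested against 600 MB TIFFs, so whether it actually cures the MemoryError is an open question. The function name tif_to_jpeg and its parameters are made up for illustration, and the code assumes a current Pillow release, where the filter the original script calls Image.ANTIALIAS is spelled Image.LANCZOS.

```python
from PIL import Image


def tif_to_jpeg(src_path, dst_path, scale=0.66, quality=85):
    """Resize one TIFF and save it as a JPEG, closing the source file explicitly.

    Hypothetical helper; returns the size of the saved image.
    """
    # Open the file ourselves so that we, not PIL, decide when it is closed.
    with open(src_path, "rb") as fh:
        im = Image.open(fh)
        width, height = im.size
        new_size = (max(1, int(width * scale)), max(1, int(height * scale)))
        small = im.resize(new_size, Image.LANCZOS)
        # JPEG cannot store alpha or palette modes, so normalise to RGB first.
        small.convert("RGB").save(dst_path, "JPEG", quality=quality)
    # Leaving the with-block closes fh; im and small are dropped on return.
    return new_size
```

Because the source file is closed by the with-block rather than by garbage collection, nothing should keep it open once the function returns.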

Unfortunately, I don't know a whole lot about PIL, so I'm shooting in the dark, here.

Hope this helps!

Try moving everything after the line 'if name[-4:] == ".tif":' into a separate function. The image objects then go out of scope when the function returns, so their memory should be freed.
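A rough sketch of that restructuring: the loop only finds files and hands each path to a per-file function, so the image objects are locals of that function and nothing from one file is still referenced when the next one starts. The names find_tifs, convert_tree and convert_one are hypothetical; convert_one stands for whatever function holds the open, resize and save steps.

```python
import os


def find_tifs(top):
    """Yield the full path of every .tif file under top (case-insensitive)."""
    for root, _dirs, files in os.walk(top):
        for name in files:
            if os.path.splitext(name)[1].lower() == ".tif":
                yield os.path.join(root, name)


def convert_tree(top, convert_one):
    """Call convert_one(path) once per TIFF under top and return the count."""
    count = 0
    for path in find_tifs(top):
        # All per-file objects live inside convert_one and are released
        # when it returns; this loop only ever holds a path string.
        convert_one(path)
        count += 1
    return count
```

This only changes where the per-file work lives; it does not by itself prove the leak is gone, so it is still worth watching memory use over the first few dozen files.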
