| | |
Python and the JPEG Image File, Part 1, The Header
Thread Solved |
Python and the JPEG Image File, Part 1, The Header
Intro
The JPEG image file format (.jpg) is very popular on the internet because you can pack a lot of picture information into a relatively small file. There are competing file formats like GIF and PNG. GIF is rather limited to the number of colors (8 bit = 256) compared to JPEG (24 bit = 16,777,216). JPEG can typically achieve 10:1 to 20:1 compression without visible loss. It allows you to specify a quality setting (1 - 100) with higher quality giving less compression.
JPEG header
Sharp edges in images like borders and embedded text give JPEG a hard time, that's why you can insert a text comment directly into the JPEG file header. As the name implies, the header is the first part of the JPEG file, followed by the compressed picture. In this part of the tutorial we take a look at the header of the a typical JPEG file and extract some information it contains. To make things simple, I have attached a sample JPEG image file that contains a 80x80 blue square at a resolution of 200dpi. This file also contains a text comment that can be extracted. Blue is the favorite color of vegaseat.
Intro
The JPEG image file format (.jpg) is very popular on the internet because you can pack a lot of picture information into a relatively small file. There are competing file formats like GIF and PNG. GIF is rather limited to the number of colors (8 bit = 256) compared to JPEG (24 bit = 16,777,216). JPEG can typically achieve 10:1 to 20:1 compression without visible loss. It allows you to specify a quality setting (1 - 100) with higher quality giving less compression.
JPEG header
Sharp edges in images like borders and embedded text give JPEG a hard time, that's why you can insert a text comment directly into the JPEG file header. As the name implies, the header is the first part of the JPEG file, followed by the compressed picture. In this part of the tutorial we take a look at the header of the a typical JPEG file and extract some information it contains. To make things simple, I have attached a sample JPEG image file that contains a 80x80 blue square at a resolution of 200dpi. This file also contains a text comment that can be extracted. Blue is the favorite color of vegaseat.
python Syntax (Toggle Plain Text)
# print out the hex bytes of a jpeg file, find end of header, image size, and extract any text comment # (JPEG = Joint Photographic Experts Group) # tested with Python24 vegaseat 21sep2005 try: # the sample jpeg file is an "all blue 80x80 200dpi image, saved at a quality of 90" # with the quoted comment added imageFile = 'Blue80x80x200C.JPG' data = open(imageFile, "rb").read() except IOError: print "Image file %s not found" % imageFile raise SystemExit # initialize empty list hexList = [] for ch in data: # make a hex byte byt = "%02X" % ord(ch) hexList.append(byt) #print hexList # test print "hex dump of a 80x80 200dpi all blue jpeg file:" print "(the first two bytes FF and D8 mark a jpeg file)" print "(index 6,7,8,9 spells out the subtype JFIF)" k = 0 for byt in hexList: # add spacer every 8 bytes if k % 8 == 0: print " ", # new line every 16 bytes if k % 16 == 0: print byt, k += 1 print "-"*50 # the header goes from FF D8 to the first FF C4 marker for k in range(len(hexList)-1): if hexList[k] == 'FF' and hexList[k+1] == 'C4': print "end of header at index %d (%s)" % (k, hex(k)) break # find pixel width and height of image # located at offset 5,6 (highbyte,lowbyte) and 7,8 after FF C0 or FF C2 marker for k in range(len(hexList)-1): if hexList[k] == 'FF' and (hexList[k+1] == 'C0' or hexList[k+1] == 'C2'): #print k, hex(k) # test height = int(hexList[k+5],16)*256 + int(hexList[k+6],16) width = int(hexList[k+7],16)*256 + int(hexList[k+8],16) print "width = %d height = %d pixels" % (width, height) # find any comment inserted into the jpeg file # marker is FF FE followed by the highbyte/lowbyte of comment length, then comment text comment = "" for k in range(len(hexList)-1): if hexList[k] == 'FF' and hexList[k+1] == 'FE': #print k, hex(k) # test length = int(hexList[k+2],16)*256 + int(hexList[k+3],16) #print length # test for m in range(length-3): comment = comment + chr(int(hexList[k + m + 4],16)) #print chr(int(hexList[k + m + 4],16)), # test #print hexList[k + m + 4], # test if len(comment) > 0: print comment else: print "No comment"
Last edited by vegaseat; Aug 12th, 2009 at 5:17 pm. Reason: Formatting
May 'the Google' be with you!
![]() |
Similar Threads
- How to load a JPEG image file, store it in array and then save it (Visual Basic 4 / 5 / 6)
- Extract header info from image file (Python)
- no WxPython here? (Python)
- jpg with Python (Python)
- Python and the JPEG Image File, Part 2, The Image (Python)
Other Threads in the Python Forum
- Previous Thread: Comparing Python and C, Part 1, Integer Variables
- Next Thread: How to parse patent data from html's for dummy
| Thread Tools | Search this Thread |
abrupt alarm ansi anti approximation assignment avogadro backend beginner binary bluetooth calculator character cmd customdialog cx-freeze data decimals dictionaries dictionary directory dynamic error exe file float format function generator gnu graphics halp heads homework http ideas import input itunes java leftmouse line linux list lists loop maze millimeter module mouse number numbers output parsing path pointer prime programming progressbar push py2exe pygame python random recursion schedule screensaverloopinactive script scrolledtext slicenotation sqlite ssh statistics string strings sudokusolver sum table terminal text thread threading time tlapse tuple tutorial twoup ubuntu unicode urllib urllib2 variable ventrilo vigenere web webservice wikipedia write wxpython xlib






