I'm working on a program that takes ISBN numbers, grabs data on them from ISBNdb.com and sticks said data into a database.

At this point, I've successfully managed to get the ISBN XML data from isbndb.com, but whenever I try to feed it into SAX, the parser seems to treat the XML text as a URL, tries to retrieve it from the web, and then crashes.

In any case, here's the main.py

import urllib
import sys
from string import split

from xml.sax  import make_parser 
from Handler import BookHandler 

ISBNdb_Act = "J6JNO9UH"

while 1:
	bc = raw_input("Scan Barcode\n")
	print bc
	strQuery = "http://isbndb.com/api/books.xml?access_key="+ISBNdb_Act+"&index1=isbn&value1="+bc
	print strQuery
	print ("\n")
	ISBNdb_fh = urllib.urlopen(strQuery)
	ISBNdb_XML = ISBNdb_fh.read()
	print ISBNdb_XML
	ch = BookHandler( ) 
	saxparser = make_parser( ) 
	saxparser.parse(ISBNdb_XML)

	print "Done!\n\n"

The Handler

from xml.sax.handler import ContentHandler 

class BookHandler(ContentHandler):

	def startElement(self, name, attributes):
		print "Start element:", name

And what I get on the command line

Scan Barcode

<?xml version="1.0" encoding="UTF-8"?>

<ISBNdb server_time="2008-09-10T06:21:22Z">
<BookList total_results="1" page_size="10" page_number="1" shown_results="1">
<BookData book_id="understanding_power" isbn="1565847032">
<Title>Understanding power</Title>
<TitleLong>Understanding power: the indispensable Chomsky</TitleLong>
<AuthorsText>edited by Peter R. Mitchell and John Schoeffel</AuthorsText>
<PublisherText publisher_id="new_press">New York : New Press, c2002.</PublisherText>

Traceback (most recent call last):
  File "C:\BookDb\Main.py", line 26, in <module>
  File "C:\Python25\lib\xml\sax\expatreader.py", line 102, in parse
    source = saxutils.prepare_input_source(source)
  File "C:\Python25\lib\xml\sax\saxutils.py", line 298, in prepare_input_source
    f = urllib.urlopen(source.getSystemId())
  File "C:\Python25\lib\urllib.py", line 82, in urlopen
    return opener.open(url)
  File "C:\Python25\lib\urllib.py", line 187, in open
    return self.open_unknown(fullurl, data)
  File "C:\Python25\lib\urllib.py", line 199, in open_unknown
    raise IOError, ('url error', 'unknown url type', type)
IOError: [Errno url error] unknown url type: '?xml version="1.0" encoding="utf-8"?>\n\n<isbndb server_time="2008-09-10t06'


System is Windows XP SP3 with
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32

I'm not the best coder around (by a long shot), but this is so simple I cannot see where it's going wrong.

On the other hand, I've got the barcode scanner front-end working perfectly; I'm better with hardware.

From the documentation

parse( source)
Process an input source, producing SAX events. The source object can be a system identifier (a string identifying the input source - typically a file name or an URL), a file-like object, or an InputSource object. When parse() returns, the input is completely processed, and the parser object can be discarded or reset. As a limitation, the current implementation only accepts byte streams; processing of character streams is for further study.

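So it looks like you can't simply pass the XML text itself: a plain string is treated as a system identifier (a file name or URL), which is why the traceback shows parse() handing it to urllib.urlopen(). A quick workaround would be to save ISBNdb_XML to a temporary file and then pass the file path as the parameter to parse().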
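
Here is a rough sketch of that workaround, plus an alternative that skips the disk by wrapping the data in a file-like object, which the documentation quoted above also allows. Treat it as an illustration rather than a drop-in fix: NameCollector, parse_via_temp_file, parse_in_memory and SAMPLE are names made up for this example, SAMPLE is a trimmed copy of the response in your post with closing tags added so it is well-formed, and the code is written for a current Python. On Python 2.5, StringIO.StringIO would stand in for io.BytesIO, and the with statement would need "from __future__ import with_statement".

```python
import io
import os
import tempfile
from xml.sax import make_parser
from xml.sax.handler import ContentHandler


class NameCollector(ContentHandler):
    """Hypothetical handler: records element names instead of printing them."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attributes):
        self.names.append(name)


def parse_via_temp_file(xml_bytes):
    """Write the XML to a temporary file and give parse() the file path."""
    handler = NameCollector()
    parser = make_parser()
    parser.setContentHandler(handler)
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(xml_bytes)
        # A string argument is a system identifier, so a real path works.
        parser.parse(path)
    finally:
        os.remove(path)
    return handler.names


def parse_in_memory(xml_bytes):
    """Wrap the bytes in a file-like object so no file is needed."""
    handler = NameCollector()
    parser = make_parser()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


# Trimmed, well-formed stand-in for the ISBNdb response (assumption).
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<ISBNdb server_time="2008-09-10T06:21:22Z">\n'
    b'<BookList total_results="1" page_size="10" page_number="1" shown_results="1">\n'
    b'<BookData book_id="understanding_power" isbn="1565847032">\n'
    b'<Title>Understanding power</Title>\n'
    b'</BookData>\n'
    b'</BookList>\n'
    b'</ISBNdb>\n'
)
```

In your loop, either function would take ISBNdb_XML in place of SAMPLE. Both return the element names in document order, so you can check that the handler is actually being called before you add the database code.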
