Ok...

I'm working on a program that takes ISBN numbers, grabs data on them from ISBNdb.com and sticks said data into a database.

At this point, I've successfully managed to get the ISBN XML file from isbndb.com, but whenever I try to feed it into SAX, it seems to treat the XML text as a URL, tries to retrieve it from the web, and then crashes.

In any case, here's the main.py

import urllib
import sys
from string import split

from xml.sax  import make_parser 
from Handler import BookHandler 

ISBNdb_Act = "J6JNO9UH"

while 1:
	bc = raw_input("Scan Barcode\n")
	print bc
	strQuery = "http://isbndb.com/api/books.xml?access_key="+ISBNdb_Act+"&index1=isbn&value1="+bc
	print strQuery
	print ("\n")
	ISBNdb_fh = urllib.urlopen(strQuery)
	ISBNdb_XML = ISBNdb_fh.read()
	print ISBNdb_XML
	
	
	
	ch = BookHandler( ) 
	saxparser = make_parser( ) 
 
	saxparser.setContentHandler(ch) 
	saxparser.parse(ISBNdb_XML) 

		
	"""
	print "Done!\n\n"
	"""

The Handler

from xml.sax.handler import ContentHandler 

class BookHandler(ContentHandler):

	def startElement(self, name, attributes):
		print "Start element:", name

And what I get on the command line

C:\BookDb>main.py
Scan Barcode
9781565847033
9781565847033
http://isbndb.com/api/books.xml?access_key=J6JNO9UH&index1=isbn&value1=9781565847033


<?xml version="1.0" encoding="UTF-8"?>

<ISBNdb server_time="2008-09-10T06:21:22Z">
<BookList total_results="1" page_size="10" page_number="1" shown_results="1">
<BookData book_id="understanding_power" isbn="1565847032">
<Title>Understanding power</Title>
<TitleLong>Understanding power: the indispensable Chomsky</TitleLong>
<AuthorsText>edited by Peter R. Mitchell and John Schoeffel</AuthorsText>
<PublisherText publisher_id="new_press">New York : New Press, c2002.</PublisherText>
</BookData>
</BookList>
</ISBNdb>

Traceback (most recent call last):
  File "C:\BookDb\Main.py", line 26, in <module>
    saxparser.parse(ISBNdb_XML)
  File "C:\Python25\lib\xml\sax\expatreader.py", line 102, in parse
    source = saxutils.prepare_input_source(source)
  File "C:\Python25\lib\xml\sax\saxutils.py", line 298, in prepare_input_source
    f = urllib.urlopen(source.getSystemId())
  File "C:\Python25\lib\urllib.py", line 82, in urlopen
    return opener.open(url)
  File "C:\Python25\lib\urllib.py", line 187, in open
    return self.open_unknown(fullurl, data)
  File "C:\Python25\lib\urllib.py", line 199, in open_unknown
    raise IOError, ('url error', 'unknown url type', type)
IOError: [Errno url error] unknown url type: '?xml version="1.0" encoding="utf-8"?>\n\n<isbndb server_time="2008-09-10t06'

C:\BookDb>

System is Windows XP SP3 with
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32

I'm not the best coder around (by a long shot), but this is so simple I cannot see where it's going wrong.

On the other hand, I've got the barcode scanner front-end working perfectly; I'm better with hardware.

Last Post by jlm699

From the documentation

parse( source)
Process an input source, producing SAX events. The source object can be a system identifier (a string identifying the input source - typically a file name or an URL), a file-like object, or an InputSource object. When parse() returns, the input is completely processed, and the parser object can be discarded or reset. As a limitation, the current implementation only accepts byte streams; processing of character streams is for further study.

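So it looks like you can't simply pass the XML text itself: parse() treats a plain string as a system identifier (a file name or URL), which is why the traceback shows it trying to open your XML as a URL. A quick workaround would be to save ISBNdb_XML to a temporary file and then pass the file path to parse().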
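
As a rough sketch of that workaround, plus two in-memory alternatives that follow from the documentation quoted above (parse() also accepts a file-like object), something like the following should work. It is written for a current Python 3 interpreter rather than the 2.5 used in the question. SAMPLE_XML is abridged from the output posted above, the BookHandler here collects names instead of printing them, and the three helper function names are made up for illustration.

```python
import io
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler


class BookHandler(ContentHandler):
    """Collects element names in a list so the result can be inspected."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attributes):
        self.names.append(name)


# Abridged copy of the response shown in the question (assumption: bytes input).
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<ISBNdb server_time="2008-09-10T06:21:22Z">
<BookList total_results="1" page_size="10" page_number="1" shown_results="1">
<BookData book_id="understanding_power" isbn="1565847032">
<Title>Understanding power</Title>
</BookData>
</BookList>
</ISBNdb>
"""


def names_via_temp_file(xml_bytes):
    """The workaround from the reply: write the text to a file, parse the path."""
    tmp = tempfile.NamedTemporaryFile(suffix=".xml", delete=False)
    try:
        tmp.write(xml_bytes)
    finally:
        tmp.close()  # close before parsing so the file can be reopened on Windows
    try:
        handler = BookHandler()
        parser = xml.sax.make_parser()
        parser.setContentHandler(handler)
        parser.parse(tmp.name)  # a string is taken as a file name or URL
        return handler.names
    finally:
        os.remove(tmp.name)


def names_via_stream(xml_bytes):
    """Alternative: wrap the bytes in a file-like object, no file on disk."""
    handler = BookHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


def names_via_parse_string(xml_bytes):
    """Alternative: xml.sax.parseString does the wrapping itself."""
    handler = BookHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.names


if __name__ == "__main__":
    print(names_via_temp_file(SAMPLE_XML))
    print(names_via_stream(SAMPLE_XML))
    print(names_via_parse_string(SAMPLE_XML))
```

In the original main.py, the object returned by urllib.urlopen() is already file-like, so passing ISBNdb_fh to parse() instead of calling read() first would probably avoid the temporary file as well. On Python 2.5 the io module is not available; StringIO.StringIO would be the likely stand-in for io.BytesIO.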
