Hi all,
I am trying to use LWP::UserAgent to send a SOAP request whose response contains a TIFF image. When I write the response content out to a file (in binary mode), the file also contains header information. How would I go about extracting just the body (the TIFF image) of the response?


use strict;
use Data::Dumper;
use LWP::UserAgent;

my $type = "TIFF";
my $ua = LWP::UserAgent->new;
my $service = "http://ops.epo.org//soap-services/document-retrieval";

my $content = '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ops="http://ops.epo.org">';
$content = $content . '<soapenv:Header/>';
$content = $content . '<soapenv:Body>';
$content = $content . "<ops:document-retrieval id=\"EP        1000000A1 I \" page-number=\"1\" document-format=\"SINGLE_PAGE_$type\" system=\"ops.epo.org\">";
$content = $content . '</ops:document-retrieval>';
$content = $content . '</soapenv:Body>';
$content = $content . '</soapenv:Envelope>';

my $header = HTTP::Headers->new(
        'Content-Type'   => 'text/xml; charset=utf-8',
        'SOAPAction'     => 'document-retrieval',
);

my $req = HTTP::Request->new('POST', $service, $header, $content);
my $res = $ua->request($req);
print "request string:\n". $req->as_string."\n";

print "content type: ". $res->content_type."\n";
print "header = ". $res->headers_as_string()."\n";

#my $response = $res->headers_as_string();
#my $response .= $res->content;
#print "---response---\n$response\n";

#my ($body,$mime);
#   $mime = $res->parts([1]);
   #$body = $mime->body_handle();
#if ($@)
#{ die "error: $@\n"; }

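if ($res->is_success) {
  my $dlfile = "file.$type";
  open(my $out, '>', $dlfile) or die "whoops $!";
  binmode $out;
  # Note: this writes the whole multipart body, part headers included.
  print {$out} $res->content;
  close $out;
}
else { warn "request failed...\n"; }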


request string:
POST http://ops.epo.org//soap-services/document-retrieval
User-Agent: libwww-perl/5.834
Content-Type: text/xml; charset=utf-8
SOAPAction: document-retrieval

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ops="http://ops.epo.org"><soapenv:Header/><soapenv:Body><ops:document-retrieval id="EP 1000000A1 I " page-number="1" document-format="SINGLE_PAGE_TIFF" system="ops.epo.org"></ops:document-retrieval></soapenv:Body></soapenv:Envelope>

content type: multipart/related
header = Connection: close
Date: Wed, 14 Apr 2010 13:31:58 GMT
Accept: text/xml, text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Server: Apache
Content-Type: multipart/related; boundary="----=_Part_901910_514465450.1271251918535"; type="text/xml"
Client-Date: Wed, 14 Apr 2010 13:32:00 GMT
Client-Response-Num: 1
Client-Transfer-Encoding: chunked
SOAPAction: ""
X-Powered-By: Servlet 2.4; JBoss-4.3.0.GA (build: SVNTag=JBPAPP_4_3_0_GA date=200801031548)/Tomcat-5.5

Top few lines of TIFF file:

Content-Type: text/xml; charset=utf-8

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header/><SOAP-ENV:Body><ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org"><ops:meta name="elapsed-time" value="42"/><ops:document-retrieval document-format="SINGLE_PAGE_TIFF" id="EP 1000000A1 I " page-number="1" system="ops.epo.org"><ops:desc>FullDocument</ops:desc><ops:content-ref>EP 1000000A1 I .tiff</ops:content-ref></ops:document-retrieval></ops:world-patent-data></SOAP-ENV:Body></SOAP-ENV:Envelope>
Content-Type: application/tiff
Content-ID: EP 1000000A1 I .tiff
Content-Transfer-Encoding: binary

Thanks for any replies!!
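
The response above is multipart/related, so the file ends up holding every part (the SOAP envelope and the TIFF) together with their per-part headers. One possible approach, sketched below, is to let HTTP::Message split the body with its parts() method and keep only the part whose Content-Type is application/tiff. The helper name extract_part, the boundary string and the fake TIFF bytes are made up for illustration; the stand-in response only mimics the shape of the one shown above so that the sketch runs without the network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the raw body of the first part whose media
# type equals $wanted (e.g. 'application/tiff'), or undef if none matches.
sub extract_part {
    my ($res, $wanted) = @_;
    for my $part ($res->parts) {            # parts() splits multipart/* content
        my $ct = $part->content_type;       # media type only, no parameters
        return $part->content if defined $ct && lc $ct eq lc $wanted;
    }
    return;
}

# Stand-in for the real response (assumption: same layout as the dump above).
my $boundary  = 'demo_boundary';
my $fake_tiff = "II*\0" . "\x01\x02\x03";   # not a real image, just marker bytes
my $body = join "\r\n",
    "--$boundary",
    'Content-Type: text/xml; charset=utf-8',
    '',
    '<SOAP-ENV:Envelope/>',
    "--$boundary",
    'Content-Type: application/tiff',
    'Content-Transfer-Encoding: binary',
    '',
    $fake_tiff,
    "--$boundary--",
    '';
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => qq{multipart/related; boundary="$boundary"; type="text/xml"} ],
    $body,
);

my $tiff = extract_part($res, 'application/tiff');
die "no TIFF part found\n" unless defined $tiff;

# Write only the image bytes, in binary mode.
open(my $out, '>', 'file.tiff') or die "cannot write file.tiff: $!";
binmode $out;
print {$out} $tiff;
close $out;
```

With the real request, $res would simply be the object returned by $ua->request($req), and the stand-in construction can be dropped. If the server ever labels the image part differently, the media type passed to extract_part would need to change accordingly.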

You show how to retrieve the content of the HTTP response. Now, how would I parse it if I want to get the value of the <soapenv:Body> tag?
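
One way to read the SOAP body is with a namespace-aware XML parser instead of string matching. The sketch below assumes XML::LibXML is installed and uses the envelope text from the first post as sample input; the prefixes soap and ops registered on the XPath context are arbitrary local names chosen here, and only the namespace URIs have to match the document.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Sample input: the XML part of the response quoted earlier in the thread.
my $xml = '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
        . '<SOAP-ENV:Header/><SOAP-ENV:Body>'
        . '<ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org">'
        . '<ops:meta name="elapsed-time" value="42"/>'
        . '<ops:document-retrieval document-format="SINGLE_PAGE_TIFF" id="EP 1000000A1 I " page-number="1" system="ops.epo.org">'
        . '<ops:desc>FullDocument</ops:desc>'
        . '<ops:content-ref>EP 1000000A1 I .tiff</ops:content-ref>'
        . '</ops:document-retrieval></ops:world-patent-data>'
        . '</SOAP-ENV:Body></SOAP-ENV:Envelope>';

my $doc = XML::LibXML->load_xml(string => $xml);

# Bind local prefixes to the namespace URIs used in the document.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(soap => 'http://schemas.xmlsoap.org/soap/envelope/');
$xpc->registerNs(ops  => 'http://ops.epo.org');

# The Body element itself, and one value nested inside it.
my ($body)      = $xpc->findnodes('/soap:Envelope/soap:Body');
my $content_ref = $xpc->findvalue('//ops:content-ref');

die "no SOAP Body found\n" unless defined $body;
print "Body XML: ", $body->toString, "\n";
print "content-ref: $content_ref\n";
```

For a multipart response like the one in this thread, the XML string would first have to be taken from the text/xml part rather than from the whole response content.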
