Extracting data from RDF/XML files

Question

smandape 0 Newbie Poster

12 Years Ago

Hello experts, you have been a great help to me with XSLT. Here is another problem: I am trying to extract data from RDF/XML files, but I don't know how, because the elements use prefixes such as dcterms. The namespaces are declared in the XML file, but I don't know how to select the namespaced elements. The XML file looks something like this:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns="http://www.connotea.org/2005/01/schema#"
>
  
  <dcterms:URI rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17477949">
    <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17477949</link>
    <dc:title></dc:title>
    
    <tag>Formicidae</tag>
    <tag>RNA virus</tag>
    <tag>strand rna</tag>
    <tag>RNA viruses</tag>
    <tag>genome characteristics</tag>
    <tag>RNA polymerase</tag>
    <tag>genome structure</tag>
    <tag>Solenopsis invicta</tag>
    <tag>red imported fire ant</tag>
    <tag>pubmed</tag>
    <tag>picornaviridae</tag>
    <tag>Polycistronic</tag>
    <tag>helicase</tag>
    <tag>codons</tag>
    <tag>protease</tag>
    <tag>orf</tag>
    <tag>orientation</tag>
    <tag>cdna synthesis</tag>
    <tag>expressed sequence tag</tag>
    
    <postedBy>semant</postedBy>
    
    <postCount>1</postCount>
    <hash>34d77b6b622570e5a215702ff6d7156e</hash>
    <bookmarkID>830485</bookmarkID>
    <created>2007-05-05T22:58:43Z</created>
    <updated>2007-07-13T23:02:01Z</updated>
    <firstUser>semant</firstUser>
    
        <citation>
          <rdf:Description>
            <citationID>482422</citationID>
            <prism:title>A new positive-strand RNA virus with unique genome characteristics from the red imported fire ant, Solenopsis invicta.</prism:title>
            
            <foaf:maker>
              <foaf:Person>
                <foaf:name>Steven M Valles</foaf:name>
              </foaf:Person>
            </foaf:maker>
            
            <foaf:maker>
              <foaf:Person>
                <foaf:name>Charles A Strong</foaf:name>
              </foaf:Person>
            </foaf:maker>
            
            <foaf:maker>
              <foaf:Person>
                <foaf:name>Yoshifumi Hashimoto</foaf:name>
              </foaf:Person>
            </foaf:maker>
            
            <dc:date>2007-05-01T00:00:00Z</dc:date>
            
            <journalID>449933</journalID>
            <prism:publicationName>Virology</prism:publicationName>
            
            <prism:issn>0042-6822</prism:issn>
            
            <doiResolver rdf:resource="http://dx.doi.org/10.1016/j.virol.2007.03.043"/>
            <dc:identifier>doi:10.1016/j.virol.2007.03.043</dc:identifier>
            
            <pmidResolver rdf:resource="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17477949"/>
            <dc:identifier>PMID: 17477949</dc:identifier>
            
          </rdf:Description>
        </citation>
    
    <rdfs:seeAlso rdf:resource="http://www.connotea.org/data/uri/34d77b6b622570e5a215702ff6d7156e" /> <!-- GET this URI to retrieve further information -->
  </dcterms:URI>

I simply want to extract data from this file so that the output looks something like this:

<uri>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17477949</uri>
<title>A new positive-strand RNA virus with unique genome characteristics from the red imported fire ant, Solenopsis invicta.</title>
<author>Steven M Valles</author>
<author>Charles A Strong</author>
<author>Yoshifumi Hashimoto</author>
<PubmedID>PMID: 17477949</PubmedID>

I have worked with XML files before, but I used to strip out the namespaces. I don't know how to extract data when namespaces are present.
There is much more data than this; I am presenting only a snapshot. I also plan to generalize the code, so the extraction should not be specific to this file.
Any help is greatly appreciated.

Thank you,
Sammed
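
One way to approach the namespace question is to declare the same namespace URIs in the stylesheet and use prefixed names in every XPath step. The following is only a sketch in XSLT 1.0, not a tested solution for the full data set. It assumes the input is well-formed (each & in the URLs written as &amp; and the rdf:RDF root closed). The prefix c for the Connotea default namespace and the wrapper elements records and record are names invented here for illustration; the prefix bound in the stylesheet does not have to match the one in the source file, only the namespace URI has to match.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:c="http://www.connotea.org/2005/01/schema#"
  exclude-result-prefixes="rdf dc dcterms prism foaf c">

  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Assumption: wrap everything in one root element (hypothetical name). -->
  <xsl:template match="/rdf:RDF">
    <records>
      <xsl:apply-templates select="dcterms:URI"/>
    </records>
  </xsl:template>

  <!-- One record per bookmarked URI; rdf:about is a namespaced attribute. -->
  <xsl:template match="dcterms:URI">
    <record>
      <uri><xsl:value-of select="@rdf:about"/></uri>
      <!-- citation has no prefix in the source, so it lives in the
           default (Connotea) namespace, bound here to the prefix c. -->
      <xsl:apply-templates select="c:citation/rdf:Description"/>
    </record>
  </xsl:template>

  <!-- Title, authors and the PMID identifier of a citation. -->
  <xsl:template match="rdf:Description">
    <title><xsl:value-of select="prism:title"/></title>
    <xsl:for-each select="foaf:maker/foaf:Person/foaf:name">
      <author><xsl:value-of select="."/></author>
    </xsl:for-each>
    <!-- Keep only the dc:identifier values that start with "PMID". -->
    <xsl:for-each select="dc:identifier[starts-with(., 'PMID')]">
      <PubmedID><xsl:value-of select="."/></PubmedID>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the templates match on element names rather than on this particular record, the same stylesheet should apply to every dcterms:URI entry in a larger file. Unprefixed elements such as tag or postedBy would be selected the same way, as c:tag and c:postedBy. The uri value is serialized with &amp; in the output, which is what a well-formed XML result requires.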

xml xslt

All 3 Replies

xml_looser 3 Junior Poster

12 Years Ago

I have a problem with your XML.

My parser fails on

&db in

<dcterms:URI rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=17477949"/>
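
The parser error is expected: in XML a literal & starts an entity reference, so &db is read as an incomplete reference to an entity named db. Inside attribute values and element text the ampersand has to be written as &amp;. A corrected version of the affected lines might look like the fragment below (it belongs inside the rdf:RDF root, where the prefixes are declared, and shows only the element and its link child):

```xml
<dcterms:URI rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=17477949">
  <!-- The same escaping applies to element text, not only to attributes. -->
  <link>http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=pubmed&amp;dopt=Abstract&amp;list_uids=17477949</link>
</dcterms:URI>
```

The rdf:resource attribute on pmidResolver contains the same URL and would need the same change. The posted snippet also has no closing rdf:RDF tag, which a parser would report once the ampersands are fixed.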

smandape 0 Newbie Poster · Answer 1 · 2011-06-05T19:52:54+00:00

Hello experts, any idea on this please?

smandape 0 Newbie Poster · Answer 2 · 2011-06-05T23:32:26+00:00

I got the same error at that line too, and I don't know what to do about it. I am still trying to figure it out.
Thank you,
Sammed
