Hello experts,

I am working on the following XML:

<GBSeq_source-db>UniProtKB: locus NR1I3_HUMAN, accession Q14994; class: standard. extra accessions:Q5VTW5,Q5VTW6 created: Jul 15, 1999. sequence updated: Jun 21, 2004. annotation updated: Feb 8, 2011. xrefs: Z30425.1, CAA83016.1, AL590714.27, CAH72153.1, CAH72154.1, CH471121.2, EAW52608.1, EAW52609.1, BC069626.1, AAH69626.1, A56197, NP_001070948.1, NP_005113.1, 1XV9_B, 1XV9_D, 1XVP_B, 1XVP_D xrefs (non-sequence databases): IPI:IPI00001687, IPI:IPI00418379, UniGene:Hs.349642, PDBsum:1XV9, PDBsum:1XVP, ProteinModelPortal:Q14994, SMR:Q14994, IntAct:Q14994, STRING:Q14994, PhosphoSite:Q14994, Ensembl:ENST00000367979, Ensembl:ENSP00000356958, Ensembl:ENSG00000143257, Ensembl:ENST00000367983, Ensembl:ENSP00000356962, GeneID:9970, KEGG:hsa:9970, UCSC:uc001fzx.1, CTD:9970, GeneCards:GC01M161199, HGNC:7969, MIM:603881, neXtProt:NX_Q14994, GeneTree:ENSGT00570000079102, GeneTree:EPGT00050000002812, HOVERGEN:HBG108655, Reactome:REACT_71, NextBio:37626, ArrayExpress:Q14994, Bgee:Q14994, CleanEx:HS_NR1I3, Genevestigator:Q14994, GermOnline:ENSG00000143257, GO:0005654, GO:0004882, GO:0005515, GO:0043565, GO:0003700, GO:0004887, GO:0003713, GO:0008270, GO:0034339, InterPro:IPR008946, InterPro:IPR000536, InterPro:IPR001723, InterPro:IPR001728, InterPro:IPR001628, InterPro:IPR013088, Gene3D:G3DSA:1.10.565.10, Gene3D:G3DSA:3.30.50.10, Pfam:PF00104, Pfam:PF00105, PRINTS:PR00398, PRINTS:PR00047, PRINTS:PR00546, SMART:SM00430, SMART:SM00399, SUPFAM:SSF48508, PROSITE:PS00031, PROSITE:PS51030</GBSeq_source-db>

I want to extract different parts from it and display them as separate elements, like this:

<field name="InterPro">
      IPR008946
      IPR000536
      IPR001723
      IPR001728
      IPR001628
      IPR013088
</field>
<pfam>
      PF00104
      PF00105
</pfam>

and likewise for many of the other databases.

This is what I have tried so far:

<field name="pfam"><xsl:for-each select="tokenize(substring-after(GBSeq_source-db,'Pfam'),',')">
                <xsl:for-each select="tokenize(.,':')">
                    <xsl:if test="position() >  1">
                        <xsl:value-of select="starts-with(.,'PF')"/> <xsl:text>
                              </xsl:text> </xsl:if> 
                </xsl:for-each> </xsl:for-each> </field>

I know I did it the wrong way, since it only outputs true/false for the presence or absence of those terms, but that is what came to mind. I am new to this. If anyone can give me a clue, I can apply it to the many other fields that are similar to this one.

Any help is greatly appreciated.
Thank you,
Sammed

Here's the first bit. You should be able to do the second bit by copy and paste.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:template match="/">
        <field name="InterPro">
            <xsl:for-each select="tokenize(GBSeq_source-db,'InterPro:')">
                <xsl:choose>
                    <xsl:when test="position() != 1">
                        <xsl:value-of select="substring-before(.,',')"/>
                        <xsl:text>                
                            </xsl:text>
                    </xsl:when>
                </xsl:choose>
            </xsl:for-each>
        </field>
    </xsl:template>
</xsl:stylesheet>

Cheers, John Bampton


Hey, thanks a lot John, I got it.
But what about extracting the last element, PROSITE? Its final entry doesn't end with a ','. I ran into the same problem in other cases where I had to extract the last element using tokenize. Can I generalize this so it works whatever the last element is (it is PROSITE here)?

Can you please help me?

thank you,
sammed


For PROSITE, put the following bit of code before the "when position() != 1" branch:

<xsl:when test="position() = last()">
<xsl:value-of select="." />
</xsl:when>

Alternatively, here is a full stylesheet that checks whether each token still contains a comma:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:template match="/">
        <field name="PROSITE">
            <xsl:for-each select="tokenize(GBSeq_source-db,'PROSITE:')">
                <xsl:choose>                   
                    <xsl:when test="position() != 1">
                        <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                <xsl:value-of select="substring-before(.,',')"/>
                                <xsl:text>                
                            </xsl:text>
                            </xsl:when>
                            <xsl:otherwise>
                                <xsl:value-of select="."/>
                                <xsl:text>                                        
                                    </xsl:text>
                            </xsl:otherwise>
                        </xsl:choose>
                    </xsl:when>
                </xsl:choose>
            </xsl:for-each>
        </field>
    </xsl:template>
</xsl:stylesheet>
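
On the question of generalizing this to any database name and to the last entry: one possible sketch, not taken from the replies above, swaps tokenize for xsl:analyze-string so that each ID is captured up to the next comma, whitespace, or the end of the string. The template name xref-field, the fields wrapper element, and the list of prefixes are made up for illustration. It also assumes the prefix contains no regex metacharacters and that the first GBSeq_source-db element in the document is the one you want.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical helper (not from the replies above): writes one <field>
         holding every ID that follows "prefix:" in $text. -->
    <xsl:template name="xref-field">
        <xsl:param name="text"/>
        <xsl:param name="prefix"/>
        <field name="{$prefix}">
            <!-- The prefix must sit at the start of the string or after a comma
                 or whitespace; the ID runs up to the next comma, whitespace,
                 or the end of the string. Assumes $prefix has no regex
                 metacharacters. -->
            <xsl:analyze-string select="$text"
                regex="(^|[,\s]){$prefix}:([^,\s]+)">
                <xsl:matching-substring>
                    <xsl:value-of select="regex-group(2)"/>
                    <xsl:text>&#10;</xsl:text>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </field>
    </xsl:template>

    <xsl:template match="/">
        <!-- Capture the text first: inside the for-each below the context
             item is a string, so the document is no longer reachable by path. -->
        <xsl:variable name="src" select="string((//GBSeq_source-db)[1])"/>
        <fields>
            <xsl:for-each select="('InterPro', 'Pfam', 'PROSITE')">
                <xsl:call-template name="xref-field">
                    <xsl:with-param name="text" select="$src"/>
                    <xsl:with-param name="prefix" select="."/>
                </xsl:call-template>
            </xsl:for-each>
        </fields>
    </xsl:template>
</xsl:stylesheet>
```

With this approach the final PS51030 entry should need no special case, because the pattern stops at the end of the string as well as at a comma. To pull out another database, add its name to the sequence in the for-each.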