Here is the XML I am trying to work with:

<GBSeq_definition>RecName: Full=Solute carrier family 2, facilitated glucose transporter member 1; AltName: Full=Glucose transporter type 1, erythrocyte/brain; Short=GLUT-1; AltName: Full=HepG2 glucose transporter</GBSeq_definition>

I want to extract the alternate names so that the output is:

<aliases>
Glucose transporter type 1, erythrocyte/brain;
GLUT-1;
HepG2 glucose transporter.
</aliases>

I am using XSLT 2.0 with the Saxon processor. I think it can be done using tokenize(), but being a newbie I don't know how to use it. I want to separate the AltName (alternate name) entries from the RecName (recommended name) entry and loop over them to get the desired output.

Any help is greatly appreciated. Thank you for your time and help.

thank you,
sammed


Hi, simple!

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:template match="/">
        <aliases>
            <xsl:for-each select="tokenize(substring-after(GBSeq_definition,'AltName:'),';')">
                <xsl:for-each select="tokenize(.,'=')">
                    <xsl:if test="position() > 1">
                        <xsl:value-of select="normalize-space(.)"/>
                        <xsl:text>;                
                </xsl:text>
                    </xsl:if>
                </xsl:for-each>
            </xsl:for-each>
        </aliases>
    </xsl:template>
</xsl:stylesheet>

Cheers, John Bampton.

Hey John, thank you, thanks a lot.
I really needed this solved.
By the way, just out of curiosity, and if you don't mind, could you please tell me why you used position() > 1 and normalize-space(.)?
I want to learn this, and the books don't seem to be of much help.
Thank you for your time and help.

thank you,
sammed
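
For anyone with the same question, here is a minimal sketch of what the two calls do. It is only an illustration, not John's own explanation, and the input string 'Short=  GLUT-1 ' is made up, with extra spaces added on purpose. tokenize(., '=') splits a field into its label (item 1) and its value (item 2), so position() > 1 skips the label. normalize-space() trims leading and trailing whitespace and collapses internal runs of whitespace into single spaces.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- Hypothetical field with stray spaces around the value -->
        <xsl:for-each select="tokenize('Short=  GLUT-1 ', '=')">
            <!-- Print: position, raw token, token after normalize-space() -->
            <xsl:value-of select="concat(position(), ': [', ., '] -> [', normalize-space(.), ']&#10;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

Run against any XML input with an XSLT 2.0 processor, this should print two lines: the first shows the label, [Short], at position 1, and the second shows the value at position 2, once raw as [  GLUT-1 ] and once trimmed as [GLUT-1].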

Hey John, another quick question.
For the same XML:

<GBSeq_definition>RecName: Full=Solute carrier family 2, facilitated glucose transporter member 1; AltName: Full=Glucose transporter type 1, erythrocyte/brain; Short=GLUT-1; AltName: Full=HepG2 glucose transporter</GBSeq_definition>

how can I get the following output?

<recommended_name>Solute carrier family 2, facilitated glucose transporter member 1 ;</recommended_name>
<aliases>
Glucose transporter type 1, erythrocyte/brain ;
GLUT-1 ;
HepG2 glucose transporter ;
</aliases>

I tried your XSLT code in a somewhat different way

<xsl:for-each select="tokenize(substring-after(GBSeq_definition,'RecName:'),';')">
                    <xsl:for-each select="tokenize(.,'=')">
                        <xsl:if test="position() >  1">
                            <xsl:value-of select="normalize-space(.)">
                             </xsl:value-of>
                            <xsl:text> ;
                            </xsl:text>
                        </xsl:if> 
                    </xsl:for-each>
                </xsl:for-each>

to get the output

<recommended_name>Solute carrier family 2, facilitated glucose transporter member 1 ;
                            Glucose transporter type 1, erythrocyte/brain ;
                            GLUT-1 ;
                            HepG2 glucose transporter ;
                            </recommended_name>

but I was trying to loop over this output again to separate the recommended name and the alternate names into different fields.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:template match="/">
        <recommended_name>
            <xsl:for-each select="tokenize(substring-before(GBSeq_definition,'AltName:'),'=')">
                <xsl:if test="position() > 1">
                    <xsl:value-of select="."/>
                </xsl:if>
            </xsl:for-each>
        </recommended_name>
        <aliases>
            <xsl:for-each select="tokenize(substring-after(GBSeq_definition,'AltName:'),';')">
                <xsl:for-each select="tokenize(.,'=')">
                    <xsl:if test="position() > 1">
                        <xsl:value-of select="normalize-space(.)"/>
                        <xsl:text>;                
                </xsl:text>
                    </xsl:if>
                </xsl:for-each>
            </xsl:for-each>
        </aliases>
    </xsl:template>
</xsl:stylesheet>

Cheers, John Bampton.

Hey John, thank you for your reply.
This really works, but I was trying to tokenize everything and then put the pieces in separately, because I want to generalize this code. Some of the files look like the XML above, but others have only a recommended name and no alternate name, so I cannot use substring-before with 'AltName:'. Here is an example of such XML:

<GBSeq_definition>RecName: Full=Glutathione S-transferase omega-2; Short=GSTO-2</GBSeq_definition>

In this case I want to extract the Full value as the recommended name and the Short value as the alternate name (aliases):

<recommended>
Glutathione S-transferase omega-2 </recommended>
<aliases>
GSTO-2 </aliases>

I am trying to generalize the code so that it works for both files.
In the case mentioned here I can extract the recommended name, but I cannot extract the short name as an alias, so I cannot generalize this for both XML inputs (the one here and the one above).

Can you please help me?
Thank you for your time and help.
Any help is greatly appreciated.

thank you,
sammed

Hey John or seniors, can you please help me generalize the above code?

Thank you,
Sammed

Hey John, thank you for your help, I greatly appreciate it. I got the solution to my problem.

Thank you,
Sammed
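
The thread ends without the final solution being posted. One possible way to generalize the stylesheet is sketched below; it is not necessarily what Sammed ended up using. It treats the first label=value field as the recommended name and every later field as an alias, whether or not an AltName: marker is present. The assumptions are that GBSeq_definition is the root element (as in the stylesheets above), that no name contains a ';' of its own, and that the first field is always the RecName Full value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:template match="/">
        <!-- Split on ';' and keep only fields of the form label=value.
             Assumption: no name contains a ';' of its own. -->
        <xsl:variable name="fields"
            select="tokenize(GBSeq_definition, ';')[contains(., '=')]"/>
        <!-- Keep the text after the first '=' of each field, trimmed. -->
        <xsl:variable name="names"
            select="for $f in $fields return normalize-space(substring-after($f, '='))"/>
        <!-- Assumption: the first field is the RecName Full value. -->
        <recommended_name>
            <xsl:value-of select="$names[1]"/>
        </recommended_name>
        <!-- Everything else is treated as an alias, one per line. -->
        <aliases>
            <xsl:value-of select="$names[position() gt 1]" separator=";&#10;"/>
        </aliases>
    </xsl:template>
</xsl:stylesheet>
```

For the first input this should give the Solute carrier name as the recommended name and the three other names as aliases; for the Glutathione input it should give Glutathione S-transferase omega-2 and GSTO-2. Note that any other label=value field in a definition (for example an EC= entry, if one is present) would also end up in the aliases, so a stricter filter on the labels may be needed for real data.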
