
Help needed with a minute problem

Hello experts,
I am trying to extract data from an XML file using XSLT. I want to write a general stylesheet that can handle similar XML files that differ slightly from each other.

The XML I am working with can contain the [FUNCTION] field (the one I am trying to extract) in any of the following four scenarios: it may be at the start, in the middle, or at the end of the comment. I cannot simply tokenize on the delimiter ';', because a ';' sometimes appears inside the [FUNCTION] text itself, as it does here:

<GBSeq_comment>On or before Feb 16, 2007 this sequence version replaced gi:121945493, gi:121751.; <strong>[FUNCTION] Facilitative glucose transporter. This isoform may be responsible for constitutive or basal glucose uptake. Has a very broad substrate specificity; can transport a wide range of aldoses including both pentoses and hexoses.</strong>; [SUBCELLULAR LOCATION] Cell membrane; Multi-pass </GBSeq_comment>

or

<GBSeq_comment>On or before Feb 16, 2007 this sequence version replaced gi:121945493, gi:121751.; <strong>[FUNCTION] Facilitative glucose transporter. This isoform may be responsible for constitutive or basal glucose uptake. Has a very broad substrate specificity; can transport a wide range of aldoses including both pentoses and hexoses.</strong></GBSeq_comment>

or

<GBSeq_comment><strong>[FUNCTION] Facilitative glucose transporter. This isoform may be responsible for constitutive or basal glucose uptake. Has a very broad substrate specificity; can transport a wide range of aldoses including both pentoses and hexoses.</strong></GBSeq_comment>

or

<GBSeq_comment><strong>[FUNCTION] Facilitative glucose transporter. This isoform may be responsible for constitutive or basal glucose uptake. Has a very broad substrate specificity; can transport a wide range of aldoses including both pentoses and hexoses.</strong>; [SUBCELLULAR LOCATION] Cell membrane; Multi-pass </GBSeq_comment>


I want to write code that works for all four of these. The following XSLT (thanks to xml_looser) works for scenarios 1 and 3, but not for 2 and 4.

The code is:

<xsl:for-each select="GBSeq_comment">
    <field name="protein_function">
        <xsl:choose>
            <xsl:when test="contains(.,'[FUNCTION]') and contains(.,'; [')">
                <xsl:value-of select="substring-before(substring-after(.,'; [FUNCTION] '),'; [')"/>
            </xsl:when>
            <xsl:when test="contains(.,'[FUNCTION] ')">
                <xsl:value-of select="substring-after(.,'[FUNCTION] ')"/>
            </xsl:when>
        </xsl:choose>
    </field>
</xsl:for-each>


Could any of you please help me? I greatly appreciate your help and your time.
Thank you,
Sammed

smandape
 
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:template match="/">
		<xsl:apply-templates select="root"/>
	</xsl:template>
	<xsl:template match="root">
		<xsl:apply-templates select="GBSeq_comment"/>
	</xsl:template>
	<xsl:template match="GBSeq_comment">
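		<xsl:choose>
			<!-- A [SUB...] field follows and the text before it ends with a period; that trailing period is dropped -->
			<xsl:when test="contains(.,'[FUNCTION] ') and contains(.,'.; [SUB')">
				<xsl:value-of select="substring-before(substring-after(.,'[FUNCTION] '),'.; [SUB')"/>
			</xsl:when>
			<!-- Fallback: [FUNCTION] follows another field and the text before '; [SUB' has no trailing period -->
			<xsl:when test="contains(.,'; [FUNCTION] ') and contains(.,'; [SUB')">
				<xsl:value-of select="substring-before(substring-after(.,'; [FUNCTION] '),'; [SUB')"/>
			</xsl:when>
			<!-- No following [SUB...] field was found: take everything after [FUNCTION] -->
			<xsl:when test="contains(.,'[FUNCTION] ')">
				<xsl:value-of select="substring-after(.,'[FUNCTION] ')"/>
			</xsl:when>
		</xsl:choose>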
	</xsl:template>
</xsl:stylesheet>
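
The stylesheet above keys on the literal '[SUB' marker, so it depends on the next field being [SUBCELLULAR LOCATION]. A possible, more general variant is sketched below. It is only a sketch under two assumptions that the thread does not confirm: every field that can follow [FUNCTION] starts with '; [', and the [FUNCTION] text itself never contains '; ['. The idea is to take everything after '[FUNCTION] ', append '; [' as a sentinel, and then cut at the first '; ['. The `tail` variable name, the `//GBSeq_comment` selection, and the `normalize-space` call are illustrative choices, not part of the original code.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="xml" indent="yes"/>

	<!-- Assumption: GBSeq_comment may sit anywhere in the input, so select it by name -->
	<xsl:template match="/">
		<xsl:apply-templates select="//GBSeq_comment"/>
	</xsl:template>

	<xsl:template match="GBSeq_comment">
		<!-- Comments without a [FUNCTION] field produce no output -->
		<xsl:if test="contains(., '[FUNCTION] ')">
			<!-- Everything after the [FUNCTION] tag, wherever the tag sits in the comment -->
			<xsl:variable name="tail" select="substring-after(., '[FUNCTION] ')"/>
			<field name="protein_function">
				<!-- The appended '; [' sentinel guarantees a match even when
				     [FUNCTION] is the last field; otherwise the cut happens
				     at the start of the next field -->
				<xsl:value-of select="normalize-space(substring-before(concat($tail, '; ['), '; ['))"/>
			</field>
		</xsl:if>
	</xsl:template>
</xsl:stylesheet>
```

With the four sample comments from the question, the ';' inside "specificity; can transport" is left alone because it is not followed by ' [', and the trailing period of the function text is kept in every case.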
xml_looser
 

This question has already been solved
