Have I got a problem. I am quite new to xslt, and I have the following situation. I am presented with an xml file. I only want to extract the text in four of the tags present in the document. My first approach was to create a template for each of these tags, say the tags are <a>, <b>, <c>, and <d>. The templates are probably identical to the default templates except for one of them, so I probably didn't have to create the others. At any rate, my xslt code now looks something like

<xsl:apply-templates select="//a" />
<xsl:apply-templates select="//b" />
<xsl:apply-templates select="//c" />
<xsl:apply-templates select="//d" />

This produces all the material that I want, of course, but unfortunately it changes the order. I would like the order of the text in the output to be the same as the order of the text in the input. It is just that I only want to select the text in some of the tags present in the document.

How do I do this?

I'll be happy to help you with what you're trying to achieve. However, it's very difficult to understand what you want to do from a description alone. The best way to show me is to provide a short sample input XML document of what your source data looks like. Then provide a short OUTPUT document showing what that input should look like after it goes through your transformation. This way I can see exactly what you're trying to accomplish.

Do that and I'll see what I can do.

I am not actually allowed to show you what I am working on. However, I can make something up that illustrates the same situation. My XML file then looks something like:

<?xml version="1.0" encoding="iso-8859-1"?>
<docElt>
<q>The cat went upstairs leaving white fur all over the wood.</q>
<author>
<Name>George Morley</Name>
<Weight>253 pounds</Weight>
</Author>
<q2>She meowed loudly and lewdly, pacing the landing in anticipation of being given bacon and cheese.</q2>
<TimeOfDay>21445564</TimeOfDay>
<q>Out came a Labrador Retriever who had an amazing ability.
<q2>She could talk to cats.</q2>
</docElt>

I want the final product to look, after being translated into HTML and displayed by the browser, something like:

The cat went upstairs leaving white fur all over the wood.

She meowed loudly and lewdly, pacing the landing in anticipation of being given bacon and cheese.

Out came a Labrador Retriever who had an amazing ability.

She could talk to cats.

The technicalities of turning this into HTML are irrelevant to me. I want to be able to keep the order of the text that I am extracting exactly as it is. I want the text inside any tags other than <q> and <q2> to not display at all.

I have various ideas for this, all of which are probably wrong, but hopefully you know. If there is some way of disabling the built-in templates, then what I am doing will work. If it is possible to select //q or //q2, that would work, and I would love to know how to do it. I can presumably make templates for the tags that I want to not show up that are empty and so cause nothing to be output when they are applied, but this is really ugly, and I am dealing with an RSS feed, so this could change without notice. I have tried altering template priorities and various other things, but so far nothing has worked for me. I thank you for any help you have to offer.

This is simple. XSLT processors always process the selected nodes in document order, unless you ask for a different order (for example with xsl:sort, or by issuing separate apply-templates calls one after another, as you did). So the idea is to select a single node set containing all the "q" and all the "q2" nodes. To do this, you select the union of the "q" and "q2" nodes and write one template that matches either of them. Then you call apply-templates on that ENTIRE unioned set. The processor treats these nodes as one node set and processes them in document order.

Your input document was not well-formed XML (a closing tag was missing and one pair of tag names did not match in case), but here is the corrected input document that I used.

<docElt>
	<q>The cat went upstairs leaving white fur all over the wood.</q>
	<Author>
		<Name>George Morley</Name>
		<Weight>253 pounds</Weight>
	</Author>
	<q2>She meowed loudly and lewdly, pacing the landing in anticipation of being given bacon and cheese.</q2>
	<TimeOfDay>21445564</TimeOfDay>
	<q>Out came a Labrador Retriever who had an amazing ability.</q>
	<q2>She could talk to cats.</q2>
</docElt>

Here's the transformation that I've created to get what you want.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<xsl:template match="/">
		<html>
			<body>
				<xsl:apply-templates select="//q | //q2" />
			</body>
		</html>
	</xsl:template>

	<xsl:template match="q | q2" >
		<xsl:value-of select="." />
		<br/>
	</xsl:template>
		
</xsl:stylesheet>

Here's the output HTML Document.
The cat went upstairs leaving white fur all over the wood.
She meowed loudly and lewdly, pacing the landing in anticipation of being given bacon and cheese.
Out came a Labrador Retriever who had an amazing ability.
She could talk to cats.

<html><body>The cat went upstairs leaving white fur all over the wood.<br/>She meowed loudly and lewdly, pacing the landing in anticipation of being given bacon and cheese.<br/>Out came a Labrador Retriever who had an amazing ability.<br/>She could talk to cats.<br/></body></html>

There ya go.


Addition:
If you want to override the built-in templates, this can be easily done as well. Just write a template that matches the same thing and then have that template do nothing. For example:

<xsl:template match="*" />
<xsl:template match="@*" />
<xsl:template match="comment()" />

<xsl:temp
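A minimal sketch of the "suppress everything else" idea, assuming the corrected sample input shown earlier in this thread. Note that an empty match="*" template also stops the built-in walk down the tree, so it only fits when the wanted nodes are selected directly, as in select="//q | //q2". The variant below instead keeps the built-in element rule and overrides only the built-in text rule, so the tree is still walked in document order and only q and q2 produce output. This is one possible arrangement, not the only one.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

	<!-- Start at the document root and let the built-in element rule walk the tree. -->
	<xsl:template match="/">
		<html>
			<body>
				<xsl:apply-templates />
			</body>
		</html>
	</xsl:template>

	<!-- Emit the text of the wanted elements, in the order they are encountered. -->
	<xsl:template match="q | q2">
		<xsl:value-of select="." />
		<br/>
	</xsl:template>

	<!-- Override the built-in text rule so text outside q and q2 is dropped. -->
	<xsl:template match="text()" />

</xsl:stylesheet>
```

Because nothing here names Author, Name, Weight or TimeOfDay, new unwanted elements appearing in the feed would be skipped the same way.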

Thank you very much for your answer. You taught me two useful things. One is that you can modify the built-in templates by putting in a template with a match="*" to do whatever you want. You can also modify the priority, although I don't know if you need it. You can also use OR (with the C construct |) in both select and match. This is extremely useful, and your first example is the cleanest way (by me anyway) to solve my problem.

The "|" is NOT an OR. This is something that is easily confused in XSLT. The "|" symbol when applied to sets of nodes, is a UNION. It takes all nodes from set A and set B and creates a single node set.

Now most of the time it can be thought of as OR because you're creating a union of exclusive nodes, but what if you're creating a union of nodes that are similar? You've got to be careful if you think of it like an OR.

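For example, let's say I have the following two node sets, where the first three nodes in Set B are the very same nodes as the ones in Set A (a union removes duplicate nodes, not nodes that merely have equal values):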

Set A
<Number>1</Number>
<Number>2</Number>
<Number>3</Number>

Set B
<Number>1</Number>
<Number>2</Number>
<Number>3</Number>
<Number>4</Number>


The result of "Set A | Set B" is just:
<Number>1</Number>
<Number>2</Number>
<Number>3</Number>
<Number>4</Number>

It's not:
<Number>1</Number>
<Number>2</Number>
<Number>3</Number>
<Number>1</Number>
<Number>2</Number>
<Number>3</Number>
<Number>4</Number>


Just be careful.
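To make the difference concrete, here is a small sketch that counts nodes, assuming the corrected sample input from earlier in this thread (two q elements and two q2 elements). The stylesheet is only an illustration; it uses nothing beyond the standard XPath 1.0 count() function, the union operator, and the boolean "or" inside a predicate.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:output method="text" />

	<xsl:template match="/">
		<!-- Union of two different node sets: every q and every q2 node, each once. -->
		<xsl:value-of select="count(//q | //q2)" />
		<xsl:text>&#10;</xsl:text>
		<!-- Union of a node set with itself: no node is counted twice. -->
		<xsl:value-of select="count(//q | //q)" />
		<xsl:text>&#10;</xsl:text>
		<!-- Boolean "or" belongs inside a predicate, where it tests one node at a time. -->
		<xsl:value-of select="count(//*[self::q or self::q2])" />
		<xsl:text>&#10;</xsl:text>
	</xsl:template>

</xsl:stylesheet>
```

With that input the three counts should come out as 4, 2 and 4: the union and the "or" predicate select the same four nodes, while the union of //q with itself still contains only the two q nodes.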
