I'm trying to add copyright information to the top of XML files. However, it needs to go after the prologue:

<?xml version="1.0"?>
<!DOCTYPE ...>

My problem is that some XML documents have the <!DOCTYPE...> tag spread out over many lines, and I need to add the copyright information after the whole tag. The following regular expression only matches the first line of the <!DOCTYPE...> tag. Any help would be appreciated. Thanks.

if($XML)
{
    $holdTerminator = $/;
    undef $/;
    $buf = <DAT> or die "Can't read into variable";
    $/ = $holdTerminator;
    if($buf =~ m/(<\?xml version="\d\.\d".*\?>[.\s\n]*(<!DOCTYPE.*>?)?)/i)
    {
        print "XML $1";
        $version=$1;
        $buf =~ s/<\?xml version="\d\.\d".*\?>[.\s\n]*(<!DOCTYPE.*>?)?/$version \n\n $start_comment $copyright $end_comment/i;
        seek(DAT, 0, 0);
        print DAT $buf;
    }
}

Thanks for the reply. I had tried adding /m to the end of the substitution, but that didn't work. It then matches the <?XML...> tag but not the <!DOCTYPE...> tag, so the copyright info gets inserted between the two, as if the <!DOCTYPE...> tag weren't recognized. I also tried adding /s to the end, and that matched the entire document, which really screwed things up. The beginning of the XML file I'm testing looks like this...

<!DOCTYPE web-app  
	PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">

but there are a few other files that have many lines of <!DOCTYPE...>.
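
For what it's worth, the behaviour described above is consistent with how the modifiers work: /m only changes what ^ and $ match, a "." inside a character class such as [.\s\n] is a literal dot, and with /s a greedy ".*" runs to the last ">" in the file. One possible way around this is to use a negated character class, which matches across newlines without /s. The following is only a sketch under some assumptions: the whole file is already in a string, the file starts with the XML declaration, and no ">" or "[" appears inside the quoted identifiers of the DOCTYPE. The helper name insert_after_prologue is hypothetical and not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of $buf with $comment placed on its own
# line after the XML declaration and, when present, the DOCTYPE declaration.
# If the prologue is not found, $buf is returned unchanged.
sub insert_after_prologue {
    my ($buf, $comment) = @_;
    $buf =~ s{
        \A
        (
          \s* <\?xml [^>]*? \?>          # XML declaration
          (?:
            \s* <!DOCTYPE [^>\[]*        # optional DOCTYPE, may span lines
            (?: \[ .*? \] \s* )?         # optional internal subset in brackets
            >
          )?
        )
    }{$1\n$comment}xs;
    return $buf;
}

# Usage with a DOCTYPE split over two lines, as in the file shown above.
my $sample = qq{<?xml version="1.0"?>\n<!DOCTYPE web-app\n\tPUBLIC "x" "y">\n<web-app/>\n};
print insert_after_prologue($sample, "<!-- Copyright notice -->");
```

The /s here only affects the ".*?" used for the bracketed internal subset, and that part is non-greedy, so it should stop at the first "]" followed by ">" rather than swallowing the rest of the document.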
