
Please help me move the inner tables out of the outer table so that no table is nested inside another, because nested tables are not good data. Thank you.

Raw Data

<item>
	<pre>
	<table width="100%" border="0" cellpadding="2" cellspacing="2" bgcolor="#eeeeee">
	 <tr>
	  <td><p>"Gratitude for the abundance you have received is
	the best insurance that the abundance will continue."<br>
	                              -Muhammad</p></td>
	  </tr>
	  <tr>
	   <td></td>
	  </tr>
	  <tr>
	    <td><p><p>"Gratitude for the abundance you have received is
	the best insurance that the abundance will continue."<br>
	                              -Muhammad</p></td>
	    </tr>
	    <tr>
	    <td></td>
	    </tr>
	    <tr>
	    <td><pre><table cellspacing="5" cellpadding="0" width="100" align="left">
	        <tr>
	         <td>
	         <pre><table cellspacing="0" cellpadding="5" width="100" bgcolor="#eeeeee">
	          <tr>
	           <td class="art"></td>
	          </tr>
	          <tr>
	            <td class="art">Photo courtesy of Bellentani</td>
	          </tr>
	        </table></pre>           
	        </td>
	        </tr>
	     </table></pre></td> </tr>
	</table></pre>
	</item>

The code I use:

while(<INFILE>){
	 s/<\/?pre>//gm;
	 s/(<table)\s+[^\000]*?>/$1>/mgi;
	 
	 s/<(\/?table)/<$1_tmp/g;
	  while(/<table_tmp[^>]*?>[^\000]*?<\/table_tmp>/){
	    $text=$&;
	    $text=~s/<(\/?table)_tmp[^>]*?>/<\$1>/g;
	    $para="";
	    while($text=~s/(<table>[^\000]*?<\/table>)//){
	      $para="$para\n$1";
	    }
	    $para=~s/<table>([^\000]*?)<\/table>/<table>$1<\/table>/mgi;
	    s/(<table_tmp[^>]*?>[^\000]*?<\/table_tmp>)/$text\n$para/;
	  }
	  s/<(\/?table)[^>]*?>/<$1>/g;
}

Print OUTFILE $_;

The output should be:

<item>
	 
	<table>
	 <tr>
	  <td><p>"Gratitude for the abundance you have received is
	the best insurance that the abundance will continue."<br>
	                              -Muhammad</p></td>
	  </tr>
	  <tr>
	   <td></td>
	  </tr>
	  <tr>
	    <td><p><p>"Gratitude for the abundance you have received is
	the best insurance that the abundance will continue."<br>
	                              -Muhammad</p></td>
	    </tr>
	    <tr>
	    <td></td>
	    </tr>
	    <tr>
	    <td>
	     </td>
	     </tr>
	</table>
	 
	<table cellspacing="5" cellpadding="0" width="100" align="left">
	        <tr>
	         <td></td> </tr>
	</table>
	          
	<table>
	    <tr>
	     <td class="art"></td>
	    </tr>
	    <tr>
	     <td class="art">Photo courtesy of Bellentani</td>
	   </tr>
	</table>          
	  
	</item>

Thank you

Last Post by k_manimuthu
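use strict;
use warnings;

# Slurp the whole file so the regex can match across line breaks.
open(my $fin, '<', 'test.xml') or die "Error: $!";
my $file = do { local $/; <$fin> };
close($fin);

my @table = ();
# Repeatedly remove the innermost table (one with no <table> inside it).
# Taking it out of $file turns its parent into the next innermost table.
while ($file =~ s{(<table\b(?:(?!</?table\b).)*</table>)}{}s) {
	unshift(@table, $1);    # later matches are outer tables, so they go first
}

# Show the extracted contents
for my $i (0 .. $#table) {
	print "\n\n---- Table", $i + 1, " -------\n$table[$i]";
}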

Try this code. It extracts each table on its own, so no extracted table contains another table. With a little more work on the result you should be able to get your expected output.
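One way to take the extracted list a step closer to the layout in the question is sketched below. It is only a sketch under stated assumptions: `flatten_tables` and `$sample` are names made up for illustration, the `<pre>` wrappers are dropped the same way the question's code drops them, and the regex approach only copes with properly closed `<table>` tags.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the tables
# found in $html, outermost first, each with its nested table removed.
sub flatten_tables {
    my ($html) = @_;

    # Drop the <pre> wrappers, as the code in the question does.
    $html =~ s{</?pre>}{}gi;

    my @tables;
    # Each pass removes the first table that has no table inside it.
    # Once it is gone, its parent becomes a candidate for the next pass.
    while ($html =~ s{(<table\b(?:(?!</?table\b).)*</table>)}{}si) {
        unshift @tables, $1;
    }
    return @tables;
}

# Made-up sample input: one table nested inside another.
my $sample = '<item><pre><table width="100%"><tr><td>outer'
           . '<pre><table class="in"><tr><td>inner</td></tr></table></pre>'
           . '</td></tr></table></pre></item>';

my @tables = flatten_tables($sample);
print "<item>\n", join("\n", @tables), "\n</item>\n";
```

For the sample string this prints the outer table first, with its nested table removed, followed by the inner table on its own line. Attributes are left untouched here; stripping them, as the question's code does, would be a separate substitution. For anything beyond simple input, an HTML parser module would probably be more robust than regexes.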
