Hi,

I have 3 arrays.

@arr=("TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing","the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions","For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter");#main array

@arr1=("TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing","the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions");

@arr2=("TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing","the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions");

I want to compare @arr with @arr1 and, wherever an element matches, replace it with the contents of @arr2. Here is the code:

foreach (@arr)
{
        foreach $sent(@arr1)
        {
                if($_=~/$sent/i)
                {
                        print "<br> matched <br>";
                        foreach $sent2(@arr2)
                        {
                                $_=~s/$_/$sent2/ig;
                               #i am substituting arr2 contents.
                        }
                }
        }
        print "<br> *** $_ <br>";
}

The output I got was:

the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions
the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions
For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter

The problem is that both sentences are matched, but each one ends up replaced by the last replacement sentence. The first replacement sentence is never printed and the last one is printed twice: instead of "TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization", I get "the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions".

The output should be like this:

TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization
the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions
For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter

Any suggestions for solving this problem?

How can I substitute the sentence?

With regards
Vanditha

I believe your algorithm is the source of your problem. You are running 3 nested loops, when I think your intent is to iterate over at least some of the arrays simultaneously.

Your algorithm (as written) does the following:
For each row of the first array, compare it with each of the rows from the second array. (Note that this is not comparing rows at the same position, but all rows.) Then, if ANY row from the second array matched, replace the contents iteratively with all of the data from the third array (ending naturally with the last entry in the third array).

I think the piece you are missing is that you don't want to iterate through all of the rows in the third array; you just want the row in the third array that is in the same position as the row you matched from the second array.
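
To make the behaviour described above concrete, here is a small sketch that reproduces it with short placeholder strings instead of the long sentences. The data, the sub name nested_replace, and the \Q...\E quoting (which keeps any regex metacharacters in the strings literal) are my own assumptions for illustration, not part of the posted code.

```perl
use strict;
use warnings;

# Hypothetical placeholder data standing in for the long sentences.
my @rows    = ('alpha', 'beta', 'gamma');   # array being rewritten
my @needles = ('alpha', 'beta');            # rows to look for
my @repl    = ('one', 'two');               # intended replacements

# Same shape as the posted loops: a foreach over the replacements
# sits inside the match branch.
sub nested_replace {
    my @out = @rows;                        # work on a copy
    foreach my $row (@out) {                # $row aliases each element of @out
        foreach my $needle (@needles) {
            if ($row =~ /\Q$needle\E/i) {
                foreach my $new (@repl) {
                    # Each pass overwrites the previous result, so only
                    # the last element of @repl survives.
                    $row =~ s/\Q$row\E/$new/ig;
                }
            }
        }
    }
    return @out;
}

print join(', ', nested_replace()), "\n";   # prints "two, two, gamma"
```

Both matched rows end up as the last replacement, which is the same pattern as the output in the question.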

If you want more help, a 'bigger picture' of what you're trying to accomplish might be useful.

Hi,

What you said is correct!

I am trying to substitute the matched contents (with the row in the third array that is in the same position as the row matched from the second array) and finally print the contents.

I will explain the scenario with another example.

I have 3 arrays:

@arr1=("Furthermore, apigenin treatment increased the level of association of the RNA binding protein HuR with endogenous p53 mRNA","one of the mechanisms by which apigenin induces p53 protein expression is enhancement of translation through the RNA binding protein HuR","Here we further demonstrated that the increase in p53 protein level induced by apigenin treatment of 308 keratinoyctes");

@arr2=("Furthermore, apigenin treatment increased the level of association of the RNA binding protein HuR with endogenous p53 mRNA","one of the mechanisms by which apigenin induces p53 protein expression is enhancement of translation through the RNA binding protein HuR");

@arr3=("Moreover, apigenin treatment of cells induced p16 protein expression, which in turn was correlated with cytoplasmic localization of HuR induced by apigenin","Enhancement of p53 expression in keratinocytes by the bioflavonoid apigenin");

First I want to compare @arr1 and @arr2; where an element matches, I want to replace it with the @arr3 contents.

Here is the same piece of code for matching:

foreach $str1(@arr1)
{
        foreach $str2(@arr2)
        {
                if($str1=~/$str2/)
                {
                        print "<br> matched <br>";
                        foreach $str3(@arr3)
                        {
                                $str1=~s/$str1/$str3/ig;
                        }
                }
        }
        print "<br> *** $str1 <br>";
}

But the output I am getting is:

Enhancement of p53 expression in keratinocytes by the bioflavonoid apigenin

Moreover, apigenin treatment of cells induced p16 protein expression, which in turn was correlated with cytoplasmic localization of HuR induced by apigenin

Enhancement of p53 expression in keratinocytes by the bioflavonoid apigenin

The desired output should be like this:

Moreover, apigenin treatment of cells induced p16 protein expression, which in turn was correlated with cytoplasmic localization of HuR induced by apigenin

Enhancement of p53 expression in keratinocytes by the bioflavonoid apigenin 

Here we further demonstrated that the increase in p53 protein level induced by apigenin treatment of 308 keratinoyctes

Basically, how can I change the substitution statement to achieve the desired output?

Any ideas?

with regards
Vandhita

I tried to say it before, but you didn't appear to understand.

You CANNOT foreach over @arr3 inside the if match:

foreach $str1(@arr1)
{
   for (my $ii=0; $ii< scalar(@arr2); $ii++)
   {
      $str2 = $arr2[$ii];
      if($str1=~/$str2/)
      {
         print "<br> matched <br>";
         # at this point, we just replace $str1
         # with the string in @arr3 at the same index
         # as the string we just matched from @arr2
         $str1 = $arr3[$ii];
         # note that this actually updates the data in @arr1
      }
   }
   print "<br> *** $str1 <br>";
}

Note that we are still iterating over ALL of the lines in arr2 for EACH line in arr1. This would allow us to match the strings from arr2 more than once. I don't know if this might be an issue, either from a correctness or a performance standpoint.
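
One way the repeated scan could affect correctness: if a replacement string itself matches a later entry in the second array, the row gets replaced a second time. The sketch below shows that case with made-up data and an optional early exit via last. The sub name replace_by_index, its $stop_early flag, and the \Q...\E quoting are assumptions of mine for illustration only.

```perl
use strict;
use warnings;

# Replace each row with the entry of @$repl at the index of a matching
# entry of @$needles. With $stop_early set, scanning stops at the first
# match, so the replacement text is never re-tested.
sub replace_by_index {
    my ($rows, $needles, $repl, $stop_early) = @_;
    my @out = @$rows;                       # work on a copy
    foreach my $row (@out) {
        for my $i (0 .. $#$needles) {
            next unless $row =~ /\Q$needles->[$i]\E/;
            $row = $repl->[$i];
            last if $stop_early;
        }
    }
    return @out;
}

# Hypothetical data where the first replacement matches the second needle.
my @demo_rows    = ('cat');
my @demo_needles = ('cat', 'dog');
my @demo_repl    = ('dog', 'bird');

print join(', ', replace_by_index(\@demo_rows, \@demo_needles, \@demo_repl, 0)), "\n";  # prints "bird"
print join(', ', replace_by_index(\@demo_rows, \@demo_needles, \@demo_repl, 1)), "\n";  # prints "dog"
```

Whether the early exit is what you want depends on your data; with the sentences posted in this thread it should make no difference.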

Note also that the test condition $str1=~/$str2/ allows for partial matches. If the string in $str2 is found anywhere in $str1, the entire $str1 is replaced with the corresponding entry from @arr3. Partial matches were also allowed in your original code.
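
If partial matches are not wanted, one possible alternative is a whole-string lookup through a hash that maps each @arr2 entry to the @arr3 entry at the same index. This is only a sketch with made-up sentences; it assumes @arr2 and @arr3 have the same length, and the names %replacement and @result are mine. As a side effect it avoids the inner loop and any regex metacharacter issues, since no pattern match is involved.

```perl
use strict;
use warnings;

# Hypothetical short data; the third row only partially matches an @arr2 entry.
my @arr1 = ('first sentence', 'second sentence', 'third sentence, longer');
my @arr2 = ('first sentence', 'third sentence');
my @arr3 = ('FIRST replacement', 'THIRD replacement');

# Hash slice: each @arr2 string becomes a key whose value is the
# @arr3 string at the same index (assumes equal lengths).
my %replacement;
@replacement{@arr2} = @arr3;

# A row is replaced only if it is identical to an @arr2 entry;
# every other row is passed through unchanged.
my @result = map { exists $replacement{$_} ? $replacement{$_} : $_ } @arr1;

print "<br> *** $_ <br>\n" for @result;
```

Here the first row is replaced, while the third row is left alone because it merely contains an @arr2 string. With the regex test it would have been replaced as well.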

Hi,

Thanks for the reply!!!!

with regards
Vandhita
