The string is about 3k characters long. It contains a substring, repeated several times, with the following structure:
1. The first part is a FIXED number of characters, the same for all the substrings.
2. The second part is the one to be extracted and stored in an array. It is of variable length.
3. The third part is a FIXED number of characters.

Parts 1 and 3 can contain all types of characters, including special characters and regex metacharacters.
I am attaching some pseudo code just to summarize my problem:

my @substringArray;
my $longString;

my $subStart = "START";
my $subEnd   = "END";

while (How to loop a string ???) {
    push @substringArray, $longString =~ /$subStart(.*?)$subEnd/;
}
say @substringArray;



Something like this:

    my $str = 'TGCTCCGGAGCGGTTGACCGACGAATCATGGTTTCGTCTACATCCCGTCTAGTTTCTAG';

    while ( $str =~ m/(TC)/g ) {
        print $1, $/;
    }

OR something like this (UPDATE):

    my $str = do { local $/; <DATA> };

    for ( split /\n/, $str ) {
        print $_, $/ if ( /It followed/ .. /till Mary/ );    # using range operator
    }

__DATA__
 Mary had a little lamb,
 whose fleece was white as snow.

 And everywhere that Mary went,
 the lamb was sure to go.

 It followed her to school one day
 which was against the rules.

 It made the children laugh and play,
 to see a lamb at school.

 And so the teacher turned it out,
 but still it lingered near,

 And waited patiently about,
 till Mary did appear.

Both approaches worked well.

So, there are several ways to search a string in Perl, no matter how long it is.
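As one more of those ways: since the question says the delimiters are fixed strings that may contain regex metacharacters, a regex-free scan with `index` and `substr` avoids any escaping issues. This is only a minimal sketch; the sample string and the `<<` / `>>` delimiters are made up for illustration.

```perl
use strict;
use warnings;

my $str   = 'xx<<a.b>>yy<<c*d>>zz';    # made-up sample data
my $start = '<<';                      # hypothetical start delimiter
my $end   = '>>';                      # hypothetical end delimiter

my @found;
my $pos = 0;
while ( ( my $from = index( $str, $start, $pos ) ) >= 0 ) {
    $from += length $start;                 # first character after the start marker
    my $to = index( $str, $end, $from );    # next end marker after that point
    last if $to < 0;                        # no closing marker left: stop
    push @found, substr( $str, $from, $to - $from );
    $pos = $to + length $end;               # resume scanning after the end marker
}

print "$_\n" for @found;
```

Note that this version skips past each end marker before searching again, so an end marker is never reused as part of the next start marker.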

Please show a sample string (or the actual string to parse) and the desired outcome.

The reason I did not provide a sample string is that I want the solution to be as general as possible. By that I mean I eventually want to create a function that takes the left and right borders and returns an array with the extraction(s). Your example took the "start" AND selected the start itself! Suppose I provide "TC" as the start and "TT" as the end: using your sample, I should get "CGGAGCGG" as the first element of the output array, and then the search continues for the next match.

I like the idea of looping over the string for as long as the search keeps succeeding. Simple logic, I love it.
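For anyone wondering why that loop terminates: in scalar context, a match with the `/g` modifier resumes from where the previous match ended, and `pos()` reports that offset. A small sketch with a made-up sample string:

```perl
use strict;
use warnings;

my $str = 'TGCTCCGGATCA';    # made-up sample string

my @offsets;
while ( $str =~ m/TC/g ) {
    # pos() is the offset just past the most recent successful match
    push @offsets, pos($str);
}

print "@offsets\n";
```

Once no further match is found, the condition is false, the loop ends, and the search position is reset.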

Suppose I provide "TC" as start and "TT" as the end then using your sample, I get as the first element in the output array "CGGAGCGG" then I continue for the next hunt..

Yes, something like this:

use warnings;
use strict;

my $str = 'TGCTCCGGAGCGGTTGACCGACGAATCATGGTTTCGTCTACATCCCGTCTAGTTTCTAG';

my $start = qr/(TC)/;
my $end   = qr/(TT)/;

my @string_extraction;

while ( $str =~ m/(?<=$start)(?<string_extracted>.*?)(?=$end)/g ) {
    push @string_extraction, $+{string_extracted};
}

{
    local $" = "\n";
    print "@string_extraction";
}

output:

CGGAGCGG
ATGG
GTCTACATCCCGTCTAG
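Since the goal is a reusable function and the real delimiters may contain regex metacharacters, the loop above could be wrapped up roughly as follows. This is only a sketch: `extract_between` is a hypothetical name, and `\Q...\E` is used so that the delimiters are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every substring found between $start and $end.
sub extract_between {
    my ( $str, $start, $end ) = @_;

    # \Q...\E quotes regex metacharacters so both delimiters match literally.
    # The lookahead leaves $end unconsumed, as in the loop above.
    # In list context, /g returns all captures of group 1.
    return $str =~ m/\Q$start\E(.*?)(?=\Q$end\E)/gs;
}

my $str   = 'TGCTCCGGAGCGGTTGACCGACGAATCATGGTTTCGTCTACATCCCGTCTAGTTTCTAG';
my @found = extract_between( $str, 'TC', 'TT' );
print "$_\n" for @found;

# Delimiters made of metacharacters (made-up sample data) are matched literally:
my @special = extract_between( 'a[*]one(+)b[*]two(+)', '[*]', '(+)' );
print "$_\n" for @special;
```

Call it in list context (for example by assigning the result to an array), otherwise the match is evaluated in scalar context and you only get a true/false result.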

Worked like a charm. I really appreciate your support.
