Hi,

I have sentences like this:

Example 1:

$string="control";

@arr = (
    "Our data suggests that shape and growth control are unequally affected",
    "The growth of tissue is control",
    "The observed regulation is due to elevated levels in the blood",
    "The sugar levels are in control and there is change in control of elevated levels in blood",
);

I want to display the sentence with the most matches first. For example, in the 4th sentence, control ($string) occurs twice. I want to display that sentence first, followed by the sentences with fewer occurrences.

Output should be like this:

The sugar levels are in control and there is change in control of elevated levels in blood
The growth of tissue is control
Our data suggests that shape and growth control are unequally affected
The observed regulation is due to elevated levels in the blood

How can I display the sentence with the most occurrences of the word first, followed by those with fewer?

To match a sentence, I tried this:

foreach $arr (@arr)
{
    if ($arr =~ /\b$string\b/i)
    {
        print "<br>matched<br>";
    }
}

Similarly, if I have a string like this:
Example 2:

$string="shape,control,growth";

How can I get the sentence with the most matches in both cases, i.e. Example 1 and Example 2?

Any suggestions?


use strict;
use warnings;
use 5.010;


my @sentences = <DATA>;  #reads everything after __DATA__ below
my $str = 'control';
my @results = reverse sort by_str_matches @sentences;

sub by_str_matches {
    my @a_matches = $a =~ /$str/g;
    my @b_matches = $b =~ /$str/g;
    
    @a_matches <=> @b_matches;
}

print for (@results);

__DATA__
__DATA__
The growth of tissue is control.
The sugar levels are in control and there is change in control of elevated levels in blood.
The observed regulation is due to elevated levels in the blood.
Our data suggests that shape and growth control are unequally affected.

--output:--
The sugar levels are in control and there is change in control of elevated levels in blood.
The growth of tissue is control.
Our data suggests that shape and growth control are unequally affected.
The observed regulation is due to elevated levels in the blood.

Equivalently, you can dispense with calling reverse(), and instead switch the order of the comparison in the sort function. However, perl is smart enough to recognize "reverse sort", and perl doesn't sort first and then reverse the array--it just sorts in reverse order.
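A minimal sketch of that alternative, assuming the same search word as above. The sub names count_matches and by_matches_desc are made up for this illustration, and the \b ... \b and /i parts are an assumption borrowed from the question's own regex so that only whole words are counted, regardless of case:

```perl
use strict;
use warnings;

my $str = 'control';   # same search word as in the example above

# Count whole-word, case-insensitive occurrences of $str in one sentence.
sub count_matches {
    my ($sentence) = @_;
    my $count = () = $sentence =~ /\b\Q$str\E\b/gi;
    return $count;
}

# Descending order: $b is on the left of <=>, so no reverse() is needed.
sub by_matches_desc {
    return count_matches($b) <=> count_matches($a);
}

my @sentences = (
    'The growth of tissue is control.',
    'The sugar levels are in control and there is change in control of elevated levels in blood.',
    'The observed regulation is due to elevated levels in the blood.',
);

my @results = sort by_matches_desc @sentences;
print "$_\n" for @results;
```

With these three sentences the counts are 1, 2 and 0, so the "sugar levels" sentence comes first and the "observed regulation" sentence last.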

perl creates the global variables $a and $b for you. When trying to determine the order of the array, perl assigns pairs of elements to $a and $b, and then perl calls:

by_str_matches()

The sub is called without arguments; it reads $a and $b directly. Its return value (negative, zero, or positive) tells perl whether the first element should be considered smaller than, equal to, or larger than the second.

I'll leave the second question for you to complete.
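For anyone reading along, here is one possible sketch for Example 2, not necessarily what the poster had in mind. It assumes the comma-separated string should be split into separate words and that a sentence's score is the total number of whole-word, case-insensitive matches of any of those words. The names count_matches, $alternation and $pattern are made up for this illustration:

```perl
use strict;
use warnings;

my $string = 'shape,control,growth';   # Example 2 from the question
my @words  = split /,/, $string;

# One alternation that matches any of the words as a whole word.
my $alternation = join '|', map { quotemeta($_) } @words;
my $pattern     = qr/\b(?:$alternation)\b/i;

# Total number of matches of any search word in one sentence.
sub count_matches {
    my ($sentence) = @_;
    my $count = () = $sentence =~ /$pattern/g;
    return $count;
}

my @sentences = (
    'Our data suggests that shape and growth control are unequally affected',
    'The growth of tissue is control',
    'The observed regulation is due to elevated levels in the blood',
    'The sugar levels are in control and there is change in control of elevated levels in blood',
);

# Highest count first.
my @results = sort { count_matches($b) <=> count_matches($a) } @sentences;
print "$_\n" for @results;
```

Under these assumptions the first sentence scores 3, the second and fourth score 2 each, and the third scores 0. The relative order of the two tied sentences is not something this sketch controls. With a single word in $string, the same code covers Example 1.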

use strict;
use warnings;
use 5.010;


my @sentences = <DATA>;  #reads everything after __DATA__ below
my $str = 'control';
my @results = reverse sort by_str_matches @sentences;

sub by_str_matches {
    my @a_matches = $a =~ /$str/g;
    my @b_matches = $b =~ /$str/g;
    
    @a_matches <=> @b_matches;
}

print for (@results);

__DATA__
__DATA__

Note the error. There should be only one line with __DATA__.

Hi,

Thanks for the reply!!

I didn't understand why you are using "use 5.010";

Why should I use that?

In the line of code below:

my @sentences = <DATA>;

What does <DATA> mean? You mentioned that it reads the data. How does it read the data? Are you using a file to read the data? Sorry, I didn't get that part.

I am confused about how to give the input.

Can you please explain those two parts?

Regards

Hi,

Thanks for the reply!!

I didn't understand why you are using "use 5.010";

Why should I use that?

I use perl version 5.10.1. It has new features, like say(), which is equivalent to print() except that it adds a newline to the end of the output. I hate using print() because I don't like the extra typing required to add a newline. As a result, I put those three use statements at the top of every program I write. In this case, I didn't use say() or any other new feature from perl 5.10, so 'use 5.010' has no effect on what the program does; it only requires perl 5.10 or later to run it.
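To make the difference concrete, here is a small sketch. Writing to an in-memory file handle ($out and $buffer are names chosen for this illustration) is only there so that the exact output can be inspected; in a normal program you would simply call say() on its own:

```perl
use strict;
use warnings;
use 5.010;   # makes say() available

my $buffer = '';
open(my $out, '<', \'') or die "Cannot open in-memory handle: $!";
close($out);
open($out, '>', \$buffer) or die "Cannot open in-memory handle: $!";

print {$out} "with print\n";   # newline typed by hand
say   {$out} "with say";       # newline added by say()

close($out);
print $buffer;
```

Both lines end up in $buffer terminated by a newline, although only the print() call spells it out.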

In the line of code below:

my @sentences = <DATA>;

What does <DATA> mean? You mentioned that it reads the data. How does it read the data? Are you using a file to read the data? Sorry, I didn't get that part.

perl creates various file handles for you automatically, for instance STDIN. You don't have to open() STDIN to read a user's input. Likewise, perl opens a file handle called DATA for you. Reading from DATA gives you all the lines that follow the __DATA__ line in your program. Using DATA makes constructing examples easier: you can see all the data that is read in because it is part of the program file rather than in a separate file.

To read in data from your file, all you have to do is open() the file containing your data, and substitute the file handle name you choose in place of DATA:

open(my $MYDATA, '<', 'somefile.txt') or die "Cannot open somefile.txt: $!";  # three-argument open; stop if the file cannot be opened
my @sentences = <$MYDATA>;
close($MYDATA);

Also, you would delete the line with __DATA__ on it and all the lines after it.
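Put together, reading the sentences from a file might look like the sketch below. The sub name read_sentences is made up for this illustration, and the temporary file created with File::Temp only stands in for your real data file so that the example runs on its own. The chomp is an optional extra that strips the trailing newline from each line:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read every line of a file into a list, without trailing newlines.
sub read_sentences {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @sentences = <$fh>;
    close($fh);
    chomp @sentences;
    return @sentences;
}

# Create a small temporary file so the sketch is self-contained.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "The growth of tissue is control.\n";
print {$tmp_fh} "The observed regulation is due to elevated levels in the blood.\n";
close($tmp_fh);

my @sentences = read_sentences($tmp_path);
print scalar(@sentences), " sentences read\n";
```

In your own program you would pass the path of your data file to read_sentences() instead of $tmp_path and then sort the returned list as before.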

Thanks to you all for sharing this useful information with us. It's a really great post. Thanks for this.
