Hi,

I have a few sentences, shown below.

Protein modeling studies reveal that the RG-rich region is part of a three to four strand antiparallel beta-sheet, which in other RNA binding protein functions as a platform for nucleic acid interactions.

PUF proteins comprise a highly conserved family of sequence-specific RNA-binding protein that regulate target mRNAs

I have an interface that accepts a query from the user, and I read that query like this:

$word=param('query');

For example, if the user enters "RNA binding proteins", it should pick up both sentences.

I don't know how to write a regular expression such that this $word should satisfy both the conditions(i.e "RNA-binging protein" and "RNA binding protein") and both these sentences should be picked up!

How should a regular expression be written so that occurrences of these words in the sentences are matched and picked up?

I tried this, but it's not matching!!

if($word=~/.*[\s\-]/)

I can't work out the proper regular expression to match these words and retrieve the sentences.

Please help!!!

With regards
Vandithar

if (/RNA[ -]?binding protein/) {
   # do something
}

? is a quantifier that means zero or one; it's the same as:

if (/RNA[ -]{0,1}binding protein/) {
   # do something
}

Look up "quantifiers" in a regular expression tutorial.
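
To see that the two forms behave the same, both can be run against the sample sentences from the question. This is only a sketch: the helper name check_quantifiers and the shortened sentences are made up for illustration and are not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (hit with "?", hit with "{0,1}") for one sentence.
sub check_quantifiers {
    my ($sentence) = @_;
    my $question_hit = ($sentence =~ /RNA[ -]?binding protein/)     ? 1 : 0;
    my $braces_hit   = ($sentence =~ /RNA[ -]{0,1}binding protein/) ? 1 : 0;
    return ($question_hit, $braces_hit);
}

# Shortened versions of the two sentences from the question.
my @sentences = (
    "which in other RNA binding protein functions as a platform",
    "sequence-specific RNA-binding protein that regulate target mRNAs",
);

for my $sentence (@sentences) {
    my ($question_hit, $braces_hit) = check_quantifiers($sentence);
    printf "%d %d  %s\n", $question_hit, $braces_hit, $sentence;
}
```

Both columns should agree for every input, since "?" is just shorthand for "{0,1}". Note that two spaces between "RNA" and "binding" would not match either form, because the character class is allowed at most once.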

i would personally do something like this:

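$match = "rna binding protein";

# split the query into words on spaces and hyphens
@matchwords = split(/[ -]/,$match);

# rejoin the words so that either a space or a hyphen is accepted between them
$newmatch = join("[ -]",@matchwords);

if ($_ =~ /$newmatch/i ) {

      # blah blah blah

}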

> I don't know how to write a regular expression such that this $word should satisfy both the
> conditions(i.e "RNA-binging protein" and "RNA binding protein") and both these sentences
> should be picked up!
Including spelling mistakes as well?

Case folding and stripping punctuation will only get you so far.
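
For reference, here is a rough sketch of what case folding and punctuation stripping could look like, and where it stops helping. The helper names normalize and sentence_matches are hypothetical, and the approach (plain substring search on normalized text) is an assumption, not something taken from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase the text and collapse every run of
# non-alphanumeric characters (spaces, hyphens, punctuation) into one space.
sub normalize {
    my ($text) = @_;
    my $norm = lc $text;
    $norm =~ s/[^a-z0-9]+/ /g;
    $norm =~ s/^ | $//g;    # trim the single space left at either end
    return $norm;
}

# Returns 1 if the normalized query occurs inside the normalized sentence.
sub sentence_matches {
    my ($sentence, $query) = @_;
    return index(normalize($sentence), normalize($query)) >= 0 ? 1 : 0;
}

my $sentence = "a family of sequence-specific RNA-binding protein that regulate target mRNAs";
print sentence_matches($sentence, "RNA binding protein"), "\n";   # hyphen vs space is handled
print sentence_matches($sentence, "RNA-binging protein"), "\n";   # a misspelling is not
```

Because this is a plain substring test, a query such as "rna binding" would also be found inside "mRNA binding", so word boundaries would need extra care.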

Hi,

I got this expression, but how do I check it using $word?

That I didn't get!!

if ($word =~ /RNA[ -]?binding protein/) {
   # do something
}

Hi,

Here $word is, as an example, "RNA binding proteins". If the user enters something like this, I have to write a regular expression to match it, and that is what I am not able to do.

if ($word =~ /RNA\s*-?\s*binding\s+proteins?/i) {
   # do something
}
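
In patterns like this, \s* (zero or more) is what lets the spaced and the hyphenated forms both match; \s+, even in its lazy form \s+?, still requires at least one whitespace character. A small sketch of the idea, with the helper name has_phrase and the test strings made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: "RNA", then optional whitespace, an optional hyphen,
# optional whitespace again, then "binding protein" or "binding proteins".
sub has_phrase {
    my ($text) = @_;
    return ($text =~ /RNA\s*-?\s*binding\s+proteins?/i) ? 1 : 0;
}

for my $text ("RNA binding protein", "RNA-binding proteins",
              "rna - binding protein", "DNA binding protein") {
    printf "%d  %s\n", has_phrase($text), $text;
}
```

One side effect of \s* on both sides of the optional hyphen is that "RNAbinding protein" matches as well; if that is unwanted, [\s-]+ between the two words is a stricter alternative.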

Hi

I have to match $sentences against $word.

$word might contain phrases like this!!!

How should I match $sentences?

if($sentences=~/$word/) # $word can be "RNA binding protein" or "RNA-binding protein"
{

}

How should I do this?

How should I write the expression so that $sentences matches $word in both cases (RNA binding protein and RNA-binding protein)?


With regards
Vanditha

i already gave you the answer. i'll repost it using your variable names:

# variable "$word" is set by user search request
# variable "$sentences" has text to be searched

$word = "rna binding protein";     # ---OR---
$word = "rna-binding protein";

# split the query into words on spaces and hyphens
@matchwords = split(/[ -]/,$word);

# rejoin the words so that either a space or a hyphen is accepted between them
$newmatch = join("[ -]",@matchwords);

if ($sentences =~ /$newmatch/i ) {

      # do something

}
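
A slightly more defensive variant of the same split/join idea might look like the sketch below. The helper name query_to_regex is hypothetical, and two things are assumptions added here rather than part of the post above: the words are passed through quotemeta so that regex metacharacters typed by the user are taken literally, and a trailing "s" on the last word is treated as optional so that the query "RNA binding proteins" can still find "RNA binding protein".

```perl
use strict;
use warnings;

# Hypothetical helper: turn a free-form query into a case-insensitive regex.
sub query_to_regex {
    my ($query) = @_;
    # Split on runs of whitespace or hyphens and drop empty pieces.
    my @words = grep { length } split /[\s-]+/, $query;
    return unless @words;                  # nothing usable in the query
    $words[-1] =~ s/s$//i;                 # crude: "proteins" -> "protein"
    # Escape each word, then allow spaces or hyphens between the words.
    my $body = join '[\s-]+', map { quotemeta } @words;
    return qr/\b${body}s?\b/i;
}

# Shortened versions of the two sentences from the question.
my @sentences = (
    "which in other RNA binding protein functions as a platform",
    "sequence-specific RNA-binding protein that regulate target mRNAs",
);

my $re   = query_to_regex("RNA binding proteins");
my @hits = grep { $_ =~ $re } @sentences;
print scalar(@hits), " sentence(s) matched\n";
```

The plural handling is deliberately naive (it would also shorten a word like "bus"), and misspellings in the query are still not covered.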

I think what the OP is trying to get at is that the input text, AND the query are both free-form user input.

Crafting specific RE's to deal with the example text and "RNA binding protein", whilst interesting, does not solve the problem.

Because if the next search is "DNA hybrid" or something, then it's back to the start again.

Did the OP read the FAQ link I posted?
That would seem to have more prospect.

Personally, I think the OP has been given more than enough information to figure this out; all they have to do is try some of the suggestions and mess around with some code to fine-tune it to their needs.

> I think what the OP is trying to get at is that the input text, AND the query are both free-form user input.

that's what i gave him, at least the interesting bits. my example "$word" can be any string. i did not spend time on how he gets input into the variable.

> Crafting specific RE's to deal with the example text and "RNA binding protein", whilst interesting, does not solve the problem.

well, i don't find that to be interesting at all.

and it's not what i did.
