Question about string pattern matching

Reply

Join Date: Nov 2004
Posts: 189
Reputation: George2 is an unknown quantity at this point 
Solved Threads: 0
George2 George2 is offline Offline
Junior Poster

Question about string pattern matching

 
0
  #1
Jan 2nd, 2006
Hello everyone,


I am working on a dictionary application implementation. I have question about a string pattern matching algorithm implementation. In this algorithm, * can be used to match any sequence of characters. The input is string s1 and string s2, the algorithm using s2 to match s1, and the output is the number of possibility matching.

For example, s1 = ABC, s2 = AC*, output is 0;
For example, s1 = ABC, s2 = A*C, output is 1;
For example, s1 = AB, s2 = ***, output is 6 (AB,NULL,NULL
NULL,AB.NULL
NULL,NULL,AB
NULL,A,B
A,NULL,B
A, B, NULL);

Does anyone know what is the most efficient implementation? Currently, I only have a brute-force implementation.


thanks in advance & happy new year,
George
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 6,143
Reputation: jwenting is just really nice jwenting is just really nice jwenting is just really nice jwenting is just really nice 
Solved Threads: 213
Team Colleague
jwenting's Avatar
jwenting jwenting is offline Offline
duckman

Re: Question about string pattern matching

 
0
  #2
Jan 2nd, 2006
Write a parser to turn the input into a regular expression and run that on the dataset.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 189
Reputation: George2 is an unknown quantity at this point 
Solved Threads: 0
George2 George2 is offline Offline
Junior Poster

Re: Question about string pattern matching

 
0
  #3
Jan 2nd, 2006
jwenting,


Originally Posted by jwenting
Write a parser to turn the input into a regular expression and run that on the dataset.
Your idea is very helpful! In my application, I can not use regular expression library because I want to make it a small foot-print application. So, I want to build the algorithm from scratch and want to find out an efficient implementation.

Could you help please? Thanks.


regards,
George
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 6,143
Reputation: jwenting is just really nice jwenting is just really nice jwenting is just really nice jwenting is just really nice 
Solved Threads: 213
Team Colleague
jwenting's Avatar
jwenting jwenting is offline Offline
duckman

Re: Question about string pattern matching

 
0
  #4
Jan 2nd, 2006
The regexp package is part of the core JRE distribution since 1.4.0 so there is no additional installation or download required.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 189
Reputation: George2 is an unknown quantity at this point 
Solved Threads: 0
George2 George2 is offline Offline
Junior Poster

Re: Question about string pattern matching

 
0
  #5
Jan 3rd, 2006
jwenting,


Originally Posted by jwenting
The regexp package is part of the core JRE distribution since 1.4.0 so there is no additional installation or download required.
I fully agree with you. In my specific application, I can not use that regular expression library and I have to implement this string matching algorithm from scratch. Do you have any ideas of an efficient implementation?


regards,
George
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 6,143
Reputation: jwenting is just really nice jwenting is just really nice jwenting is just really nice jwenting is just really nice 
Solved Threads: 213
Team Colleague
jwenting's Avatar
jwenting jwenting is offline Offline
duckman

Re: Question about string pattern matching

 
0
  #6
Jan 3rd, 2006
why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 189
Reputation: George2 is an unknown quantity at this point 
Solved Threads: 0
George2 George2 is offline Offline
Junior Poster

Re: Question about string pattern matching

 
0
  #7
Jan 3rd, 2006
jwenting,


Originally Posted by jwenting
why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
For a number of reasons, I can not use regexp -- it is up to project lead. And up to me, I am asked to implement from scratch. :-)

I think using indexOf() is a very good idea! Anyway, do you have any idea of an efficient implementation using indexOf() -- not using regexp?


regards,
George
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 6,143
Reputation: jwenting is just really nice jwenting is just really nice jwenting is just really nice jwenting is just really nice 
Solved Threads: 213
Team Colleague
jwenting's Avatar
jwenting jwenting is offline Offline
duckman

Re: Question about string pattern matching

 
0
  #8
Jan 4th, 2006
shoot the project lead for his decision making

indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Should be quite efficient, leaving the biggest trick the splitting of the pattern into chunks to feed into your search function.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 189
Reputation: George2 is an unknown quantity at this point 
Solved Threads: 0
George2 George2 is offline Offline
Junior Poster

Re: Question about string pattern matching

 
0
  #9
Jan 5th, 2006
jwenting,


Originally Posted by jwenting
indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Good idea! I still have some detailed questions about your method, especially "Call repeatedly on the parts of the string that aren't wildcards". Could you provide some sample source codes please?


regards,
George
Reply With Quote Quick reply to this message  
Join Date: Nov 2004
Posts: 6,143
Reputation: jwenting is just really nice jwenting is just really nice jwenting is just really nice jwenting is just really nice 
Solved Threads: 213
Team Colleague
jwenting's Avatar
jwenting jwenting is offline Offline
duckman

Re: Question about string pattern matching

 
0
  #10
Jan 5th, 2006
say your pattern is something like "AB*cd*F*", you'd start by checking whether the string begins with "AB".
If it does, check whether in the substring you get when removing AB from the front of the data you have "cd" at some point.
If it does, remove everything up to and including that "cd" and check for an "F" in the rest.
If that's found, return true. If at any point indexOf() returns -1 (not found) return false.

Something like
  1. public boolean matcher(String data, String[] pattern, boolean startsWith) {
  2. int idx = data.indexOf(pattern[0]);
  3. if (idx == -1 || (idx > 0 && startsWith)) {
  4. return false;
  5. }
  6. String newdata = data.substring(idx);
  7. String[] newPat = new String[pattern.length()-1];
  8. for (int i=1; i<pattern.length; i++) {
  9. newPat[i-1] = pattern[i];
  10. }
  11. return matcher(newdata, newPat, false);
  12. }
should do the trick.
Mind I've not tested this, and I've excluded the needed bounds checking for brevity.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
Reply With Quote Quick reply to this message  
Reply

This thread is more than three months old.
Perhaps start a new thread instead?
Message:


Thread Tools Search this Thread



About Us | Contact Us | Advertise | DaniWeb | Acceptable Use Policy | RSS Feed

©2003 - 2009 DaniWeb® LLC