| | |
Question about string pattern matching
![]() |
•
•
Join Date: Nov 2004
Posts: 189
Reputation:
Solved Threads: 0
Hello everyone,
I am working on a dictionary application implementation. I have question about a string pattern matching algorithm implementation. In this algorithm, * can be used to match any sequence of characters. The input is string s1 and string s2, the algorithm using s2 to match s1, and the output is the number of possibility matching.
For example, s1 = ABC, s2 = AC*, output is 0;
For example, s1 = ABC, s2 = A*C, output is 1;
For example, s1 = AB, s2 = ***, output is 6 (AB,NULL,NULL
NULL,AB.NULL
NULL,NULL,AB
NULL,A,B
A,NULL,B
A, B, NULL);
Does anyone know what is the most efficient implementation? Currently, I only have a brute-force implementation.
thanks in advance & happy new year,
George
I am working on a dictionary application implementation. I have question about a string pattern matching algorithm implementation. In this algorithm, * can be used to match any sequence of characters. The input is string s1 and string s2, the algorithm using s2 to match s1, and the output is the number of possibility matching.
For example, s1 = ABC, s2 = AC*, output is 0;
For example, s1 = ABC, s2 = A*C, output is 1;
For example, s1 = AB, s2 = ***, output is 6 (AB,NULL,NULL
NULL,AB.NULL
NULL,NULL,AB
NULL,A,B
A,NULL,B
A, B, NULL);
Does anyone know what is the most efficient implementation? Currently, I only have a brute-force implementation.
thanks in advance & happy new year,
George
•
•
Join Date: Nov 2004
Posts: 189
Reputation:
Solved Threads: 0
jwenting,
Your idea is very helpful! In my application, I can not use regular expression library because I want to make it a small foot-print application. So, I want to build the algorithm from scratch and want to find out an efficient implementation.
Could you help please? Thanks.
regards,
George
•
•
•
•
Originally Posted by jwenting
Write a parser to turn the input into a regular expression and run that on the dataset.
Could you help please? Thanks.
regards,
George
•
•
Join Date: Nov 2004
Posts: 189
Reputation:
Solved Threads: 0
jwenting,
I fully agree with you. In my specific application, I can not use that regular expression library and I have to implement this string matching algorithm from scratch. Do you have any ideas of an efficient implementation?
regards,
George
•
•
•
•
Originally Posted by jwenting
The regexp package is part of the core JRE distribution since 1.4.0 so there is no additional installation or download required.
regards,
George
why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
•
•
Join Date: Nov 2004
Posts: 189
Reputation:
Solved Threads: 0
jwenting,
For a number of reasons, I can not use regexp -- it is up to project lead. And up to me, I am asked to implement from scratch. :-)
I think using indexOf() is a very good idea! Anyway, do you have any idea of an efficient implementation using indexOf() -- not using regexp?
regards,
George
•
•
•
•
Originally Posted by jwenting
why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
I think using indexOf() is a very good idea! Anyway, do you have any idea of an efficient implementation using indexOf() -- not using regexp?
regards,
George
shoot the project lead for his decision making 
indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Should be quite efficient, leaving the biggest trick the splitting of the pattern into chunks to feed into your search function.

indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Should be quite efficient, leaving the biggest trick the splitting of the pattern into chunks to feed into your search function.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
•
•
Join Date: Nov 2004
Posts: 189
Reputation:
Solved Threads: 0
jwenting,
Good idea! I still have some detailed questions about your method, especially "Call repeatedly on the parts of the string that aren't wildcards". Could you provide some sample source codes please?
regards,
George
•
•
•
•
Originally Posted by jwenting
indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
regards,
George
say your pattern is something like "AB*cd*F*", you'd start by checking whether the string begins with "AB".
If it does, check whether in the substring you get when removing AB from the front of the data you have "cd" at some point.
If it does, remove everything up to and including that "cd" and check for an "F" in the rest.
If that's found, return true. If at any point indexOf() returns -1 (not found) return false.
Something like should do the trick.
Mind I've not tested this, and I've excluded the needed bounds checking for brevity.
If it does, check whether in the substring you get when removing AB from the front of the data you have "cd" at some point.
If it does, remove everything up to and including that "cd" and check for an "F" in the rest.
If that's found, return true. If at any point indexOf() returns -1 (not found) return false.
Something like
Java Syntax (Toggle Plain Text)
public boolean matcher(String data, String[] pattern, boolean startsWith) { int idx = data.indexOf(pattern[0]); if (idx == -1 || (idx > 0 && startsWith)) { return false; } String newdata = data.substring(idx); String[] newPat = new String[pattern.length()-1]; for (int i=1; i<pattern.length; i++) { newPat[i-1] = pattern[i]; } return matcher(newdata, newPat, false); }
Mind I've not tested this, and I've excluded the needed bounds checking for brevity.
As people are clearly allowed to attack me but I'm not allowed to defend myself, I no longer post to this site.
![]() |
Other Threads in the Java Forum
- Previous Thread: How do I create a Java code for this number sequence?
- Next Thread: Need helpp!!! How to compare two images in java...
| Thread Tools | Search this Thread |
911 actionlistener addball addressbook android api append applet application array arrays automation binary bluetooth button character chat class client code component consumer css csv database desktop eclipse ee error fractal ftp game givemetehcodez graphics gui html ide image integer j2me japplet java javaarraylist javac javaee javaprojects jni jpanel julia jvm linked linux list loan mac map method methods mobile netbeans newbie objects online oriented output panel phone printf problem program programming project projects properties recursion replaydirector reporting researchinmotion rotatetext rsa scanner se server service set sms software sort sql string swing test threads time transfer tree ubuntu update windows working






