944,198 Members | Top Members by Rank

Ad:
  • Java Discussion Thread
  • Unsolved
  • Views: 16162
  • Java RSS
You are currently viewing page 1 of this multi-page discussion thread
Jan 2nd, 2006
0

Question about string pattern matching

Expand Post »
Hello everyone,


I am working on a dictionary application implementation. I have question about a string pattern matching algorithm implementation. In this algorithm, * can be used to match any sequence of characters. The input is string s1 and string s2, the algorithm using s2 to match s1, and the output is the number of possibility matching.

For example, s1 = ABC, s2 = AC*, output is 0;
For example, s1 = ABC, s2 = A*C, output is 1;
For example, s1 = AB, s2 = ***, output is 6 (AB,NULL,NULL
NULL,AB.NULL
NULL,NULL,AB
NULL,A,B
A,NULL,B
A, B, NULL);

Does anyone know what is the most efficient implementation? Currently, I only have a brute-force implementation.


thanks in advance & happy new year,
George
Reputation Points: 11
Solved Threads: 0
Junior Poster
George2 is offline Offline
189 posts
since Nov 2004
Jan 2nd, 2006
0

Re: Question about string pattern matching

Write a parser to turn the input into a regular expression and run that on the dataset.
Team Colleague
Reputation Points: 1658
Solved Threads: 331
duckman
jwenting is offline Offline
7,719 posts
since Nov 2004
Jan 2nd, 2006
0

Re: Question about string pattern matching

jwenting,


Quote originally posted by jwenting ...
Write a parser to turn the input into a regular expression and run that on the dataset.
Your idea is very helpful! In my application, I can not use regular expression library because I want to make it a small foot-print application. So, I want to build the algorithm from scratch and want to find out an efficient implementation.

Could you help please? Thanks.


regards,
George
Reputation Points: 11
Solved Threads: 0
Junior Poster
George2 is offline Offline
189 posts
since Nov 2004
Jan 2nd, 2006
0

Re: Question about string pattern matching

The regexp package is part of the core JRE distribution since 1.4.0 so there is no additional installation or download required.
Team Colleague
Reputation Points: 1658
Solved Threads: 331
duckman
jwenting is offline Offline
7,719 posts
since Nov 2004
Jan 3rd, 2006
0

Re: Question about string pattern matching

jwenting,


Quote originally posted by jwenting ...
The regexp package is part of the core JRE distribution since 1.4.0 so there is no additional installation or download required.
I fully agree with you. In my specific application, I can not use that regular expression library and I have to implement this string matching algorithm from scratch. Do you have any ideas of an efficient implementation?


regards,
George
Reputation Points: 11
Solved Threads: 0
Junior Poster
George2 is offline Offline
189 posts
since Nov 2004
Jan 3rd, 2006
0

Re: Question about string pattern matching

why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
Team Colleague
Reputation Points: 1658
Solved Threads: 331
duckman
jwenting is offline Offline
7,719 posts
since Nov 2004
Jan 3rd, 2006
0

Re: Question about string pattern matching

jwenting,


Quote originally posted by jwenting ...
why can't you use regexp? It's the natural choice.
If not that you're going to have to use some form of system using indexOf() and keeping track of all the numbers that's returning from repeated calls.
Could probably be written quite efficiently using recursion.
For a number of reasons, I can not use regexp -- it is up to project lead. And up to me, I am asked to implement from scratch. :-)

I think using indexOf() is a very good idea! Anyway, do you have any idea of an efficient implementation using indexOf() -- not using regexp?


regards,
George
Reputation Points: 11
Solved Threads: 0
Junior Poster
George2 is offline Offline
189 posts
since Nov 2004
Jan 4th, 2006
0

Re: Question about string pattern matching

shoot the project lead for his decision making

indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Should be quite efficient, leaving the biggest trick the splitting of the pattern into chunks to feed into your search function.
Team Colleague
Reputation Points: 1658
Solved Threads: 331
duckman
jwenting is offline Offline
7,719 posts
since Nov 2004
Jan 5th, 2006
0

Re: Question about string pattern matching

jwenting,


Quote originally posted by jwenting ...
indexOf will tell you where a substring resides (if anywhere). Call repeatedly on the parts of the string that aren't wildcards and check whether the returned values are in order.
Good idea! I still have some detailed questions about your method, especially "Call repeatedly on the parts of the string that aren't wildcards". Could you provide some sample source codes please?


regards,
George
Reputation Points: 11
Solved Threads: 0
Junior Poster
George2 is offline Offline
189 posts
since Nov 2004
Jan 5th, 2006
0

Re: Question about string pattern matching

say your pattern is something like "AB*cd*F*", you'd start by checking whether the string begins with "AB".
If it does, check whether in the substring you get when removing AB from the front of the data you have "cd" at some point.
If it does, remove everything up to and including that "cd" and check for an "F" in the rest.
If that's found, return true. If at any point indexOf() returns -1 (not found) return false.

Something like
Java Syntax (Toggle Plain Text)
  1. public boolean matcher(String data, String[] pattern, boolean startsWith) {
  2. int idx = data.indexOf(pattern[0]);
  3. if (idx == -1 || (idx > 0 && startsWith)) {
  4. return false;
  5. }
  6. String newdata = data.substring(idx);
  7. String[] newPat = new String[pattern.length()-1];
  8. for (int i=1; i<pattern.length; i++) {
  9. newPat[i-1] = pattern[i];
  10. }
  11. return matcher(newdata, newPat, false);
  12. }
should do the trick.
Mind I've not tested this, and I've excluded the needed bounds checking for brevity.
Team Colleague
Reputation Points: 1658
Solved Threads: 331
duckman
jwenting is offline Offline
7,719 posts
since Nov 2004

This thread is more than three months old

No one has posted to this discussion for at least three months. Please let old threads die and do not reply to them unless you feel you have something new and valuable to contribute that absolutely must be added to make the discussion complete. Otherwise, please start a new thread in this forum instead.
Message:
Previous Thread in Java Forum Timeline: How do I create a Java code for this number sequence?
Next Thread in Java Forum Timeline: Need helpp!!! How to compare two images in java...





About Us | Contact Us | Advertise | Acceptable Use Policy
Forum Index | Build Custom RSS Feed


Follow us on Twitter


© 2011 DaniWeb® LLC