where the green marked substrings seems to be such LCS. There is also a longest common subsequence problem where the sequence of chars need not to be contiguously stored.
LCS is a famous problem usually solved by dynamic programming, its time complexity is O(mn). Some theory can be found here. There are also implementations on wikibooks in various languages. Unfortunately, I haven't seen any implementation in PSM (pl/sql) for SQL user defined function (UDF). I am sure one cannot do that in SQL queries directly. So an UDF written in PSM or even a C/C++ implementation what can be called from SQL, what is possible as for example in postgresql, seems to be a solution. Yet what I have seen are some PHP implementations having poor O(n^4).
I couldn´t stop thinking about the longest-common-substring problem for SQL language (user defined function). So I took and old version of LCS which I once implemented in plain C. Its time complexity is O(m*n), storage O(m+n). I translated this C version line by line into a user defined function (PSM of SQL 1999) for Microsoft SQL Server and Sybase database. Finally I also compiled a Java version from my old C LCS function.
All three functions implement the very same LCS algorithm and I myself can also affirm that every function is based on the same programming methods and approaches what guaranties that they have same O(m*n).
Then I did some benchmarks using your sample strings containing Hungarian Debrecen. I executed all three functions/ methods 100000 times on these strings on an "medium upper class" win XP computer.
[I]Here are the results for 100000 function calls:
Programming Language Execution time
plain C (no classes) 2.2 seconds
Java, static member 2.8 seconds
SQL user defined function 2658.0 seconds [/I]
I am very surprised at the poor SQL-UDF result. I had never believed that SQL-UDF would consume a thousand times execution time of C/Java function. So it is rather out of the question to implement LCS algorithm in SQL user defined function for practical usage.
This benchmark pleasantly surprised one by the computing power of Java. It put up a great fight by contrast with fast C for this non-numerical algorithm.