954,498 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

Converting special regular expression quantifiers to NFAs efficiently

I developed a custom pattern language that is used to validate output from tests. I convert patterns to NFAs and then to DFAs, which are used by my test analyzer to validate test output in real time. At first, I only supported operators that are defined as part of Thompson's algorithm: alternation, concatenation, and kleene star. I then decided to support the one or more operator, +, by simply converting the pattern from X+ to XX* before passing on my my NFA generator. Next, I added count by converting X{3} to XXX before passing on to my generator. The quantifier I am having issues with is: {x,y} or the range quantifer. The way I implemented it was to turn something like: X{2,4} to XX|XXX|XXXX. As you can imagine, this quickly becomes very inefficient if the min and max values are large and spaced out (i.e. X{1,1024}). Is there any trick for turning this sort of quantifer into an efficent NFA? My pattern->NFA converter only understands concatenation, |, *, and a special character used to refer to an epsilon transition.

Thanks,
Mears

mears
Newbie Poster
1 post since Dec 2006
Reputation Points: 10
Solved Threads: 0
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You