| Lexer- Tokenizer problem Hi, I use the flex tool (http://www.gnu.org/software/flex/manual/) to generate a tokenizer, but I have a problem with the way flex tokenizes the input. FILE: flex.l %{ Example file: test_string daniweb What I want is to have the above string tokenized as STRING SPACE WEB. Instead, flex recognizes the whole line as a single STRING, because it tries to match the longest input. How can I fix this problem? All ideas are welcome. PS: to compile: flex flex.l |
| ||
| Re: Lexer- Tokenizer problem Your string component matches spaces, and now you're complaining that you don't want to match spaces. You can't have it both ways. |
| ||
| Re: Lexer- Tokenizer problem Thank you for answering (apparently, few people have read the post). Yes, you are right, it seems that I can't have it both ways. But from where I stand, I want to use flex to do the following: recognize some specific keywords (in the simplified example I provided, the keyword was "daniweb") and recognize everything else as a string. Any ideas on how I can do that? PS: maybe start conditions could help me solve the problem? (I haven't understood them very well.) PS2: in the beginning I thought it wouldn't be that difficult, but I was wrong. |
| ||
| Re: Lexer- Tokenizer problem What is this Flex? Some kind of regular expression library or something? Do you even need it, or can your problem be simplified? |
| ||
| Re: Lexer- Tokenizer problem Flex (The Fast Lexical Analyzer): Flex is a fast lexical analyser generator. It is a tool for generating programs that perform pattern-matching on text. Flex is a non-GNU free implementation of the well known Lex program. http://www.gnu.org/software/flex/ http://flex.sourceforge.net/ |
| ||
| Re: Lexer- Tokenizer problem Um ok, please explain this: string_component [0-9a-zA-Z \t\.!#$%^&()*@_] and what you think it does? |
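To make the question concrete: the posted class contains a space and a tab, so a repeated match over it runs straight through the blank in `test_string daniweb`. The C sketch below only imitates that behaviour with `strspn`; it is not flex itself. It assumes the rule in the elided flex.l repeats `string_component` one or more times, and the names `longest_run`, `CLASS_WITH_BLANKS` and `CLASS_NO_BLANKS` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters accepted by the posted class [0-9a-zA-Z \t\.!#$%^&()*@_].
   Note the space and the tab at the start of the last line. */
static const char CLASS_WITH_BLANKS[] =
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    " \t.!#$%^&()*@_";

/* Hypothetical variant: the same class with space and tab removed. */
static const char CLASS_NO_BLANKS[] =
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".!#$%^&()*@_";

/* Length of the longest prefix of input made only of accepted characters.
   This stands in for what an assumed {string_component}+ rule would consume. */
size_t longest_run(const char *input, const char *accepted)
{
    assert(input != NULL && accepted != NULL);
    return strspn(input, accepted);
}
```

With the class as posted, `longest_run("test_string daniweb", CLASS_WITH_BLANKS)` covers the whole input, so nothing is left over for a SPACE or WEB token. With the blanks removed, the run stops right after `test_string`.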
| ||
| Re: Lexer- Tokenizer problem There's a way to set the precedence of regexes in flex. I don't remember the exact syntax, but you should put the keyword rule before the catch-all regex that you have defined there. |
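For reference, flex has no separate precedence syntax: it picks the rule that matches the most text, and when two rules match the same amount, the one listed first in the file wins. The C sketch below hand-rolls that selection for three assumed rules in this order: the keyword `daniweb`, a run of blanks, and a run of the posted class with space and tab removed. The token names follow the STRING SPACE WEB wording of the first post; `next_token`, `tokenize` and the other identifiers are hypothetical, and this is not how flex is implemented internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Token kinds, numbered in the same order as the rules below. */
enum token { TOK_WEB, TOK_SPACE, TOK_STRING, TOK_NONE };

/* Assumption: the posted class with space and tab taken out. */
static const char STRING_CHARS[] =
    "0123456789"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    ".!#$%^&()*@_";

/* Each matcher returns how many characters its rule matches at s (0 = none). */
static size_t match_web(const char *s)
{
    const char *kw = "daniweb";
    size_t n = strlen(kw);
    return strncmp(s, kw, n) == 0 ? n : 0;
}

static size_t match_space(const char *s)  { return strspn(s, " \t"); }
static size_t match_string(const char *s) { return strspn(s, STRING_CHARS); }

/* Rules in "file order": keyword first, catch-all string last. */
typedef size_t (*matcher)(const char *);
static const matcher RULES[] = { match_web, match_space, match_string };

/* Longest match wins; the strict '>' keeps the earlier rule on a tie. */
enum token next_token(const char *s, size_t *len)
{
    enum token best = TOK_NONE;
    assert(s != NULL && len != NULL);
    *len = 0;
    for (size_t i = 0; i < sizeof RULES / sizeof RULES[0]; i++) {
        size_t n = RULES[i](s);
        if (n > *len) {
            *len = n;
            best = (enum token)i;
        }
    }
    return best;
}

/* Fill out[] with up to cap tokens; stop at the first unmatched character. */
size_t tokenize(const char *input, enum token *out, size_t cap)
{
    size_t count = 0;
    while (*input != '\0' && count < cap) {
        size_t len;
        enum token t = next_token(input, &len);
        if (t == TOK_NONE)
            break;
        out[count++] = t;
        input += len;
    }
    return count;
}
```

Rule order only breaks ties. In this sketch `daniweb` on its own comes out as WEB because both rules match seven characters and the keyword rule is first, while `daniwebsite` comes out as one STRING because the catch-all matches more text.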
| ||
| Re: Lexer- Tokenizer problem Unfortunately, I haven't found the solution. I worked around my problem by changing the grammar (i.e. the bison file), and finally I handed in the project. When I find the time, I will try to find a solution using start conditions. |
| ||
| Re: Lexer- Tokenizer problem First you gotta know what your regular expressions are doing. To me, string_component [0-9a-zA-Z \t\.!#$%^&()*@_] and the example you have given are contradictory, like Salem mentioned. |
| ||
| Re: Lexer- Tokenizer problem Using Boost.Spirit may be much easier: http://www.boost.org/libs/spirit/doc/quick_start.html #include <boost/spirit/core.hpp> |
©2003 - 2009 DaniWeb® LLC