0

Guys, can you help me make a lexical analyzer that outputs only valid/invalid?
It would really help if you guys could help me with this. This is my midterm project. Thanks!!

Last Post by techie2008
0

That would depend on the complexity of the "language" you are attempting to parse.

A C expression would probably be do-able.
A C program would not.

0

Well, let's recall the lexical grammar of the C language (it's almost a so-called expression language)...
Here is the expression part:

C language expression tokens::

Identifiers:: all but sizeof

<letter|_>[<letter>|_|<digit>]...
----------------------------------

Operators and punctuators:: one of

[ ] ( )
:
? .
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->
sizeof
-----------------

Literals:: one of

integer-literal
character-literal
floating-literal
string-literal
-----------------

Whitespaces:: one of

blanks
horizontal tab
vertical tab
newlines
formfeed
comment
-------

Add some punctuators (~5-6) and recognize some identifiers as keywords, and you have a full C language scanner...
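
One detail the operator list above implies: several operators are prefixes of longer ones (`>`, `>>`, `>>=`), so a scanner has to take the longest match. Here is a minimal sketch of that idea in C, assuming a simple table ordered longest first. The name `match_operator` is made up for this example, and `sizeof` is left out because it would be recognized on the identifier path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operators and punctuators from the list above, longest first, so that
   the first prefix match found is also the longest one. */
static const char *const OPERATORS[] = {
    ">>=", "<<=",
    "+=", "-=", "*=", "/=", "%=", "^=", "&=", "|=",
    "<<", ">>", "==", "!=", "<=", ">=", "&&", "||", "++", "--", "->",
    "[", "]", "(", ")", ":", "?", ".", "+", "-", "*", "/", "%",
    "^", "&", "|", "~", "!", "=", "<", ">", ","
};

/* Returns the length of the longest operator starting at s,
   or 0 if no operator starts there.
   (match_operator is a hypothetical helper, not a library function.) */
size_t match_operator(const char *s)
{
    assert(s != NULL);
    size_t count = sizeof OPERATORS / sizeof OPERATORS[0];
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(OPERATORS[i]);
        if (strncmp(s, OPERATORS[i], len) == 0)
            return len;
    }
    return 0;
}
```

A linear table scan is slow but easy to check by eye; a hand-written switch or a generated automaton would do the same job faster.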

0

Ahhhmmm... would it be possible for you to give me the source code? I really can't get it. I'm really bad when it comes to this subject.

0

This very minute? It's impossible. A scanner for C expressions is a program of ~1000 lines of source code (a rough estimate, of course). It's probably possible to write a more compact (and faster) scanner with special methods (based on automata theory, for example).

Better to start from a simple grammar. For example, consider the simplest arithmetic expressions:

Tokens:: one of

identifier
decimal-integer-literal
Operators:: one of
+ - / *
Punctuators:: one of
( )

Try to implement a scanner for this lexical grammar.

Input: source text string (or char array)
Output: next token code
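
To make the input/output contract above concrete, here is a rough sketch of such a scanner in C. It is only one possible shape, not a reference solution: the token code names, `next_token`, and `is_lexically_valid` are all invented for this example. The last function gives the valid/invalid answer the original question asked for, at the lexical level only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token codes; the names are made up for this sketch. */
enum token {
    TOK_END, TOK_IDENT, TOK_INT,
    TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH,
    TOK_LPAREN, TOK_RPAREN, TOK_INVALID
};

/* Reads one token starting at *src and advances *src past it. */
enum token next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;
    enum token tok;

    while (isspace((unsigned char)*p))          /* skip whitespace */
        p++;

    if (*p == '\0') {
        tok = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;                                /* identifier */
        tok = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;                                /* decimal integer literal */
        tok = TOK_INT;
    } else {
        switch (*p) {
        case '+': tok = TOK_PLUS;    break;
        case '-': tok = TOK_MINUS;   break;
        case '*': tok = TOK_STAR;    break;
        case '/': tok = TOK_SLASH;   break;
        case '(': tok = TOK_LPAREN;  break;
        case ')': tok = TOK_RPAREN;  break;
        default:  tok = TOK_INVALID; break;     /* starts no token */
        }
        p++;
    }
    *src = p;
    return tok;
}

/* Returns 1 if every character of src belongs to some token, else 0. */
int is_lexically_valid(const char *src)
{
    enum token tok;
    while ((tok = next_token(&src)) != TOK_END)
        if (tok == TOK_INVALID)
            return 0;
    return 1;
}
```

Note that this only checks tokens: something like `) + (` is lexically valid here, because rejecting it is the parser's job.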

0

Given an input of, say, 10*(x + 2), something which prints
Found "10"
Found "*"
Found "("
Found "x"
Found "+"
Found "2"
Found ")"
would be a useful thing to accomplish.
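
For anyone reading along, a minimal sketch of a program piece that produces that kind of output might look like the following. It assumes the small grammar from the earlier post (identifiers, decimal integers, `+ - * /`, and parentheses); `print_lexemes` is a name invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Prints each lexeme of src on its own line as: Found "lexeme".
   Returns the number of lexemes printed, or -1 at the first character
   that starts no token. (print_lexemes is a hypothetical name.) */
int print_lexemes(const char *src, FILE *out)
{
    assert(src != NULL && out != NULL);
    int count = 0;
    const char *p = src;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {       /* whitespace separates tokens */
            p++;
            continue;
        }
        const char *start = p;
        if (isalpha((unsigned char)*p) || *p == '_') {
            while (isalnum((unsigned char)*p) || *p == '_')
                p++;                            /* identifier */
        } else if (isdigit((unsigned char)*p)) {
            while (isdigit((unsigned char)*p))
                p++;                            /* decimal integer */
        } else if (strchr("+-*/()", *p) != NULL) {
            p++;                                /* one-character operator */
        } else {
            return -1;                          /* unknown character */
        }
        fprintf(out, "Found \"%.*s\"\n", (int)(p - start), start);
        count++;
    }
    return count;
}
```

Calling `print_lexemes("10*(x + 2)", stdout)` should print the seven lines listed above and return 7.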

When that works reliably, then think about adding handling for parentheses and operator precedence.

We're here to "help", not "give".

0

Follow Salem's advice!

I have written a scanner for simple C expressions (~300 LOC), but I will post the source (maybe ;)) only after you try to do it yourself.

What bliss it is to write parsers for automaton grammars!..

One remark: fortunately, a scanner does not need to know operator precedence (leave that to the syntax parser).

0

Thanks for your help, guys. I have gotten through the basics; a friend helped me with it...
