Guys, can you help me make a lexical analyzer that outputs only valid/invalid?
It would really help if you could help me with this. This is my midterm project. Thanks!!


Probably. Read the sticky posts...

That would depend on the complexity of the "language" you are attempting to parse.

A C expression would probably be do-able.
A C program would not.

yes! it could be a C expression.. thanks in advance..

Well, let's recall the lexical grammar of the C language (it's almost a so-called expression language)...
Here is the expression part:

C language expression tokens::

Identifiers:: any word of the form below, except sizeof

(<letter> | _) [<letter> | _ | <digit>]...
----------------------------------

Operators and punctuators:: one of

[ ] ( )
:
? .
+ - * / % ^ & | ~
! = < > += -= *= /= %=
^= &= |= << >> >>= <<= == !=
<= >= && || ++ -- , ->
sizeof
-----------------

Literals:: one of

integer-literal
character-literal
floating-literal
string-literal
-----------------

Whitespaces:: one of

blanks
horizontal tab
vertical tab
newlines
formfeed
comment
-------

Add a few more punctuators (about 5 or 6) and recognize some identifiers as keywords, and you have a full C language scanner...
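To make the operator list above a bit more concrete, here is a minimal sketch of how a scanner might pick the longest operator at the current position (so that ">>=" is one token rather than ">", ">" and "="). The names operators, match_operator and is_sizeof_keyword are made up for this illustration, and the table holds only the operators listed in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operators from the list above, longest first, so that the first
 * table entry that matches is also the longest possible match. */
static const char *const operators[] = {
    ">>=", "<<=",
    "+=", "-=", "*=", "/=", "%=", "^=", "&=", "|=",
    "<<", ">>", "==", "!=", "<=", ">=", "&&", "||", "++", "--", "->",
    "[", "]", "(", ")", ":", "?", ".", "+", "-", "*", "/", "%", "^",
    "&", "|", "~", "!", "=", "<", ">", ","
};

/* Return the length of the operator that starts at s, or 0 if none does. */
size_t match_operator(const char *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof operators / sizeof operators[0]; i++) {
        size_t len = strlen(operators[i]);
        if (strncmp(s, operators[i], len) == 0)
            return len;
    }
    return 0;
}

/* sizeof has the shape of an identifier but is an operator, so a
 * scanner checks each scanned word (word, len) against it. */
int is_sizeof_keyword(const char *word, size_t len)
{
    return len == 6 && strncmp(word, "sizeof", 6) == 0;
}
```

A linear search is the simplest thing that works here; a real scanner would more likely switch on the first character.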

ahhhmmm..... would it be possible for you to give me the source code?? I really can't get it.. I'm really stupid when it comes to this subject

Right this minute? Impossible. A scanner for C expressions is a program of roughly 1000 lines of source code (a rough estimate, of course). It is probably possible to write a more compact (and faster) scanner with special methods (based on automata theory, for example).

Better to start with a simple grammar. For example, consider the simplest arithmetic expressions:

Tokens:: one of

identifier
decimal-integer-literal
Operators:: one of
+ - / *
Punctuators:: one of
( )

Try to implement a scanner for this lexical grammar.

Input: source text string (or char array)
Output: next token code
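As a rough illustration of the automata-based idea mentioned above, here is a minimal table-driven sketch that classifies a single word as an identifier or a decimal integer literal. The state names, the character classes and run_dfa are all assumptions made for this example; operators and parentheses are left out to keep the table small.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* States of a tiny automaton (hypothetical names). */
enum state { S_START, S_IDENT, S_INT, S_ERROR, NUM_STATES };

/* Character classes: letter or underscore, digit, anything else. */
enum char_class { C_LETTER, C_DIGIT, C_OTHER, NUM_CLASSES };

/* transition[current state][class of next character] = next state */
static const enum state transition[NUM_STATES][NUM_CLASSES] = {
    /* S_START */ { S_IDENT, S_INT,   S_ERROR },
    /* S_IDENT */ { S_IDENT, S_IDENT, S_ERROR },
    /* S_INT   */ { S_ERROR, S_INT,   S_ERROR },
    /* S_ERROR */ { S_ERROR, S_ERROR, S_ERROR }
};

static enum char_class classify(char c)
{
    unsigned char u = (unsigned char)c;

    if (isalpha(u) || c == '_')
        return C_LETTER;
    if (isdigit(u))
        return C_DIGIT;
    return C_OTHER;
}

/* Feed the whole word through the table. The result is S_IDENT for an
 * identifier, S_INT for a decimal integer, S_START for an empty string
 * and S_ERROR for anything else. */
enum state run_dfa(const char *word)
{
    enum state s = S_START;

    assert(word != NULL);
    for (; *word != '\0'; word++)
        s = transition[s][classify(*word)];
    return s;
}
```

The point of the table is that the scanning loop stays one line long however many token kinds are added; only the table grows.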

Given an input of, say, 10*(x + 2), something which prints
Found "10"
Found "*"
Found "("
Found "x"
Found "+"
Found "2"
Found ")"
would be a useful thing to accomplish.
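For what it's worth, a hand-written scanner for the small grammar above could look roughly like the sketch below. The token codes and the function names next_token and scan_and_report are invented for this example, and rejecting "10x" as a single invalid token (instead of splitting it into "10" and "x") is a design assumption, not something the grammar above dictates.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Token codes; the names are hypothetical, chosen for this sketch. */
enum token {
    TOK_END, TOK_IDENT, TOK_INT, TOK_OP, TOK_LPAREN, TOK_RPAREN, TOK_INVALID
};

/* Read the next token from *cursor, copy its text into lexeme
 * (truncated to cap - 1 characters) and advance *cursor past it. */
enum token next_token(const char **cursor, char *lexeme, size_t cap)
{
    const char *p, *start;
    enum token kind;
    size_t len;

    assert(cursor != NULL && *cursor != NULL && lexeme != NULL && cap > 0);
    p = *cursor;
    while (isspace((unsigned char)*p))      /* skip whitespace */
        p++;
    start = p;

    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        kind = TOK_INT;
        /* Consume the whole run; "10x" is rejected rather than split. */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (!isdigit((unsigned char)*p))
                kind = TOK_INVALID;
            p++;
        }
    } else if (strchr("+-/*", *p) != NULL) {
        kind = TOK_OP;
        p++;
    } else if (*p == '(') {
        kind = TOK_LPAREN;
        p++;
    } else if (*p == ')') {
        kind = TOK_RPAREN;
        p++;
    } else {
        kind = TOK_INVALID;                 /* unknown character */
        p++;
    }

    len = (size_t)(p - start);
    if (len >= cap)
        len = cap - 1;
    memcpy(lexeme, start, len);
    lexeme[len] = '\0';
    *cursor = p;
    return kind;
}

/* Print every token; return 1 if all were valid, 0 otherwise. */
int scan_and_report(const char *src)
{
    char lexeme[64];
    enum token kind;
    int valid = 1;

    while ((kind = next_token(&src, lexeme, sizeof lexeme)) != TOK_END) {
        if (kind == TOK_INVALID) {
            printf("Invalid \"%s\"\n", lexeme);
            valid = 0;
        } else {
            printf("Found \"%s\"\n", lexeme);
        }
    }
    return valid;
}
```

Calling scan_and_report("10*(x + 2)") prints the seven Found lines listed above and returns 1, so a valid/invalid-only program just needs to print a verdict based on the return value.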

When that works reliably, then think about adding handling for parentheses and operator precedence.

We're here to "help", not "give".

Follow Salem's advice!

I have written a scanner for simple C expressions (~300 LOC), but I will post the source (maybe ;)) only after you have tried to do it yourself.

What a bliss it is to write parsers for automaton grammars!..

One remark: fortunately, a scanner does not need to know operator precedence (leave that to the syntax parser).

thanks for your help guys.. i have gotten through the basics, a friend helped me with it...
