Program 1. Create a program that accepts a string and syntactically analyzes whether it belongs to the language defined by the BNF below:
BNF:
<VD>-><DT><VL>;
<DT>->char|float|int|double
<VL>-><V>|<V>,<VL>
<V>->'any valid variable name'
Program 2. Valid assignment statement
BNF:
<AS>-><ID>=<EXP>;
<ID>->a|b|c|d
<EXP>-><EXP>+<TERM>|<EXP>-<TERM>|<TERM>
<TERM>-><ID>|'('<EXP>')'

I NEED HELP. THANKS:)) I'M FINE WITH ENGLISH SPEAKING !

Hi. This looks like homework, and if that is the case, you should not search the internet for a solution. You cannot improve that way.
If you know regex (regular expressions), you can find this C# tutorial that is very easy to port to Boost::Regex. Anyway, for this specific program, I would use this pseudocode:

std::vector<Token*> _tokens;

int GetIdentifierLength(const std::string& input, int startPos)
{
//counts how many chars from startPos are alphanumeric, i.e. part of an identifier (a variable name or a reserved word)
//returns the length of that text
}
bool IdentifierIsReserved(const std::string& input, int startPos, int len)
{
 //checks if in your input text the identifier is a reserved word
}
bool Parse(std::string input,std::vector <Token*>& _tokens)
{
   int pos = 0;
   while(pos < input.size())
   {
       //record the space as a token, then move past it
       if(IsSpace(input, pos))
       {
           _tokens.push_back(new TokenSpace(pos));
           pos++;
           continue;
       }
      if(IsAlphaNumeric(input, pos))
      {
          int length = GetIdentifierLength(input, pos);
          if(IdentifierIsReserved(input, pos, length))
          {
           _tokens.push_back(new TokenReserved(pos, length));
          }
          else
          {
              _tokens.push_back(new TokenIdentifier(pos, length));          
          }
          pos+= length;
          continue;
      }
      if(IsEqual(input, pos, '-'))
      {
          if(IsEqual(input, pos+1, '>'))
              {
                  _tokens.push_back(new TokenArrow(pos));  
                  pos+=2;
                  continue;
              }
          else
              return false;
      }
      //the current char is invalid
      return false;
   }
   return true;
}

You should do something (very) similar for problem 2 if you don't want to use regex.
Sadly, I have to repeat: if it is homework, do it yourself and ask specific questions here about how to implement a technique, rather than asking someone to do your homework.
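
For comparison, the regex route mentioned above could look roughly like the sketch below. It uses std::regex from the standard library rather than Boost::Regex, and the function name isDeclaration is made up for illustration. It assumes that a "valid variable name" is a letter or underscore followed by letters, digits or underscores, which the BNF itself does not spell out.

```cpp
#include <regex>
#include <string>

// Returns true if `line` matches <VD> -> <DT><VL>; from program 1.
// Assumption: a variable name is [A-Za-z_] followed by letters, digits or
// underscores; reserved words are not filtered out of the name list here.
bool isDeclaration(const std::string& line)
{
    static const std::regex pattern(
        R"(\s*(char|float|int|double)\s+)"           // <DT>, then at least one space
        R"([A-Za-z_]\w*(\s*,\s*[A-Za-z_]\w*)*)"      // <VL>: names separated by commas
        R"(\s*;\s*)");                               // the terminating semicolon
    return std::regex_match(line, pattern);
}
```

A single pattern like this only answers yes or no. It follows the BNF literally, so a comma directly after the type name is rejected, and it cannot say whether the problem was an illegal name or a missing semicolon. Reporting that kind of detail is where the token-based approach pays off.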

    Hey, thanks. By the way, there are some sample runs that the programs should reproduce once they are written. Here they are:
    SAMPLE PROGRAM 1 Run. Valid C variable declaration statement.
    1.) Enter a declaration statement: int, abc, def,ghi;
        It is a valid C declaration statement!
    2.) Enter a declaration statement: int, abc, def,23jordan;
        It is NOT a Valid Declaration Statement!
        Illegal variable name!
    3.) Enter a declaration statement : float x
        It is NOT a valid C Declaration Statement!
        Semicolon missing!

    SAMPLE PROGRAM 2 Run. Valid assignment statement.
    1.) Enter an assignment statement: a=b + c - d * a / d;
        It is a valid assignment statement!
    2.) Enter an assignment statement: b=a +;
        It is NOT a valid assignment statement!
        Assignment missing
    3.) Enter an assignment statement: b = a /(c + d;
        It is NOT a valid assignment statement!
        Parenthesis missing

First of all, this looks like homework, and as I said, you should not try to find homework solutions on the internet.
Try to write the syntax program first (the one that splits the text into tokens). After you have done so, try to write a semantic analyzer. If you have issues with it, ask specific questions.

Here's my first program for the problem that I posted, but there are some errors in it that I can't debug. What's the problem?


                        #include <stdio.h>
                        #include <conio.h>
                        #include <ctype.h>
                        #include <string.h>

                        int main(void)
                        {
                            int strlen, a, b,c;
                            int data_type, var;
                            char dtypechecker;
                            int isalpha;
                            char *str;
                            clrscr();

                            data_type=0;
                            var=1;
                            b=0;
                            gets(str);
                            str+='\0';
                            for(a=0;a<strlen;a++)
                            {
                                if(data_type!=1)
                                    if(str[a]==isalpha)
                                    {
                                        dtypechecker[b]=str[a];
                                        b++;
                                    }
                                    if(str[a]==':')
                                        if(b==0)
                                        //do nothing
                                        else
                                        {
                                            dtypechecker==[10];
                                            switch(dtypechecker)
                                            printf("intt data ytpe case 'int':  var=0");
                                            printf("intt data ytpe case 'char': var=0");
                                            printf("intt data ytpe case 'float':    var=0");
                                            printf("intt data ytpe case 'double':   var=0");
                                            printf("intt data ytpe case default=data type error");
                                            error==1;               }
                                        }
                            }
                            if(error!=1)
                            else{

                                exit(1);
                            }
                            if(var!=1)
                            if(str[a]==isalpha){
                                varchecker[c]==str[a];
                                c++;
                            }
                            else if(str[a]==isdigit)
                                if(c==0)
                                    error=1;
                                else{
                                    varchecker[c]==str[a];
                                    c++;
                                }
                                else if(str[a]=='*')
                                    if(c==0)
                                        varchecker[c]=str[a];
                                        printf("\n\t*\t pointer");
                                        c++;
                                    else
                                        error=1;
                                    else if(str[a]==',')
                                        if(c==0||str[a+1]==';')
                                        error=1;
                                    else
                                        for(d=0;d<c;d++){
                                            printf("\t variable\n");
                                            printf("\t separator \n");
                                        }
                                        else if(str[a]==';')
                                            if(a+1!=strlen(str);c==0){
                                                //error replacing semi colon
                                                printf("\n\tlexeme\t\t token");
                                                printf("%s",dtypechecker);
                                                printf("\t data type \n");
                                            }
                                            else
                                            for(d=0;d<c;d++){
                                                printf("%c",varchecker);
                                                printf("\t variable");
                                                printf(";\t semi colon");
                                            }
                                            return 1;
                        }

Edited 2 Years Ago by jalferez1: added some text

is_alpha should be a function, not an int. Like:

bool is_alpha(char ch)
{
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

Also, I recommend that you split the program into two parts:
- syntactic analysis:
split every part of the text into "words" (these are called tokens).
A token typically has 3 or 4 fields, like this:

struct Token{
     int startPos;
     std::string text;
     int kind; 
     //0 = space
     //1 = reserved word
     //2 = identifier (like variable name)
     //3 = constant
     //4 = operator
     //5 = parenthesis (open or close)
     //6 = comment
 };

 //the program can have maximum 1024 words
 Token tokens[1024];
 int tokenCount = 0; 

In the first step you fill the tokens array with all the words, as far as you can identify them. If you find an invalid character, like #, you report that it is not a known character.

- semantic analysis:
knowing the words and their order, it is much easier to test variables and expressions.
For example: if the statement starts with a reserved word (like a type name), the next token has to be an identifier.
Pseudocode for it could be:
make a second array of tokens from which you remove the spaces, just to simplify the testing.
In this way, from:
    int a = 3;
which tokenizes as:
    reserved (int), space, identifier(a), space, operator (=), space, numeric (3), operator (;)
you will have the array:
    reserved (int), identifier(a), operator (=), numeric (3), operator (;)
As it starts with a reserved type, you can say that it is a declaration, and after that you can check that it is well formed.

But the most important part, in my view: make sure that the syntactic part is done first and done correctly. Try feeding it any text and see whether it splits into tokens. Without that, everything else is much harder.
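
To make the second step concrete, here is a minimal sketch of checking a space-free token list against the declaration grammar from program 1. The Token struct and the kind numbers are the ones from this post. The function name checkDeclaration, the named constants and the error messages are invented for illustration, and the sketch assumes that the only reserved words the tokenizer knows are the four type names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Token layout from the post; only the kinds used below are named.
struct Token {
    int startPos;
    std::string text;
    int kind;   // 1 = reserved word, 2 = identifier, 4 = operator
};

const int KIND_RESERVED = 1;
const int KIND_IDENTIFIER = 2;
const int KIND_OPERATOR = 4;

// Accepts: type name, identifier, zero or more ", identifier" pairs, then ";".
// On failure `error` receives a short message (the wording is illustrative).
bool checkDeclaration(const std::vector<Token>& tokens, std::string& error)
{
    std::size_t i = 0;
    if (i >= tokens.size() || tokens[i].kind != KIND_RESERVED) {
        error = "Data type expected";
        return false;
    }
    ++i;
    while (true) {
        if (i >= tokens.size() || tokens[i].kind != KIND_IDENTIFIER) {
            error = "Illegal variable name";
            return false;
        }
        ++i;
        if (i < tokens.size() && tokens[i].kind == KIND_OPERATOR
            && tokens[i].text == ",") {
            ++i;        // a comma means another variable must follow
            continue;
        }
        break;
    }
    if (i >= tokens.size() || tokens[i].text != ";") {
        error = "Semicolon missing";
        return false;
    }
    if (i + 1 != tokens.size()) {
        error = "Unexpected text after semicolon";
        return false;
    }
    return true;
}
```

Because the function walks the tokens in grammar order, the place where it stops tells you which rule was broken, which is how messages such as "Semicolon missing" can be produced.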

You seem to be coding in C here in the C++ coding forum.

This may be better handled in the C coding area?

Not good to use conio.h stuff

Shun the dangerous use of gets in C
(I use my own readLine to handle the dynamic allocation of C strings that are to be input from a user or file.)

OK, while what Ciprian2 says is broadly correct, the post overlooks an important first step: lexical analysis, the process of identifying lexemes (also called tokens) in the input stream. While Ciprian2 mentions tokenizing, it gets handwaved in the post, which isn't really fair; lexing is a fairly involved subject by itself. Aside from defining a Token type, as C2 showed, you need a means of determining what is and isn't a valid token, and of categorizing the valid tokens. Let's revisit the Token class with a minor refinement in the form of a TokenType enumeration:

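#include <cctype>
#include <iostream>
#include <string>

enum TokenType {
     INVALID, END_OF_INPUT,   // not named EOF: that is already a macro in <cstdio>
     IDENTIFIER, RESERVED,
     INTEGER_LITERAL, 
     LPAREN, RPAREN, 
     ADDOP, MULOP, ASSIGN,
     COMMA, SEMICOLON
};

struct Token {
    int startPos;
    std::string text;
    TokenType type;
};

We'll ignore whitespace for the moment; most languages today use whitespace primarily as a separator, though there are exceptions. What you will need is a function that reads the input stream one character at a time and, based on the first character and those which follow, determines what kind of token you are reading, until it comes to the end of the token.

So, the first thing we need is a function that reads in a single character and keeps track of the position in the input stream. Fortunately, all you need for this is an input stream such as cin and a counter, which can be done like so:

int pos = 0;

// Reads one character (whitespace included) and counts it.
// At end of input it returns '\0' and leaves the stream in a failed state.
char readchar(std::istream& f)
{
    char in = '\0';
    if (f.get(in))
    {
        pos++;
    }

    return in;
}

Next we need a function which will recognize one token. We will call this repeatedly, until the end of the input.

// defined elsewhere; each receives the character that was already read
Token get_identifier(std::istream& f, char first);
Token get_num(std::istream& f, char first);

Token getToken(std::istream& f)
{
    char c = readchar(f);

    // skip the whitespace that separates tokens
    while (f && isspace(static_cast<unsigned char>(c)))
    {
        c = readchar(f);
    }

    int start = pos - 1;    // position of the character just read

    if (!f)
    {
        return Token{pos, "", END_OF_INPUT};
    }
    else if (isalpha(static_cast<unsigned char>(c)))
    {
        return get_identifier(f, c);
    }
    else if (isdigit(static_cast<unsigned char>(c)))
    {
        return get_num(f, c);
    }
    else
    {
        switch (c) 
        {
            case '(':
                return Token{start, "(", LPAREN};
            case ')':
                return Token{start, ")", RPAREN};

            // etc.

            default: 
                return Token{start, std::string(1, c), INVALID};
        }
    }
}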

As you can guess, the get_num() function just reads until it finds something other than a digit; if the character that ends the number is whitespace or another valid delimiter (such as a comma, semicolon, operator or parenthesis), it returns the token, otherwise it returns INVALID.

The get_identifier() function is a bit more complex, as there are more valid end characters, and it also has to test to see if the given identifier (once it is read in completely) is a reserved word or not. This function is left as an exercise for the reader. :-p

Edited 2 Years Ago by Schol-R-LEA

@shyamdadhich

Please start a new thread for your question.

Best to ask C programming questions in the C forum.
