Hello there,

I am trying to write a C program for a friend of mine. The program has to read HTML input from the user (typed directly or read from a file) and check whether the HTML tags are in the correct order.

For example, if I entered

<HTML></BODY>Whatever text<BODY></HTML>

The program should give me an error message saying that the <BODY> tag is out of order.


Should I use a stack, or can I use something else instead?


Thanks for the nice place.:cool:

A stack makes sense. That way you'd be able to catch problems like This is a <I><B>mismatched tag pair</I></B> too.

So do you think I should push and pop whole strings on the stack, or work character by character?

Push/pop HTML tags, not characters (perhaps)

You're matching <BODY></BODY>, not <BODY></YDOB>

That makes sense, Salem. Thanks.

I might ask more questions during testing :)

if "strtok( )" is not already a good friend of yours, you should become acquainted soon :P

And what if string.h is not allowed to be used !!!!

>And what if string.h is not allowed to be used !!!!
Most of the functions in string.h are simple to implement. If you can't use the standard library, it's not a great effort to roll your own string handling functions.


That's another good idea. In addition to that, I was thinking of using an array of structures, each of which has an array member: for loops to read the string and store each tag in an array, then a stack to store those tags (arrays).

Hi Again,


I have written this simple 2-D array to implement the stack. So far the code reads a string and stores its HTML tags in a 2-D array using a function called "push", and shows them using a function called "writetags".
Then I tried to add a filter that copies the closing tags (e.g. </HTML>) from the original array into a new array called "bracket_off".
However, the filter takes only one character from the closing tag, the "/" tested by the if statement, and continues without adding the rest of the tag to that array. Can someone please check this code and help with this? Thanks in advance :)

#include <stdio.h>
#define B_OPEN '<'
#define B_CLOSE '>'
#define BRACKET_MAX 10
#define B_CHAR_MAX 7

char bracket[BRACKET_MAX][B_CHAR_MAX];
char bracket_off [BRACKET_MAX][B_CHAR_MAX];//new
int diagnosis;
signed char bracket_nr;
signed char bracket_nr_off;

int push (void);
int writetags (void);
//int off_bracket (void);//new
int main (void){
    extern int diagnosis;
    extern signed char bracket_nr;
     extern signed char off_bracket_nr;//new
    char c;
    bracket_nr=-1;
    diagnosis=0;
    while((c=getchar())!=EOF && c!='\n'){
                             if(c==B_OPEN){
                                           diagnosis=push();
                                           };
                                           switch (diagnosis){
                                                  case -3:
                                                       printf("infinate bracket.\n");
                                                       break;
                                                  case -5:
                                                       printf("Too Large Bracket.\n");
                                                       break;
                                                  case -7:
                                                       printf("Too many bracket.\n");
                                                       break;
                                           
                                                               };
                                           if (diagnosis < 0)
                                           break;
                                           
                             
                             };
                             if(diagnosis==0){
                                              printf("No brackets.\n");
                             
                             }else if (diagnosis > 0){
                                   printf("You have written %d brackets; they are:\n",diagnosis);
                                   writetags();
 
                                   if (diagnosis%2==0){printf("You have entered correct number of brackets in the HTML statement\n");}
                                   else if (diagnosis%2!=0){printf("You have entered incorrect number of brackets in the HTML statement\n");}
                                   }else{
                                         printf("Text contains errors.\n");
                                   
                                   };
                                   system ("PAUSE");
                                   return diagnosis;
                                                               
                           };

/**********************PUSH FUNCTION*****************************/
 int push (void){
       extern char bracket [BRACKET_MAX][B_CHAR_MAX];
       extern int diagnosis;
       extern signed char bracket_nr;
       extern signed char bracket_nr_off;
        extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];//new
      char temp, c, bracket_char, bracket_char_off;
      
                                
      c=getchar();
      bracket_nr++;
      if(bracket_nr>=BRACKET_MAX)
                 return (diagnosis = -7); //to many tags
      diagnosis=bracket_nr+1;
            for(bracket_char=0;c!=EOF && c!='\n' && c!=B_CLOSE && bracket_char<B_CHAR_MAX;bracket_char++, c=getchar()){
                    
                    
               
               bracket[bracket_nr][bracket_char]=c;                 //else, record tag name into the stack bracket
               
                              
                              
            };  //This line    
            
            
            if(bracket_char=='/'){
        while (bracket_char!='\n'){
        
                
                                                                 
               bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char] ;
               bracket_char++;    
               bracket_char_off++;
              
               
               }; 
         //};     
         };                           
       
         bracket[bracket_nr][bracket_char]='\0';
        
        
        if(bracket_nr=='/'){
        while (bracket_char!='\n'){
        
                
                                                                  
               bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char];
               bracket_char++;    
               bracket_char_off++;
              
               
               }; 
         //};     
         };
                 
        
      
               
   bracket_off [bracket_nr_off][bracket_char_off]='\0';//new
       if(c==B_CLOSE){
           return diagnosis; 
                                                           
          }else if (bracket_char>=B_CHAR_MAX){
             return(diagnosis=-5);
                                                           
              }else if (c==EOF || c=='\n')
                  return (diagnosis=-3);
                         return diagnosis;
                             
 //};
 


/**********************WRITETAGS FUNCTION*****************************/
 };
int writetags (void) {
extern char bracket[BRACKET_MAX][B_CHAR_MAX];
extern int diagnosis;
extern signed char bracket_nr;
extern signed char bracket_nr_off;
extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];//new
char bracket_char, bracket_char_off;



for(;bracket_nr>=0; bracket_nr--){
    
     for(bracket_char=0; bracket[bracket_nr][bracket_char]!='\0'; bracket_char++){
        
         putchar(bracket[bracket_nr][bracket_char]);

         
      };
      
  putchar('\n');
  
};
 printf("The off tags in your HTML are:\n");//new
for(;bracket_nr_off>=0; bracket_nr_off--){//new
     for(bracket_char_off=0; bracket_off[bracket_nr_off][bracket_char_off]!='\0'; bracket_char_off++){//this one
         putchar(bracket_off[bracket_nr_off][bracket_char_off]);//new
      };
      
   putchar('\n');
  
                 };


            
};

Maybe I didn't explain what I want exactly :P

The code above takes any string entered from the keyboard, for example this string:
<HTML><BODY>whatever</BODY></HTML>

And stacks all the tags as follows:
HTML
BODY
/BODY
/HTML


Then I tried to add a filter made of for and while loops that copies only the closing tags, which in the example are:
/BODY
/HTML


So far the function is not working as it is supposed to: it only takes the first character (the /) and continues.


What am I supposed to do to make it work, please?


P.S. Here is the new filter, but it doesn't seem to be working:

for (bracket_nr_off=0;bracket_nr>=0; bracket_nr--, bracket_nr_off++){
                
                    if(bracket_char=='/')
                       while (bracket_char!=B_CLOSE){
                             for (bracket_char=0; bracket_char<B_CHAR_MAX; bracket_char++){
                              bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char];
                              };
                              };
                              };

The poor indentation is putting me off from working out what else might be wrong.

Does your program work with say
<html></html>
It's the most basic of tests, either it passes or fails.

Hi Salem,

Sorry for the indentation, I was in a hurry. Here is the code again, but more readable :) Before that, to answer your question: my program shows the following as a result of entering <html></html>:

you have written 2 brackets; and they are:

html
/html
The off brackets of the program are:

The number of brackets in your html is correct.

#include <stdio.h>
#define B_OPEN '<'
#define B_CLOSE '>'
#define BRACKET_MAX 10
#define B_CHAR_MAX 7

char bracket[BRACKET_MAX][B_CHAR_MAX];
char bracket_off [BRACKET_MAX][B_CHAR_MAX];//new
int diagnosis;
signed char bracket_nr;
signed char bracket_nr_off;


/****************Function Prototypes*****************/
int push (void);
int writetags (void);




/****************Main Function********************/
int main (void){
    extern int diagnosis;
    extern signed char bracket_nr;
     extern signed char off_bracket_nr;
    char c;
    bracket_nr=-1;
    diagnosis=0;
    while((c=getchar())!=EOF && c!='\n'){
         if(c==B_OPEN){
         diagnosis=push();
         };
          switch (diagnosis){
             case -3:
              printf("infinate bracket.\n");
               break;
             case -5:
               printf("Too Large Bracket.\n");
               break;
             case -7:
               printf("Too many bracket.\n");
               break;
                  
         };
            if (diagnosis < 0)
             break;
             
         };
         if(diagnosis==0){
           printf("No brackets.\n");
              
          }else if (diagnosis > 0){
           printf("You have written %d brackets; they are:\n",diagnosis);
           writetags();
 
          if (diagnosis%2==0){printf("You have entered correct number of brackets in the HTML statement\n");}
            else if (diagnosis%2!=0){printf("You have entered incorrect number of brackets in the HTML statement\n");}
         }else{
         printf("Text contains errors.\n");
               
          };
             system ("PAUSE");
             return diagnosis;                
            };


/**********************PUSH FUNCTION*****************************/
 int push (void){
       extern char bracket [BRACKET_MAX][B_CHAR_MAX];
       extern int diagnosis;
       extern signed char bracket_nr;
       extern signed char bracket_nr_off;
       extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];
       char temp, c, bracket_char, bracket_char_off;
      
            
      c=getchar();
      bracket_nr++;
      if(bracket_nr>=BRACKET_MAX)
       return (diagnosis = -7); //to many tags
      diagnosis=bracket_nr+1;
       for(bracket_char=0;c!=EOF && c!='\n' && c!=B_CLOSE && bracket_char<B_CHAR_MAX;bracket_char++, c=getchar()){
          bracket[bracket_nr][bracket_char]=c;               
       };      
     
       if(bracket_char=='/'){
          while (bracket_char!='\n'){                 
      bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char] ;
          bracket_char++;    
          bracket_char_off++;
          }; 
    };            
    bracket[bracket_nr][bracket_char]='\0';
     
   bracket_off [bracket_nr_off][bracket_char_off]='\0';
       if(c==B_CLOSE){
      return diagnosis; 
                        
     }else if (bracket_char>=B_CHAR_MAX){
        return(diagnosis=-5);
                        
         }else if (c==EOF || c=='\n')
        return (diagnosis=-3);
          return diagnosis;
 };
 
 
 
 
 
/**********************WRITETAGS FUNCTION*****************************/

int writetags (void) {
extern char bracket[BRACKET_MAX][B_CHAR_MAX];
extern int diagnosis;
extern signed char bracket_nr;
extern signed char bracket_nr_off;
extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];
char bracket_char, bracket_char_off;



for(;bracket_nr>=0; bracket_nr--){
    
     for(bracket_char=0; bracket[bracket_nr][bracket_char]!='\0'; bracket_char++){
   
    putchar(bracket[bracket_nr][bracket_char]); 
      }; 
  putchar('\n');
};


printf("The off tags in your HTML are:\n");
for(;bracket_nr_off>=0; bracket_nr_off--){
     for(bracket_char_off=0; bracket_off[bracket_nr_off][bracket_char_off]!='\0'; bracket_char_off++){
    putchar(bracket_off[bracket_nr_off][bracket_char_off]);
      };
      
   putchar('\n');
  };   
};

yeah, i hate to be a whiny biatch, but i cant really read your code with all the indentations either.

im at work right now, so i dont have time to sort through it by inspection, if i can't easily read it.. i probably will be too tired/busy to get on it when i come home.

its not your fault, really, your TABS probably look just fine on your end. but here, each TAB converts into 8 spaces... perhaps you notice all the lines of code wrapping?

i get around this by using my editor's "Replace All" function to replace every TAB character with 3 spaces before posting it here.

now see, all this time i spent whining about formatting, i could have been inspecting your code! :P


EDIT:

i just peeked at your code some more, and i gotta say you got WAY too much stuff going on inside your FOR statements. put that stuff inside the loop. you're asking for logical and control errors by trying to crush 5 commands into just the FOR statement alone.

and in addition to converting TABS to spaces, you need to line up your indentions better. the sloppiness adds to the unreadability.

not trying to be a bish about it. but i just dont have time to edit your code for formatting.


.


naaaah you're not whining at all :P


Code edited and waiting for you to have a better look into it :D

why do you have a semicolon ';' after all your end brackets '}'

these do not belong. in some cases it wont hurt, but in other cases it will seriously F- up your logic.

this right here may be the source of most if not all your problem


I've tried without the semicolons, but it didn't change anything. Same results, actually.

okay dude.

i just spent the time cleaning up your code, and taking out all those spurious semicolons. you still probably have some problems, but at least we can find them now.

some of what i did was preference (like whether the opening bracket '{' of a block is directly after the statement or on its own line)

but most of what i did is foundational stuff for readability. writing readable code is one of the most important things you can do. the key is "maintainability"

because even if your code is wrong, at least someone can work with it. otherwise, it will just get thrown away.

now check it out:

#include <stdio.h>
#define B_OPEN '<'
#define B_CLOSE '>'
#define BRACKET_MAX 10
#define B_CHAR_MAX 7

char bracket[BRACKET_MAX][B_CHAR_MAX];
char bracket_off [BRACKET_MAX][B_CHAR_MAX]; //new
int diagnosis;
signed char bracket_nr;
signed char bracket_nr_off;


/****************Function Prototypes*****************/
int push (void);
int writetags (void);




/****************Main Function********************/
int main (void)
{
    extern int diagnosis;
    extern signed char bracket_nr;
    extern signed char off_bracket_nr;
    char c;
    
    bracket_nr=-1;
    diagnosis=0;
    
    while((c=getchar())!=EOF && c!='\n') 
    {         
        if(c==B_OPEN) 
            diagnosis=push();    
         
        switch (diagnosis)
        {
            case -3:
                 printf("infinate bracket.\n");
                 break;
            case -5:
                printf("Too Large Bracket.\n");
                break;
            case -7:
                printf("Too many bracket.\n");
                break;              
        }
            
        if (diagnosis < 0)
            break;     
    }
    
    if (diagnosis==0)
        printf("No brackets.\n");

    else if (diagnosis > 0)
    {
        printf("You have written %d brackets; they are:\n",diagnosis);
        writetags();

        if (diagnosis%2==0)
            printf("You have entered correct number of brackets in the HTML statement\n");

        else if (diagnosis%2!=0)
            printf("You have entered incorrect number of brackets in the HTML statement\n");
    }
    else
        printf("Text contains errors.\n");       

    system ("PAUSE");
    return diagnosis;                
}


/**********************PUSH FUNCTION*****************************/
int push (void)
{
    extern char bracket [BRACKET_MAX][B_CHAR_MAX];
    extern int diagnosis;
    extern signed char bracket_nr;
    extern signed char bracket_nr_off;
    extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];
    char temp, c, bracket_char, bracket_char_off;
      
    c=getchar();
    bracket_nr++;

    if(bracket_nr>=BRACKET_MAX)
        return (diagnosis = -7); //to many tags

    diagnosis=bracket_nr+1;

    for(bracket_char=0;c!=EOF && c!='\n' && c!=B_CLOSE && bracket_char<B_CHAR_MAX;bracket_char++, c=getchar())
    {
        
        bracket[bracket_nr][bracket_char]=c;               

        if(bracket_char=='/')
        {
            while (bracket_char!='\n')
            {
                bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char] ;
                bracket_char++;
                bracket_char_off++;
            }
        }
    }
    bracket[bracket_nr][bracket_char]='\0';
    bracket_off [bracket_nr_off][bracket_char_off]='\0';

    if(c==B_CLOSE)
        return diagnosis; 

    else if (bracket_char>=B_CHAR_MAX)
        return(diagnosis=-5);

    else if (c==EOF || c=='\n')
        return (diagnosis=-3);
        
    return diagnosis;
}

 
 
 
/**********************WRITETAGS FUNCTION*****************************/

int writetags (void) 
{
    extern char bracket[BRACKET_MAX][B_CHAR_MAX];
    extern int diagnosis;
    extern signed char bracket_nr;
    extern signed char bracket_nr_off;
    extern char bracket_off [BRACKET_MAX][B_CHAR_MAX];
    char bracket_char, bracket_char_off;
    
    for(;bracket_nr>=0; bracket_nr--)
    {
        for(bracket_char=0; bracket[bracket_nr][bracket_char]!='\0'; bracket_char++)            
            putchar(bracket[bracket_nr][bracket_char]); 

        putchar('\n');
    }
    
    printf("The off tags in your HTML are:\n");

    for(;bracket_nr_off>=0; bracket_nr_off--)
    {
        for(bracket_char_off=0; bracket_off[bracket_nr_off][bracket_char_off]!='\0'; bracket_char_off++)
            putchar(bracket_off[bracket_nr_off][bracket_char_off]);
        
        putchar('\n');
    }   
}

another problem, IMO, is all that stuff you got slammed into the FOR statement, esp. the one in "push( )".

maybe it's correct and working... but damned if it doesn't make it hard to read. i don't even really want to try and debug it.

consider breaking it out into the body of your FOR loop.

people will not pay you for clever, obfuscated code. people will pay you for readable, understandable code that is easy for others to maintain.

I think I missed something in for loop, and I've just edited the "cleaned up" code i posted in #21, above.

I admit i have neither compiled nor run either your original code or my cleaned up code. so, don't assume anything.

but Ive got to go now. im supposed to be doing work at my job

:P

hopefully someone else will come along and pick up where ive left off.

ill stop by later tonight (US PDT)


loool well,


Just to narrow things down and reduce the scope of the problem for you: in the push function, there are these lines:

if(bracket_char=='/')
    {
        while (bracket_char!='\n')
        {                 
            bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char] ;
            bracket_char++;    
            bracket_char_off++;
        }
    }

And in another version of the program, I have changed that to:

for (;bracket_nr>=0; bracket_nr--, bracket_nr_off++){
                
                    if(bracket_char=='/')
                       while (bracket_char!=B_CLOSE){
                             for (bracket_char=0; bracket_char<B_CHAR_MAX; bracket_char++){
                              bracket_off[bracket_nr_off][bracket_char_off]=bracket[bracket_nr][bracket_char];
         }
    }
}

The goal of this block is to copy the closing tags (e.g. /HTML) out of the original array, which contains all the tags. But neither of the blocks above works :(

yeah, sorry im so damn picky, but see i just spent ~30 minutes getting to the point where i could read your code.

i havent even started to debug it.

I'm by no means a software guru. perhaps others here can cipher out the intent of your code and its errors just by looking at it.

but if i cant read it, i just come to a mental stop. maybe its an emotional hangup, i dunno. but it is what it is.

good luck

Thanks for the good luck wishes, though it's not enough lol

I hope someone will just copy my code, paste it into a compiler, and check why it's not storing the HTML closing tags in the new array as it is supposed to!

man, look at this FOR statement for(bracket_char=0;c!=EOF && c!='\n' && c!=B_CLOSE && bracket_char<B_CHAR_MAX;bracket_char++, c=getchar()) arrrgh.

must. resist. stabbing. eyes. with. pencil.

sorry man, i can't debug this for you. and by "can't", I mean "won't" ... i mean, your code is full of this type of convoluted constructions. for me to fix your code, would require me to think in this obfuscated manner

I'm going to assume you inherited this mess from your friend whom you are trying to help with this assignment. Your job now is to be a good friend and tell him to start this whole thing over from scratch.

SIMPLY PUT, here is probably the most basic method of how you could (and should, IMO) do it:

get your input all at once and store it before attempting to work on it.

search input for opening and closing brackets to find your tags.

if the tag is an opening tag (no slash '/' ), PUSH the tag onto a single stack

if the tag is a closing tag (has a leading slash '/' ), POP a tag from the stack and compare it to the closing tag just read. the POP'ed tag should be the opening mate of the closing tag, so it should match (minus the slash) ... if they don't, you have a mismatched tag error.

if all tags have matched and you've reached the end of the input, your stack should be empty ... if it's not empty, then you are missing closing tag(s)

make certain you do not "underflow" your stack. this will occur if you had extra closing tags that did not have a matching opening tag, and you tried to POP the stack anyhow. keep track of the start of the array to not allow an underflow to occur.

.

Hi Jephtha,

for(bracket_char=0;c!=EOF && c!='\n' && c!=B_CLOSE && bracket_char<B_CHAR_MAX;bracket_char++, c=getchar())

This for loop is important for getting the input through the character c. Its pseudocode says:

For (current_char_of_the_destination_array; IF C is not the end of file AND We didn't press Enter yet, AND C is not '>', AND current_char_of_the_destination_array is less than 7;) then

store the current character from C and put it into the two dimensional array 'bracket[bracket_nr][bracket_char]' 

then increment the column count(bracket_char), and get a new character and store it in C.

I have done that (fully storing the string into the 2d array in form of single tags). Now I am trying to take the tags that start with forward slash and store them into new array. Subtracting the last one from the original gives the array with the tags that have no forward slashes.


I have question here, I think this could solve the problem.

When I use putchar, does it move the data out of the variable to show it on the monitor, or does it show a copy and leave the data in its variable?

for example, if I said:

char x='A';
char y;
putchar(x);
y=x;

Will Y take the value of X after using putchar on X, or could Y get NIL?

the answer to your specific question: Y will equal X, which will equal the char 'A'.

putchar does not modify the character in any way, it only displays it, and moves the stdout file pointer to the next position.

---

and yeah, i can cipher out *what* that FOR statement does well enough, what i dont understand is *why* anyone would code it this way. it's almost as if it were deliberately confusing.

if it were just an isolated occurrence, i might try and work with it. but it is part of a function that is called from within another loop in the caller that is likewise conditional upon input from stdin. (the other c=getchar() call in main)

the whole thing is a godawful mess. kill it now and start over. see my previous post on how you should do this. last night i wrote the entire program from scratch, to accept input from either stdin or a file, in about 30 minutes, although it's not fully debugged. if it takes you 5 hours to do the same, you'll be all the better for it.

But hey, you don't have to take my word for it. If you think your program is worth saving, maybe you can find someone who'll take the time to sort through it. I can't/won't. I don't have the patience.

the only way i can see is to start over and do it in a more "traditional" way with simple parsing functions, and simple push and pops to and from a simple stack.

Anyone else here, please jump in if I'm wrong.


.

Dude, it is just programming, I mean take it easy :p

So, could you simply show me, in very "simple" for loops, how to read from a string, let's say char S[] = "<hi><you>whatever</you></hi>", separate the opening tags from the closing tags of that string, and store each group in a separate array or stack?

As simple as you think !?
