Formatting strings

Question

Uni 0 Newbie Poster

13 Years Ago

I am supposed to read a text file containing "lyrics", then format it based on special symbols such as +, -, [] and spaces.
Here's the lyrics syntax:

Spaces are significant: a single space between two words puts the two words on two successive beats. Each additional space causes the lyric to skip an extra beat.

input = "  Mary  had a little lamb.";
output = "", "", "Mary", "", "had", "a", "little", "lamb."

-: forces a single word to be split across multiple beats

input = "a b- c d";
output = "a", "b-", "", "c", "d"

I can't figure out how to handle the spaces and minus signs. What I have done so far is remove the strings enclosed in braces, break the string down into tokens, and treat the first plus sign as a space. Can somebody out there help me come up with an algorithm for the spaces and minus signs? Please don't be mean, I'm just a newbie :-/

algorithm c

Edited 13 Years Ago by Narue because: Added code tags to preserve spacing in the example strings

Narue 5,707 Bad Cop

13 Years Ago

Without knowing the format, all I can suggest is breaking the string down into tokens where each token represents a word, a space, or a special character. Once you have a list of those tokens, you can parse them more easily according to the format's rules:

#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOKEN_LENGTH 50
#define MAX_TOKENS       1024

typedef enum token_type {
    WORD,
    SPLIT_BEAT,
    LONG_BEAT
} token_type;

typedef struct token {
    token_type  type;
    char       *value;
} token;

char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    
    if (copy != NULL)
        strcpy(copy, s);
        
    return copy;
}

bool is_word(char ch)
{
    /* Cast first: passing a negative char to isspace() is undefined */
    return !isspace((unsigned char)ch) && ch != '-';
}

int main(void)
{
    const char *src = "a b- c d";
    
    token token_list[MAX_TOKENS];
    size_t n = 0;
    
    for (size_t i = 0; src[i] != '\0' && n < MAX_TOKENS; i++) {
        token_list[n].value = NULL; /* Only WORD tokens carry a value */

        if (isspace((unsigned char)src[i]))
            token_list[n].type = SPLIT_BEAT;
        else if (src[i] == '-')
            token_list[n].type = LONG_BEAT;
        else {
            char word[MAX_TOKEN_LENGTH];
            size_t k = 0;
            
            /* Populate a full word without overrunning the buffer */
            while (src[i] != '\0' && is_word(src[i]) && k < MAX_TOKEN_LENGTH - 1)
                word[k++] = src[i++];
            
            word[k] = '\0';
            --i; /* Fix the index for the next iteration */
            
            /* Eschew error checking for brevity */
            token_list[n].type = WORD;
            token_list[n].value = copy_string(word);
        }
        
        ++n;
    }
    
    const char *type_names[] = {
        "WORD",
        "SPLIT_BEAT",
        "LONG_BEAT"
    };
    
    for (size_t i = 0; i < n; i++) {
        printf("%s", type_names[token_list[i].type]);
        
        if (token_list[i].type != WORD)
            putchar('\n');
        else {
            printf("\t-- %s\n", token_list[i].value);
            
            /* Release allocated memory when you're done */
            free(token_list[i].value);
        }
    }
    
    return 0;
}

Uni 0 Newbie Poster · Answer 1 · 2011-10-12T18:56:45+00:00

When I tried to compile the code, I got an error saying that for loop initial declarations are only allowed in C99 mode. What does that mean?

Narue 5,707 Bad Cop

13 Years Ago

It means your compiler supports the latest C standard (which I used in my code). Turn on C99 mode to enable those features.

Uni 0 Newbie Poster · Answer 2 · 2011-10-12T19:34:38+00:00

How can I switch to the C99 mode?

Narue 5,707 Bad Cop

13 Years Ago

Read your compiler's documentation.
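
For readers who hit the same error: it is triggered by declaring the loop counter inside the for statement, which C99 allows and C89/C90 does not. The wording matches GCC's diagnostic; assuming that is the compiler in use, adding -std=c99 to the command line enables the feature. The other option is to move the declaration out of the loop header. A minimal sketch of that style, where count_spaces is a hypothetical helper and not part of the program above:

```c
#include <stddef.h>

/* Counts the spaces in s. The counter i is declared at the top of the
   block rather than inside the for statement, so this also compiles
   in C89/C90 mode. */
size_t count_spaces(const char *s)
{
    size_t i;
    size_t count = 0;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == ' ')
            ++count;
    }

    return count;
}
```

The same change applied to both for loops in the tokenizer would remove that particular error, although the program also relies on other C99 features such as <stdbool.h> and declarations placed after statements.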

Uni 0 Newbie Poster · Answer 3 · 2011-10-12T21:18:19+00:00

Once you have a list of those tokens, you can parse them more easily according to the format's rules:

Got to run the code finally and it outputs:

WORD   -- a
SPLIT_BEAT
WORD   -- b
LONG_BEAT
SPLIT_BEAT
WORD   -- c
SPLIT_BEAT
WORD   -- d

I never knew how to use that kind of C; the libraries are quite unfamiliar, but thanks for the idea :) The part I can't come up with a solution for is the spaces.
I can determine the positions of the spaces, but if I use strtok with a space as the delimiter, the function treats a run of several spaces the same as a single space. What I'm supposed to do, when there is more than one space between words in a line, is "make the string tokens skip beats, as opposed to just make them go on successive beats", because this is part of a bigger problem I've sunk into. So what I've thought of for now is to enclose those "extra" spaces in curly braces for later modification, and I still need to work out how to do that. =)

Narue 5,707 Bad Cop Team Colleague · Answer 4 · 2011-10-12T22:32:16+00:00

I can determine the position of the spaces but if i use strtok with the spaces as delimiter, the function doesn't care whether there are multiple spaces in between characters as long as they are spaces.

strtok() won't give you sufficient control for this task. Is there something wrong with the tokenizing loop from my program? It doesn't recognize '+' or square brackets (you didn't mention what they mean), but those are easily added. Multiple spaces will be recognized as multiple SPLIT_BEAT tokens, which can then be parsed into whatever result you want.
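
One possible way to turn such a token list into beats is sketched below. It is not taken from the thread. It assumes that the first SPLIT_BEAT after a word only separates it from the next word, that every other SPLIT_BEAT becomes an empty beat, and that a LONG_BEAT stays attached to the preceding word and adds one empty beat. The last rule is inferred from the single "a b- c d" example, so treat it as a guess. The names tokens_to_beats and MAX_BEAT_LENGTH are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_BEAT_LENGTH 52 /* hypothetical limit, not from the thread */

typedef enum token_type { WORD, SPLIT_BEAT, LONG_BEAT } token_type;

typedef struct token {
    token_type  type;
    const char *value; /* only meaningful for WORD tokens */
} token;

/* Converts tokens into beats and returns the number of beats written. */
size_t tokens_to_beats(const token *tokens, size_t n_tokens,
                       char beats[][MAX_BEAT_LENGTH], size_t max_beats)
{
    size_t n = 0;
    int separator_pending = 0; /* is the next space just a word separator? */

    for (size_t i = 0; i < n_tokens && n < max_beats; i++) {
        switch (tokens[i].type) {
        case WORD:
            snprintf(beats[n++], MAX_BEAT_LENGTH, "%s", tokens[i].value);
            separator_pending = 1;
            break;
        case LONG_BEAT:
            /* Assumption: keep the '-' on the preceding word... */
            if (i > 0 && tokens[i - 1].type == WORD) {
                size_t len = strlen(beats[n - 1]);
                if (len + 1 < MAX_BEAT_LENGTH) {
                    beats[n - 1][len] = '-';
                    beats[n - 1][len + 1] = '\0';
                }
            }
            /* ...and hold it for one extra, empty beat */
            beats[n++][0] = '\0';
            separator_pending = 1;
            break;
        case SPLIT_BEAT:
            if (separator_pending)
                separator_pending = 0; /* first space: separator only */
            else
                beats[n++][0] = '\0';  /* extra space: skipped beat */
            break;
        }
    }

    return n;
}
```

Fed the eight tokens produced for "a b- c d", this yields the five beats "a", "b-", "", "c", "d". Leading spaces have no word before them, so each one becomes an empty beat, which matches the "Mary" example.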

cse.avinash -1 Junior Poster · Answer 5 · 2011-10-12T22:39:41+00:00

For

input = "  Mary  had a little lamb.";
output = "", "", "Mary", "", "had", "a", "little", "lamb."

and

input = "a b- c d";
output = "a", "b-", "", "c", "d"

the pattern I notice is:
1) There is a double quote at the beginning and end of the output.

2) For every space found in the string, there is a double quote before and after it, and the entries are separated by commas.

Uni: I think this will help you develop the logic. Write the code, and if you still have problems, post it and we will surely help you.

Uni 0 Newbie Poster · Answer 6 · 2011-10-12T22:46:22+00:00

No, there's nothing wrong with the loop. The syntax is just hard for a newbie to understand; the program works well. As for the plus signs and braces:

Braces: anything found in braces is ignored and does not appear in the final output (including the braces).

+: join two strings together

input = "w a+ b c";
output = "w", "a ", "b", "c"

input = "w+ +a b c";
output = "w ", " a", "b", "c"

input = "b ++c";
output = "b", "+c"

I'm just having a hard time figuring out how to add this to your program the way you handled the spaces. Your code is great, in fact; it perfectly recognizes the extra spaces :)
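
The brace and plus rules above could be handled with two small in-place passes, sketched here under stated assumptions. The examples suggest that a single '+' turns into a space and that "++" stands for a literal '+'; the second rule rests on one example only. The question mentions both [] and braces, so the delimiters are passed as parameters, and nesting is assumed not to occur. Both function names are hypothetical.

```c
#include <stddef.h>

/* Removes every open...close span from s in place, delimiters included.
   Assumption: spans do not nest; an unclosed span runs to the end. */
void strip_enclosed(char *s, char open, char close)
{
    size_t out = 0;
    int inside = 0;

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (inside) {
            if (s[i] == close)
                inside = 0;
        } else if (s[i] == open) {
            inside = 1;
        } else {
            s[out++] = s[i];
        }
    }

    s[out] = '\0';
}

/* Rewrites plus signs in a single word, in place.
   Assumption: "++" is a literal '+', and any other '+' becomes a space. */
void apply_plus(char *word)
{
    size_t out = 0;

    for (size_t i = 0; word[i] != '\0'; i++) {
        if (word[i] == '+' && word[i + 1] == '+') {
            word[out++] = '+';
            ++i; /* consume the second '+' of the pair */
        } else if (word[i] == '+') {
            word[out++] = ' ';
        } else {
            word[out++] = word[i];
        }
    }

    word[out] = '\0';
}
```

Running apply_plus on each word after the line has been split into beats keeps the generated spaces from being mistaken for beat separators. The thread does not say what should happen to the spaces on either side of a removed span, so strip_enclosed leaves them untouched.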
