I want to read a source code file, separate it into lexemes (words), and keep track of the line number of each word.
So far I have only come up with this rough idea:
i) a struct holding a (string, int) pair;
ii) a linked list of that struct.

for example

#include <string>
#include <vector>

class Container {
public:
    struct Words {
        std::string word;
        int lineNo;
    };

    // Returns the line number that corresponds to the given word.
    int& operator[](std::string const& Value);

private:
    std::vector<Words> vWord;
};

Please tell me whether there is a better solution than this.

What happens if the same word appears on a single line more than once?

You could try map<string,vector<int> > to associate a word with a list of line numbers.
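
A minimal sketch of that suggestion, assuming a word is anything separated by whitespace (the thread has not settled on a definition yet). The function name indexWords is made up for illustration. A word that repeats on one line is stored once for that line; push the line number unconditionally if every occurrence should be kept.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Maps each whitespace-separated word to the line numbers it occurs on.
// A word that appears twice on one line is recorded once for that line.
std::map<std::string, std::vector<int>> indexWords(std::istream& in)
{
    std::map<std::string, std::vector<int>> index;
    std::string line, word;
    int lineNo = 0;

    while (std::getline(in, line)) {
        ++lineNo;
        std::istringstream words(line);
        while (words >> word) {
            std::vector<int>& lines = index[word];
            // Lines are visited in order, so a repeat can only be the last entry.
            if (lines.empty() || lines.back() != lineNo)
                lines.push_back(lineNo);
        }
    }
    return index;
}
```

Looking up a word is then a single index.find(word) call, and the vector it yields lists the lines in ascending order.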

What? Explain better.

I am reading a source program and separating it into words. I need a structure that holds each word together with its line number.

for example

int a;
int main() 
{
       return 0;
}

I need a data structure that will let me store information about the words, say:
Word   Line number
int    1
a      1
int    2
main   2
(      2
etc.

BTW, how do you propose to parse the words in the program?

For example:

int main()

How will you know that main has ended and that you now have to look for ( and then )? I hope you see what I am trying to say.

Just curious. I wanted to know which technique you are using.
Bye.

Check this:

#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <map>

using namespace std;

struct Infos
{
    int line_num;
    string word;
    int words_num;
};
int main ()
{

    map <string, int> tmp_word_num;
    int line_count = 0;
    Infos info;
    string str, word;
    vector <Infos> v;
    map <string, int> :: iterator it;
    vector <Infos> :: iterator it_v;

    while (getline(cin, str))
    {
        stringstream sstr (str);
        ++line_count;
        while (sstr >> word)
        {
            tmp_word_num[word]++;
        }
        
        for (it = tmp_word_num.begin(); it != tmp_word_num.end(); ++it)
        {
            info.line_num = line_count;
            info.word = it->first;
            info.words_num = it->second;
            v.push_back(info);
        }
        tmp_word_num.clear(); // start the next line with empty counts
    }
    for (it_v = v.begin(); it_v != v.end(); ++it_v)
    {
        cout << "Line: " << it_v->line_num << " word: " << it_v->word << ":" << it_v->words_num << endl;
        
    }
    
    return 0;
}

Maybe you'll find it helpful.

First of all, you have to define what the hell a word is. For instance, in your example you have treated a single open parenthesis as a word:

(                 2

So define what you mean by 'word' first.

iamthwee, sorry for the incomplete description. Basically I am developing a pseudo-compiler, so I need every word (i.e. every lexeme).

Micko, your code is almost the same as what I was thinking of; maybe our minds are alike :d ... just kidding.

But the problem is that you are using getline(any_string_stream, string_Buffer), which performs an I/O operation for every line. Instead we could read the entire stream into a string and then process it, which might be faster. Do you agree?
The data structures you and I are using are the same, though :d ... map and vector :).

By the way, thanks.
I'll post my code for this tomorrow, but before that I want to write the optimum code.
I know it can be done better :) with your help and discussion.

I wrote this code earlier, when I needed to gather word statistics for a text. By "word" I meant everything that is separated by whitespace. Well, you could use the rdbuf() member function of ifstream to read the entire file if you want. I posted my code to help you out; it's not supposed to be a complete solution.
Cheers
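
For reference, the rdbuf() approach could look roughly like this. The helper name readAll is made up for the example; it works on any std::istream, so an ifstream opened on the source file can be passed in directly. Whether it is actually faster than calling getline per line depends on the platform and file size, so it is worth measuring.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads everything remaining in the stream into one string.
std::string readAll(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copies the whole stream buffer in one call
    return buffer.str();
}
```

The resulting string still contains the '\n' characters, so line numbers can be recovered by counting them while scanning.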
