Hi,
here is my problem: I'm writing a small program that needs to read a text file and write the words of this text as a list.
I could only come up with the little piece of code I wrote below and it's not even working. I believe the problem is on line 13 but I'm not so sure.
I would appreciate it if you could tell me where the problem is and point me in the right direction.
Also, if you know a more cunning way to do what I intend please do tell me.

Thanks,

#include<iostream>
#include<fstream>
using namespace std;

int main()
{
   ifstream iDic("text");
   ofstream oDic("rawDic");
   char ch;
   while (iDic.get(ch))
   {
      ch= tolower(ch);
      if (ch< 'a' || ch> 'z')
            iDic.putback('\n');
      oDic << ch;
   }
   iDic.close();
   oDic.close();
   return 0;
}

Line 14 - don't try to put back '\n'. What assumptions can you make about your input file? What is a "word"? Something with only letters? What do you do with non-"words"? You need to decide what your input file looks like, what assumptions you can make about "good" data, and what to do with "bad" data. My guess is that you want to do something along these lines, but I won't know until you answer the questions about the data.

string word;
while (iDic >> word)
{
    // code here?
    oDic << word << endl;
    // code here?
}

Make sure to put this at the top:

#include <string>

Thanks VernonDozier,
The idea was to evaluate every character of a text file (say, the text of a newspaper article) and separate the words. Valid words would be strings of chars from 'a' to 'z' only.
Whenever a space is encountered I'd start a new line. In fact, if I modify the while loop and use the following code instead:

while (iDic.get (ch))
{
    ch= tolower(ch);
    if (ch == ' ')
        iDic.putback('\n');
    oDictionary<< ch;
}

the programme does pretty much what I want, but I still don't know what to do with the commas, etc. (that is, everything with an ASCII code outside the a-z range).

Well, you got rid of your 'a' through 'z' comparison. Again, I wouldn't use putback. Instead of putting something back, reading it in again, then outputting it, just go straight to outputting it. You can also use isalpha (declared in <cctype>) to check whether a character is a letter.

http://www.cplusplus.com/reference/clibrary/cctype/isalpha.html

isspace and ispunct may be helpful too in separating words:

http://www.cplusplus.com/reference/clibrary/cctype/isspace.html
http://www.cplusplus.com/reference/clibrary/cctype/ispunct.html
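
As a quick aside, here is a small sketch of how these three functions divide up typical characters in the default "C" locale. The classify function and its category names are invented for this illustration only.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: name the <cctype> category a character falls into.
std::string classify(char ch)
{
    // The <cctype> functions expect a value representable as unsigned char.
    const unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return "letter";
    if (std::isspace(c)) return "space";        // ' ', '\n', '\t', ...
    if (std::ispunct(c)) return "punctuation";  // ',', '.', '\'', ...
    return "other";                             // digits, control characters, ...
}
```

Note that digits are none of the three, so a test like "not a letter" and a test like "space or punctuation" will treat "42" differently.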

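while (iDic.get (ch))
{
    // Cast first: passing a negative char to isalpha/tolower is undefined.
    unsigned char uch = static_cast<unsigned char>(ch);
    if (isalpha (uch))
        ch = tolower(uch);
    else
        ch = '\n';   // any non-letter ends the current word

    oDictionary << ch;
}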
Comments
Great answer, just what I needed. Thanks.