splitting a string using ispunct

AdRock
  #1
Oct 26th, 2009
I am reading a line of text from a file and need to split it into tokens. For example, given the line
This is a test and test number is: test(001)
I need the tokens to be:
This
is
a
test
and
test
number
is
:
test
(
001
)
How do I split a string into tokens like this?

Here is my code so far:
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// Splits one line of text into tokens (not implemented yet)
vector<string> SplitString(string line)
{
    vector<string> tokens;
    // this is where the split needs to occur
    return tokens;
}

// Returns true if aCharacter is one of the characters in delimiters
bool ispunct(char aCharacter, string delimiters)
{
    int numDelimiters = delimiters.length();
    for (int i = 0; i < numDelimiters; i++)
    {
        if (aCharacter == delimiters[i])
            return true;
    }

    return false;
}

int main()
{
    int a = 0;              // current line number
    string line;
    vector<string> tokens;  // tokens collected from every line

    ifstream myFile("scan.cm");

    if (!myFile)
    {
        cout << "Error opening input file" << endl;
        return -1;
    }

    while (getline(myFile, line))
    {
        a++;

        vector<string> newTokens = SplitString(line);

        int numNewTokens = newTokens.size();

        for (int i = 0; i < numNewTokens; i++)
        {
            tokens.push_back(newTokens[i]);
        }

        cout << "line " << a << ": " << endl;
    }

    myFile.close();

    return 0;
}
Clinton Portis
  #2
Oct 26th, 2009
ispunct() may not be exactly what you are looking for to split the string into tokens, but I would suggest strtok():

char * strtok ( char * str, const char * delimiters );

As you can see, strtok() accepts two arguments. The first is the c-string (char array) that you want tokenized; the second is another c-string that you populate with delimiters (any characters that signify the end of a token, such as a ' ' space or a '.' period; it could be anything you want). Each call returns a pointer to the first character of the next token and overwrites the delimiter that ends that token with a null character. (In the example below we save these pointers in an array of char* pointers.)

So put strtok() in a loop and let it fly. Pass the c-string on the first call and NULL on every call after that; it will return a char* pointer to each token in the string you supply, and NULL once there are no tokens left.
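As a rough sketch of that loop on a plain char array: the helper name tokenize is made up for illustration and is not part of any library. It hands strtok() the buffer on the first call and NULL on every later call, and stops when strtok() returns NULL.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this example): splits a writable
// c-string in place and returns a copy of each token.
// Note that strtok() overwrites the delimiters in 'text' with '\0'.
std::vector<std::string> tokenize(char* text, const char* delimiters)
{
    std::vector<std::string> tokens;

    // First call: pass the buffer. Later calls: pass NULL to continue in it.
    for (char* token = std::strtok(text, delimiters);
         token != NULL;
         token = std::strtok(NULL, delimiters))
    {
        tokens.push_back(token);   // copies the characters of this token
    }
    return tokens;
}
```

For instance, given char text[] = "one two,three"; a call to tokenize(text, " ,") should produce the three tokens one, two and three. The delimiters themselves are thrown away.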

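I see that in your code you are using <string> class variables, which is fine, but remember that strtok() is looking for a c-string char array, not a <string> class object. String objects do have a member function, c_str(), that returns a c-string pointer, but that pointer is const and strtok() writes into its argument, so copy the characters into a writable char array first:

#include <cstring>
#include <iostream>
#include <string>
using namespace std;

int main()
{
     string input = "This is a sample string.";
     char delimeters[] = "/\n ";   // string literal, so it is null-terminated

     //Writable copy of the <string> so strtok() will be happy
     char *buffer = new char[input.size() + 1];
     strcpy(buffer, input.c_str());

     //Dynamic array (of 'char' pointers that will contain the address of each token)
     const int maxTokens = 80;
     char **tokens = new char*[maxTokens];

     //The first call passes the buffer; later calls pass NULL to continue in it
     int count = 0;
     char *token = strtok(buffer, delimeters);
     while(token != NULL && count < maxTokens)
     {
          tokens[count] = token;
          count++;
          token = strtok(NULL, delimeters);
     }

     for(int i = 0; i < count; i++)
     {
          //Each element is a char* that points at the start of one token
          cout << "\nWord number " << i << " is " << tokens[i];
     }
     cout << endl;

     delete[] tokens;
     delete[] buffer;
     return 0;
}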

And there ye' be: using strtok() to split up <string> class objects. Ideally, strtok() works best with c-strings, because string objects already contain member functions that allow for easy parsing (find(), find_first_of(), and substr(), for example). strtok(), of course, is a member of the <cstring> library for a reason.
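For comparison, here is a minimal sketch of the find_first_of() and substr() route. One thing to keep in mind is that strtok() discards its delimiters, while the original question wants ':', '(' and ')' returned as tokens of their own. This is only one possible approach: the default whitespace and punctuation sets are assumptions picked to match the sample line, and the function reuses the SplitString name from the question with a slightly different signature.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of SplitString: whitespace separates tokens and is
// dropped, while each punctuation character becomes a one-character token.
// The default character sets are assumptions chosen for the sample line.
std::vector<std::string> SplitString(const std::string& line,
                                     const std::string& whitespace = " \t",
                                     const std::string& punctuation = ":()")
{
    const std::string delimiters = whitespace + punctuation;
    std::vector<std::string> tokens;
    std::string::size_type start = 0;

    while (start < line.size())
    {
        // Find the next delimiter at or after 'start'
        std::string::size_type end = line.find_first_of(delimiters, start);
        if (end == std::string::npos)
            end = line.size();

        // Text between two delimiters is an ordinary token
        if (end > start)
            tokens.push_back(line.substr(start, end - start));

        // Keep the delimiter itself if it is punctuation rather than whitespace
        if (end < line.size() && punctuation.find(line[end]) != std::string::npos)
            tokens.push_back(std::string(1, line[end]));

        start = end + 1;
    }
    return tokens;
}
```

Called on the sample line "This is a test and test number is: test(001)", this should give the 13 tokens listed in the question, with ":", "(" and ")" as separate entries.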

Enjoy hours of fun strtok()'ing.
Last edited by Clinton Portis; Oct 26th, 2009 at 9:09 pm.
Clinton Portis
  #3
Oct 27th, 2009
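Minor error; here is the updated loop:
//buffer is a writable, null-terminated copy of the string (strtok() cannot write through c_str())
int i = 0;
tokens[i] = strtok(buffer, delimeters);
while(tokens[i] != NULL)
{
     //Only the first call gets the c-string; later calls pass NULL to continue in it
     i++;
     tokens[i] = strtok(NULL, delimeters);
}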


Excerpt about strtok(): "This end of the token is automatically replaced by a null-character by the function, and the beginning of the token is returned by the function."

If I had forced the loop with .size(), it would have made strtok() run once per character instead of once per token, which is more times than it has to, and it could have run off the end of the tokens array.
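A small sketch may help show both points from the excerpt: the loop runs once per token because it ends when strtok() returns NULL, and the buffer itself gets modified. The helper name countTokens is invented for this example.

```cpp
#include <cstring>

// Hypothetical helper (name chosen for this example): counts the tokens in a
// writable c-string. The loop ends when strtok() returns NULL, so it runs
// once per token rather than once per character.
int countTokens(char* text, const char* delimiters)
{
    int count = 0;
    char* token = std::strtok(text, delimiters);
    while (token != NULL)
    {
        count++;
        token = std::strtok(NULL, delimiters);   // continue in the same buffer
    }
    return count;
}
```

With char text[] = "a b c"; the call countTokens(text, " ") returns 3. Afterwards the spaces in text have been overwritten with null characters, so printing text shows only a.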
Last edited by Clinton Portis; Oct 27th, 2009 at 9:40 am.
©2003 - 2009 DaniWeb® LLC