splitting a string using ispunct
I am reading a line of text from a file and need to split it into tokens. How do I split a string into tokens?
For example, given this line:
This is a test and test number is: test(001)
I need the tokens to be:
This
is
a
test
and
test
number
is
:
test
(
001
)
Here is my code so far:
#include <fstream>
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
#include <stdio.h>
using namespace std;

vector <string> SplitString (string line)
{
    //this is where the split needs to occur
    return vector <string> (); //placeholder so the stub returns a value
}

bool ispunct (char aCharacter, string delimiters)
{
    int numDelimiters = delimiters.length ();
    for (int i = 0; i < numDelimiters; i++)
    {
        if (aCharacter == delimiters[i])
            return true;
    }
    return false;
}

int main()
{
    int i;
    int a=0;
    int c;
    char ch;
    string line;
    vector <string> tokens;
    ifstream myFile("scan.cm");
    if (! myFile)
    {
        cout << "Error opening input file" << endl;
        return -1;
    }
    while( getline( myFile, line ) )
    {
        a++;
        vector <string> newTokens = SplitString (line);
        int numNewTokens = newTokens.size();
        for (int i = 0; i < numNewTokens; i++)
        {
            tokens.push_back (newTokens[i]);
        }
        cout << "line " << a << ": " << endl;
    }
    myFile.close();
    return 0;
}
#2 Oct 26th, 2009
ispunct() may not be exactly what you are looking for to split the string into tokens.. but I would suggest strtok():
char * strtok ( char * str, const char * delimiters );
As you can see, strtok() accepts two arguments: the first is the c-string (char array) that you want tokenized; the second is another c-string holding your delimiters (any characters that signal the end of a token, such as a ' ' space or a '.' period.. could be anything you want). Each call returns a pointer to the first character of the next token, and the delimiter that ended that token is overwritten with a null character. (We will save those pointers into an array of char* pointers in the example below.)
So put strtok() in a loop and let it fly.. pass the c-string on the first call and NULL on every call after that, and it will return a char* pointer to each token in turn, then NULL once the string is used up.
I see that in your code you are using <string> class variables.. which is fine, but remember, strtok() is looking for a writable c-string char array.. not a <string> class object. String objects do have a member function, c_str(), that returns a c-string pointer, but that pointer is const and strtok() modifies its argument, so copy the characters into a writable buffer first:
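#include <cstring>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    string input = "This is a sample string.";
    //Delimiters must be a null-terminated c-string, so use a string literal
    const char delimeters[] = "/\n ";
    //strtok() overwrites delimiters with '\0', so give it a writable copy
    //(including the terminating '\0') instead of the const input.c_str()
    vector<char> buffer(input.c_str(), input.c_str() + input.size() + 1);
    //Dynamic array (of 'char' pointers that will contain the address of each token)
    char **tokens = new char*[80];
    int i=0;
    //First call passes the buffer; later calls pass NULL to continue where strtok() left off
    tokens[i] = strtok(&buffer[0], delimeters);
    while(tokens[i] != NULL && i < 79)
    {
        i++;
        tokens[i] = strtok(NULL, delimeters);
    }
    tokens[79] = NULL; //Guarantee a terminator even if all 80 slots fill up
    i=0;
    while(tokens[i] != NULL)
    {
        //tokens[i] is already a char*, so it prints as a c-string
        cout << "\nWord number " << i << " is " << tokens[i];
        i++;
    }
    delete [] tokens;
    return 0;
}
And there ye' be... using strtok() to split up <string> class objects. Ideally, strtok() works best with c-strings, because string objects already contain member functions that allow for easy parsing (find(), find_first_of(), and substr(), for example). strtok(), of course, is a member of the <cstring> library for a reason.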
Enjoy hours of fun strtok()'ing.
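For comparison, here is a rough sketch of the find_first_of() / substr() route mentioned above. It is not code from this thread: it reuses the SplitString name from the original post, and the punct parameter with its default of ":()" is just an assumption chosen to match the sample line. Unlike strtok(), it keeps each punctuation character as its own token.

```cpp
#include <string>
#include <vector>

// Sketch only: splits on whitespace and emits every character found in
// 'punct' as a separate one-character token. The 'punct' parameter and its
// default value are assumptions made for this example.
std::vector<std::string> SplitString(const std::string& line,
                                     const std::string& punct = ":()")
{
    const std::string separators = " \t\r\n" + punct;
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start < line.size())
    {
        // Find the next whitespace or punctuation character
        std::string::size_type pos = line.find_first_of(separators, start);
        if (pos == std::string::npos)
            pos = line.size();
        // Anything between 'start' and 'pos' is a word
        if (pos > start)
            tokens.push_back(line.substr(start, pos - start));
        // Keep punctuation as a token; whitespace is simply skipped
        if (pos < line.size() && punct.find(line[pos]) != std::string::npos)
            tokens.push_back(std::string(1, line[pos]));
        start = pos + 1;
    }
    return tokens;
}
```

Calling SplitString("This is a test and test number is: test(001)") should give the 13 tokens listed in the question, with ":", "(" and ")" as separate entries. To treat more characters as punctuation, pass a longer punct string.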
Last edited by Clinton Portis; Oct 26th, 2009 at 9:09 pm.
#3 Oct 27th, 2009
Minor error, here is the updated code:
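//strtok() writes into its argument, so tokenize a writable copy rather than input.c_str() (needs <vector>)
vector<char> buffer(input.c_str(), input.c_str() + input.size() + 1);
//The first call passes the buffer; every later call passes NULL to continue in it
tokens[i] = strtok(&buffer[0], delimeters);
while(tokens[i] != NULL && i < 79)
{
    i++;
    tokens[i] = strtok(NULL, delimeters);
}
tokens[79] = NULL; //Guarantee a terminator even if all 80 slots fill up
Excerpt about strtok(): "This end of the token is automatically replaced by a null-character by the function, and the beginning of the token is returned by the function."
If I had forced the loop with .size(), it would have made strtok() work more times than it had to and would have run off the end of the c-string char array.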
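If you would rather get the tokens back as string objects, the same loop can be wrapped in a small helper. This is only a sketch: the name TokenizeWithStrtok is made up for illustration, and it assumes copying the input into a temporary buffer is acceptable.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Sketch only: 'TokenizeWithStrtok' is a hypothetical helper, not part of
// any library. It copies each token out before the buffer goes away.
std::vector<std::string> TokenizeWithStrtok(const std::string& input,
                                            const char* delimiters)
{
    // Writable, null-terminated copy: strtok() overwrites the delimiter
    // that ends each token with '\0'.
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    // First call passes the buffer; later calls pass NULL to continue.
    // The loop stops when strtok() returns NULL (no tokens left).
    for (char* token = std::strtok(&buffer[0], delimiters);
         token != NULL;
         token = std::strtok(NULL, delimiters))
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

For example, TokenizeWithStrtok("This is a sample string.", "/\n ") should return five tokens, the last one being "string.". Two things to keep in mind: strtok() throws the delimiters away, so it cannot hand back ':' or '(' as tokens, and it keeps its position in hidden static state, so two tokenizing loops cannot be interleaved.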
Last edited by Clinton Portis; Oct 27th, 2009 at 9:40 am.