Ways to Split a String in C++
1. Introduction
Strings are one of the most interesting parts of software development. We usually get input from a user as a string and then process it to obtain results. Dealing with strings can be tricky, so it helps to know how to handle the problems that come up frequently. Splitting a string (also known as tokenizing a string) means dividing it into the parts we are interested in. Suppose that you need to extract the words from a sentence in which the words are separated by commas. You need a simple and efficient way to get the parts you care about, namely the words. The same applies to other kinds of tokens: you may need to extract numbers, special characters, or even whitespace. In this article, we will look at the methods we can use to split a string.
2. Logic of Splitting a String
We need to know the essentials of this process:
String: The string we need to split.
Delimiter: The character, the set of characters, or even another string that marks where the string is divided.
Take the string below and assume that our delimiter is the comma (',') character:
This, is a, string, to split
The split string would look like this:
This
is a
string
to split
Each row is one token (the leading spaces after each comma are left out here). Now that the idea is clear, we can continue with the implementations.
3. Using C++ Built-in (Native) Functions
C++ has dozens of functions dedicated to string operations, and you can combine them to split a string.
3.1. Using strtok() function
This function comes from the C library header “string.h”, which is available as <cstring> in C++. You can use it in C++, but because it is a C function that modifies the buffer it scans, you first have to copy the C++ string into a writable C string. Afterwards, you can convert the tokens back to C++ strings.
The strtok() Prototype and Explanation:
char * strtok ( char * str, const char * delimiters );
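It takes a string and a set of delimiter characters as input and returns a pointer to the next token, or a null pointer when no tokens remain. On the first call, you pass the string as the first argument. On every later call, you pass a null pointer as the first argument to continue splitting the same string.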
Algorithm:
1. Get input as string
2. Convert the C++ string to a C string
3. While tokens remain in the string:
3.1. Get the next token using the strtok() function
4. Postprocess
Code:
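A minimal sketch of the steps above. The helper name split_with_strtok is hypothetical and not from the article, and collecting the tokens in a vector is an assumption made so the result is easy to inspect. Because strtok() overwrites delimiters in the buffer it scans, the sketch works on a writable copy of the input.

```cpp
#include <cstring>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (name chosen for illustration): splits `input` on any
// character contained in `delimiters` using strtok().
vector<string> split_with_strtok(const string& input, const char* delimiters) {
    // strtok() writes '\0' into the buffer it scans, so work on a writable,
    // null-terminated copy instead of the original string.
    vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    vector<string> tokens;
    // First call: pass the buffer. Later calls: pass a null pointer.
    char* token = strtok(buffer.data(), delimiters);
    while (token != nullptr) {
        tokens.push_back(token);  // converts the C string back to a C++ string
        token = strtok(nullptr, delimiters);
    }
    return tokens;
}
```

Under these assumptions, split_with_strtok("Split this string.", " ") yields the tokens Split, this and string. Note that strtok() skips empty tokens, so consecutive delimiters do not produce empty strings, and that it keeps internal state between calls, so it cannot split two strings at the same time.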
Output:
Splitted string:
Split
this
string.
3.2. Using string::rfind() function with string::substr() function
We can combine the string::rfind() function with the string::substr() function. string::rfind() finds the position of the last occurrence of the delimiter, searching from the right (the end of the string). string::substr() then creates the token as a substring and also produces the shortened string with that token removed.
Algorithm:
1. Get input as string
2. While the delimiter is found in the string:
2.1. Use the rfind() function to find the position of the rightmost delimiter
2.2. Use substr() to create the remaining string, which is everything before the rightmost delimiter
2.3. Use substr() to create the token, which is everything after the rightmost delimiter
3. Postprocess
Code:
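A minimal sketch of the steps above, assuming a single-character delimiter. The helper name split_from_right is hypothetical, and storing the tokens in a vector is only for illustration. Since the search starts from the right, the tokens come out in reverse order.

```cpp
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (name chosen for illustration): splits `text` from the
// right using rfind() and substr(). Tokens are returned last-to-first.
vector<string> split_from_right(string text, char delimiter) {
    vector<string> tokens;
    size_t pos = text.rfind(delimiter);
    while (pos != string::npos) {
        string rest = text.substr(0, pos);    // everything before the rightmost delimiter
        string token = text.substr(pos + 1);  // everything after the rightmost delimiter
        tokens.push_back(token);
        text = rest;                          // continue with the shortened string
        pos = text.rfind(delimiter);
    }
    tokens.push_back(text);  // postprocess: what is left is the first token
    return tokens;
}
```

With the assumed input "1,2,3,4,5" and the delimiter ',', this returns 5, 4, 3, 2, 1 in that order.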
Output:
5
4
3
2
1
3.3. Using string::rfind() function with string::substr() function and string::erase() function
It is similar to 3.2, except that we use string::erase(), which modifies the string in place, to remove part of the string. The first argument of erase() is the starting position and the second argument is the number of characters to erase. For example, if the first argument is 0 and the second argument is 8, it erases 8 characters starting at position 0.
Algorithm:
1. Get input as string
2. While the delimiter is found in the string:
2.1. Use the rfind() function to find the position of the rightmost delimiter
2.2. Use substr() to create the token, which is everything after the rightmost delimiter
2.3. Use erase() to remove the rightmost delimiter and the token from the string in place
3. Postprocess
Code:
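A minimal sketch of this variant, again assuming a single-character delimiter and a hypothetical helper name, split_from_right_erase. The token is copied out with substr() before erase() is called, because erase() changes the string in place and would otherwise discard it.

```cpp
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (name chosen for illustration): like the rfind()/substr()
// approach, but shortens the string in place with erase().
vector<string> split_from_right_erase(string text, char delimiter) {
    vector<string> tokens;
    size_t pos = text.rfind(delimiter);
    while (pos != string::npos) {
        // Copy the token first: everything after the rightmost delimiter.
        tokens.push_back(text.substr(pos + 1));
        // erase(position, length): remove the delimiter and the token in place.
        text.erase(pos, text.size() - pos);
        pos = text.rfind(delimiter);
    }
    tokens.push_back(text);  // postprocess: what is left is the first token
    return tokens;
}
```

With the assumed input "1,2,3,4,5" and the delimiter ',', the tokens are again returned as 5, 4, 3, 2, 1.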
Output:
5
4
3
2
1
3.4. Using string::find() function with string::substr() function
We can combine the string::find() function with the string::substr() function. string::find() finds the position of the first occurrence of the delimiter, searching from the left (the beginning of the string). string::substr() then creates the token as a substring and also produces the shortened string with that token removed.
Algorithm:
1. Get input as string
2. While the delimiter is found in the string:
2.1. Use the find() function to find the position of the leftmost delimiter
2.2. Use substr() to create the token, which is everything before the leftmost delimiter
2.3. Use substr() to create the remaining string, which is everything after the leftmost delimiter
3. Postprocess
Code:
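A minimal sketch of the steps above. The helper name split_from_left is hypothetical; as an illustration of a multi-character delimiter, it takes the delimiter as a string. The guard against an empty delimiter is an added assumption, since find() with an empty string always matches at position 0 and the loop would never end.

```cpp
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (name chosen for illustration): splits `text` from the
// left using find() and substr(). Tokens are returned in their original order.
vector<string> split_from_left(string text, const string& delimiter) {
    vector<string> tokens;
    if (delimiter.empty()) {  // assumption: treat an empty delimiter as "no split"
        tokens.push_back(text);
        return tokens;
    }
    size_t pos = text.find(delimiter);
    while (pos != string::npos) {
        tokens.push_back(text.substr(0, pos));       // token before the leftmost delimiter
        text = text.substr(pos + delimiter.size());  // remainder after the delimiter
        pos = text.find(delimiter);
    }
    tokens.push_back(text);  // postprocess: what is left is the last token
    return tokens;
}
```

With the assumed input "1,2,3,4,5" and the delimiter ",", this returns 1, 2, 3, 4, 5 in that order.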
Output:
1
2
3
4
5
3.5. Using stringstream with getline() function
We can create a stringstream object from our string and split it with getline(), which accepts the delimiter character as its third argument.
Algorithm:
1. Get input as string
2. Create a stringstream object by passing the input string
3. While getline() successfully reads a token:
3.1. Process the token
4. Postprocess
Code:
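A minimal sketch of the steps above. The helper name print_tokens_with_getline and the output-stream parameter are hypothetical; here "processing" a token simply means writing it on its own line. The comma delimiter in the usage note is an assumption, since the article does not show the input.

```cpp
#include <ostream>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper (name chosen for illustration): reads tokens from a
// stringstream with getline() and writes each one on its own line to `out`.
void print_tokens_with_getline(const string& input, char delimiter, ostream& out) {
    stringstream stream(input);
    string token;
    // getline() returns the stream, which converts to false once no more
    // tokens can be read.
    while (getline(stream, token, delimiter)) {
        out << token << '\n';  // process the token
    }
}
```

Calling print_tokens_with_getline("1,2s,3,4,5,6", ',', cout) after including <iostream> prints the six tokens on separate lines. Unlike strtok(), getline() keeps empty tokens between two adjacent delimiters.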
Output:
1
2s
3
4
5
6
4. Storing Tokens
After we have split the string, we may need to store the tokens in an array, a vector, a list, or another data structure.
4.1. Using Arrays
We can use classic C++ arrays to store the tokens:
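A minimal sketch, assuming the tokens are produced with stringstream and getline() and separated by commas. The helper name split_into_array and the capacity MAX_TOKENS are hypothetical. Because an array has a fixed size, the sketch stops once the array is full and returns how many tokens it stored.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical capacity chosen for illustration.
const size_t MAX_TOKENS = 16;

// Hypothetical helper: stores up to `capacity` tokens in `tokens` and returns
// the number of tokens actually stored.
size_t split_into_array(const string& input, char delimiter,
                        string tokens[], size_t capacity) {
    stringstream stream(input);
    string token;
    size_t count = 0;
    // Check the capacity first so no token is read that cannot be stored.
    while (count < capacity && getline(stream, token, delimiter)) {
        tokens[count] = token;
        ++count;
    }
    return count;
}
```

A caller would declare string tokens[MAX_TOKENS], pass it together with MAX_TOKENS, and then loop over the first count elements to print them.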
Output:
1
2s
3
4
5
6
4.2. Using Vectors
We can also use STL vectors to store our tokens:
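A minimal sketch, assuming the tokens are produced with stringstream and getline() and separated by commas. The helper name split_into_vector is hypothetical. A vector grows as tokens are added, so no maximum number of tokens has to be chosen in advance.

```cpp
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (name chosen for illustration): stores every token in a
// vector, which resizes itself as needed.
vector<string> split_into_vector(const string& input, char delimiter) {
    vector<string> tokens;
    stringstream stream(input);
    string token;
    while (getline(stream, token, delimiter)) {
        tokens.push_back(token);  // append the token to the end of the vector
    }
    return tokens;
}
```

The returned vector can be traversed with a range-based for loop, and tokens.size() gives the number of tokens.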
Output:
1
2s
3
4
5
6
5. Notes
- Since this is a tutorial article, I have used "using namespace std", but I recommend avoiding it as a better practice.
- There are many other ways to get the same results that I have not covered in this article. You can explore them by examining the <string> library.
- I strongly recommend the Boost library for string utilities.
Further
Feel free to get in touch with any questions.
My LinkedIn:
My E-mail:
emrecankuran21@gmail.com