Splitting a String by Another String in C++: A Flexible Utility Function
In this post, we will explore a flexible utility function for splitting a string based on a given delimiter using C++ and the standard <regex> library.
The C++ Utility Function to Split a String by Another String
Background: regular expressions are an essential tool for text processing and pattern matching. They provide a concise and powerful way to express complex search patterns and are widely used for tasks such as text validation, data extraction, and string manipulation. The C++ Standard Library offers the <regex> library for working with them.
The <regex> library in C++ includes several key components: the std::regex class, which represents a compiled regular expression; std::regex_iterator and its std::string specialization std::sregex_iterator, which traverse the matches in a given input; the std::regex_search and std::regex_replace algorithms, which search for and replace patterns within strings; and std::regex_token_iterator and its std::string specialization std::sregex_token_iterator, which tokenize strings based on a given pattern.
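To make these components concrete, here is a minimal sketch that exercises three of them. The helper names (contains_number, mask_numbers, find_numbers) and the \d+ pattern are assumptions chosen for illustration; they are not part of the utility function discussed in this post.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helpers, named here for illustration only.

// std::regex_search: true if the pattern matches anywhere in the input.
bool contains_number(const std::string& text) {
  static const std::regex digits{R"(\d+)"};
  return std::regex_search(text, digits);
}

// std::regex_replace: returns a copy with every match replaced by "#".
std::string mask_numbers(const std::string& text) {
  static const std::regex digits{R"(\d+)"};
  return std::regex_replace(text, digits, "#");
}

// std::sregex_iterator: visits each match in turn and collects its text.
std::vector<std::string> find_numbers(const std::string& text) {
  static const std::regex digits{R"(\d+)"};
  std::vector<std::string> found;
  std::sregex_iterator end;  // default-constructed: end-of-sequence
  for (std::sregex_iterator it(text.begin(), text.end(), digits); it != end; ++it) {
    found.push_back(it->str());
  }
  return found;
}
```

For the input a1b22c333, find_numbers would collect 1, 22, and 333, while mask_numbers would turn a1b22 into a#b#.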
Here’s the code snippet for our utility function making use of the regex standard library:
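#include <regex>
#include <string>
#include <vector>
// Splits str on every match of delim_str, which is compiled as a regular
// expression. Empty substrings are dropped.
std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
  std::regex delim{delim_str};
  std::vector<std::string> results;
  std::sregex_token_iterator end;  // default-constructed: end-of-sequence
  // Submatch index -1 selects the text between matches.
  std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
  for ( ; iter != end; ++iter) {
    std::string split(*iter);
    if (split.size()) results.push_back(split);
  }
  return results;
}
Breaking Down the String Splitting C++ Function Implementation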
Let’s go through the code step by step:
- First, we include the <regex> header, which provides the tools for working with regular expressions in C++.
- We define a function called split_str that takes two parameters: a const std::string& called str, which is the input string to be split, and a const std::string& called delim_str, which is the delimiter used for splitting.
- We create a std::regex object called delim from delim_str. This is the pattern used to split the input string. Because delim_str is compiled as a regular expression (ECMAScript grammar by default), characters such as . or | in it act as regex metacharacters rather than literal text.
- We declare a std::vector<std::string> called results to store the resulting substrings.
- We define two std::sregex_token_iterator objects: end and iter. The default-constructed end object serves as a sentinel marking the end of the sequence. The iter object is initialized with the beginning and end of the input string, the delimiter pattern, and -1 as the submatch value. The -1 value tells the iterator to return the unmatched parts of the input (i.e., the substrings between the delimiters).
- We use a for loop to iterate through the tokens returned by the iterator. Inside the loop, we create a std::string object called split and initialize it with the current token.
- We check whether split is non-empty. If it is, we add it to the results vector. This drops the empty tokens produced by adjacent delimiters or by a delimiter at the start of the input.
- Finally, we return the results vector containing the substrings.
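The role of the submatch value and of the empty-string check can be seen with a small sketch. The collect_tokens helper below is hypothetical and exists only for illustration: it returns every token the iterator yields for a given submatch index, without filtering anything out.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration: collects every token the iterator
// yields for the given submatch index, keeping empty strings.
std::vector<std::string>
collect_tokens(const std::string& str, const std::string& pattern, int submatch) {
  std::regex re{pattern};
  std::vector<std::string> tokens;
  std::sregex_token_iterator end;  // default-constructed: end-of-sequence
  std::sregex_token_iterator iter(str.begin(), str.end(), re, submatch);
  for ( ; iter != end; ++iter) {
    tokens.push_back(iter->str());
  }
  return tokens;
}
```

With the input a::b::::c and the pattern ::, a submatch value of -1 yields a, b, an empty string, and c, whereas a submatch value of 0 yields the three :: matches themselves. The empty string between the two adjacent delimiters is the kind of token that the size check in split_str discards.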
Using the C++ Utility Function to Split a String
Here’s a C++ example of how you can use the split_str function:
#include <iostream>
#include <vector>
#include <string>
#include <regex>
std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
  std::regex delim{delim_str};
  std::vector<std::string> results;
  std::sregex_token_iterator end;
  std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
  for ( ; iter != end; ++iter) {
    std::string split(*iter);
    if (split.size()) results.push_back(split);
  }
  return results;
}
int main() {
  std::string input = "Hello::World::from::C++";
  std::string delimiter = "::";
  std::vector<std::string> results = split_str(input, delimiter);
  for (const auto& word : results) {
    std::cout << word << std::endl;
  }
  return 0;
}
This code snippet would output:
$ g++ -std=c++20 split-string-by-string-example.cpp -o s && ./s
Hello
World
from
C++
That’s it! We’ve created a flexible and reusable utility function to split a string using the <regex> library. You can easily modify the delimiter string to fit your needs, making this function highly adaptable for various text processing tasks.
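One point worth keeping in mind when changing the delimiter: split_str compiles it as a regular expression, so a pattern works as a delimiter, but a literal delimiter containing metacharacters needs escaping first. The sketch below repeats split_str so that it compiles on its own and adds escape_regex, a hypothetical helper that is not part of <regex>; it assumes the default ECMAScript grammar.

```cpp
#include <regex>
#include <string>
#include <vector>

// Same split_str as in the post, repeated so this sketch compiles on its own.
std::vector<std::string>
split_str(const std::string& str, const std::string& delim_str) {
  std::regex delim{delim_str};
  std::vector<std::string> results;
  std::sregex_token_iterator end;
  std::sregex_token_iterator iter(str.begin(), str.end(), delim, -1);
  for ( ; iter != end; ++iter) {
    std::string split(*iter);
    if (split.size()) results.push_back(split);
  }
  return results;
}

// Hypothetical helper (not part of <regex>): backslash-escapes the
// ECMAScript metacharacters so the delimiter is matched literally.
std::string escape_regex(const std::string& literal) {
  static const std::string special = R"(.^$|()[]{}*+?\)";
  std::string escaped;
  for (char c : literal) {
    if (special.find(c) != std::string::npos) escaped += '\\';
    escaped += c;
  }
  return escaped;
}
```

Under these assumptions, split_str("one, two ;three", R"(\s*[,;]\s*)") splits on commas or semicolons with optional surrounding whitespace and yields one, two, and three. By contrast, split_str("a.b.c", ".") returns an empty vector, because . matches every character and only empty tokens remain; split_str("a.b.c", escape_regex(".")) yields a, b, and c.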