How to split and iterate a string separated by a specific character in C++?

String splitting in C++

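Splitting strings by a delimiter is a common task in systems programming. C++ has no direct equivalent of Python’s str.split() that returns a container of strings (C++20 adds std::views::split, but it is a lazy range adaptor rather than a function returning a vector), so you typically write a small helper on top of the standard library.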

Using std::istringstream and getline()

The standard approach uses std::istringstream with std::getline(). The getline() function accepts a delimiter parameter, allowing you to extract tokens until that character is encountered.

Here’s a practical implementation:

#include <string>
#include <sstream>
#include <vector>

std::vector<std::string> split(const std::string& str, char delim)
{
  std::vector<std::string> result;
  std::istringstream ss{str};
  std::string token;
  while (std::getline(ss, token, delim)) {
    if (!token.empty()) {
      result.push_back(token);
    }
  }
  return result;
}

The input string is taken by const reference, which avoids copying it when the function is called. The std::istringstream still makes its own internal copy of the data.

Complete example

#include <iostream>
#include <string>
#include <sstream>
#include <vector>

std::vector<std::string> split(const std::string& str, char delim)
{
  std::vector<std::string> result;
  std::istringstream ss{str};
  std::string token;
  while (std::getline(ss, token, delim)) {
    if (!token.empty()) {
      result.push_back(token);
    }
  }
  return result;
}

int main()
{
  auto v1 = split("a string   separated by space", ' ');
  std::cout << "Split by space:" << std::endl;
  for (const auto& s : v1) {
    std::cout << "  " << s << std::endl;
  }

  auto v2 = split("a,string,separated,by,comma", ',');
  std::cout << "\nSplit by comma:" << std::endl;
  for (const auto& s : v2) {
    std::cout << "  " << s << std::endl;
  }

  return 0;
}

Output:

Split by space:
  a
  string
  separated
  by
  space

Split by comma:
  a
  string
  separated
  by
  comma

Handling empty tokens

The implementation above skips empty tokens with the if (!token.empty()) check. This is useful when you have consecutive delimiters or leading/trailing delimiters. If you need to preserve empty fields, remove that condition:

std::vector<std::string> split_keep_empty(const std::string& str, char delim)
{
  std::vector<std::string> result;
  std::istringstream ss{str};
  std::string token;
  while (std::getline(ss, token, delim)) {
    result.push_back(token);
  }
  return result;
}

With this version, "a,,b" split by ',' produces ["a", "", "b"] instead of ["a", "b"].
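
One edge case is worth knowing: std::getline() does not report an empty field after a trailing delimiter, so split_keep_empty("a,b,", ',') yields two tokens, not three. If every field matters (for example in CSV-like data), a find-based loop can keep the trailing one. The following is a minimal sketch; split_keep_all is a hypothetical name chosen for illustration, not part of the standard library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: keeps every field, including a trailing empty one.
// An empty input produces a single empty field.
std::vector<std::string> split_keep_all(const std::string& str, char delim)
{
  std::vector<std::string> result;
  std::size_t start = 0;
  while (true) {
    const std::size_t end = str.find(delim, start);
    if (end == std::string::npos) {
      // No more delimiters: the rest of the string is the last field.
      result.push_back(str.substr(start));
      break;
    }
    result.push_back(str.substr(start, end - start));
    start = end + 1;  // continue just past the delimiter
  }
  return result;
}
```

With this sketch, "a,b," split by ',' gives three fields, the last of which is empty.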

Splitting by multiple delimiters or strings

std::getline() accepts only a single delimiter character. To split on any one of several delimiter characters, you can use std::string::find_first_of() instead:

#include <string>
#include <vector>

// Splits str at any character contained in delims; empty tokens are skipped.
std::vector<std::string> split_any(const std::string& str, const std::string& delims)
{
  std::vector<std::string> result;
  std::size_t start = 0;
  std::size_t end = str.find_first_of(delims);

  while (end != std::string::npos) {
    if (end > start) {
      result.push_back(str.substr(start, end - start));
    }
    start = end + 1;
    end = str.find_first_of(delims, start);
  }

  // Remaining text after the last delimiter, if any.
  if (start < str.length()) {
    result.push_back(str.substr(start));
  }

  return result;
}

This version handles input like "a:b;c:d" split by ":;" producing ["a", "b", "c", "d"].
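
When the delimiter is a whole string such as "::" or ", " rather than a set of single characters, the same loop works with std::string::find(), advancing by the delimiter's length. This is a sketch under a few assumptions: split_by_string is a hypothetical name, empty tokens are skipped as in split_any, and an empty delimiter is treated as "no split".

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits on a complete delimiter string, skipping empty tokens.
std::vector<std::string> split_by_string(const std::string& str, const std::string& delim)
{
  std::vector<std::string> result;
  if (delim.empty()) {
    // find("") matches at every position, so treat this case as "no split".
    if (!str.empty()) {
      result.push_back(str);
    }
    return result;
  }

  std::size_t start = 0;
  std::size_t end = str.find(delim);
  while (end != std::string::npos) {
    if (end > start) {
      result.push_back(str.substr(start, end - start));
    }
    start = end + delim.length();  // skip the whole delimiter, not just one char
    end = str.find(delim, start);
  }
  if (start < str.length()) {
    result.push_back(str.substr(start));
  }
  return result;
}
```

For example, "std::vector::iterator" split by "::" gives the three tokens "std", "vector", and "iterator".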

Performance considerations

  • Use const references for string parameters to avoid copies
  • Reserve space in the vector if you know the approximate token count: result.reserve(count)
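  • For very large strings, consider parsing in place rather than materializing every token as a separate std::string
  • The istringstream approach is idiomatic and fast enough for most use cases, although it copies the input into the stream and each token into the vector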
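
The reserve tip can be sketched as follows. The token count is bounded by the number of delimiters plus one, which std::count() can compute up front. This costs an extra pass over the input, so whether it pays off depends on your data. The name split_reserved is hypothetical, and moving the token into the vector is an optional extra that avoids one copy per token.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of split() that pre-allocates the result vector.
std::vector<std::string> split_reserved(const std::string& str, char delim)
{
  std::vector<std::string> result;
  // Upper bound on the token count: one more than the number of delimiters.
  const auto delim_count = std::count(str.begin(), str.end(), delim);
  result.reserve(static_cast<std::size_t>(delim_count) + 1);

  std::istringstream ss{str};
  std::string token;
  while (std::getline(ss, token, delim)) {
    if (!token.empty()) {
      // Moving is safe here: getline() clears token before the next read.
      result.push_back(std::move(token));
    }
  }
  return result;
}
```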

For C++17 and later, you might also explore std::string_view for zero-copy token extraction, though it requires more manual management of iteration state.
