Saturday, July 29, 2023

What is the key difference between regex_iterator and regex_token_iterator?

Looking at the documentation for regex_iterator and regex_token_iterator, I see that the key difference is the value_type, which is:

  • match_results<BidirIt> for the regex_iterator
  • sub_match<BidirIt> for the regex_token_iterator

At the same time, the examples on these pages show opposite behavior:

  • regex_iterator: the regex describes the tokens themselves
  • regex_token_iterator: the regex describes the delimiters between the tokens

However, this is not specified in the aforementioned documents.

The question "What is the difference between regex_token_iterator and regex_iterator?" states that regex_token_iterator can take -1, 0, or 1 as its last parameter, but I can't find this on the regex_token_iterator page. Is this common knowledge that I am missing, or is it missing from the documentation?

My specific question is what makes them so different that the code

#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main()
{
    std::string input_str = "hi, world";
    const std::regex reg_ex(R"(\S+\w+|[,.])");

    // Submatch index 0 selects the whole match; each sub_match
    // converts implicitly to std::string.
    std::vector<std::string> tokens {
        std::sregex_token_iterator(input_str.begin(), input_str.end(), reg_ex, 0),
        std::sregex_token_iterator()
    };

    for (const auto& item : tokens)
    {
        std::cout << item << std::endl;
    }
}

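compiles and works without any issues, while the same code based on sregex_iterator fails to compile and produces many error messages that obscure the real problem. As far as I can tell, a vector<string> cannot be constructed from these iterators, because dereferencing them yields a match_results, which is not implicitly convertible to std::string.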

See the demo with the issue.

Is there any way to handle the results of regex_iterator in the same way as those of sregex_token_iterator and pack them into a vector<string> directly, as in the example above?
