Monday, May 18, 2020

How to find an exact substring with regex in C++11?

I am trying to find substrings that are not surrounded by other a-zA-Z0-9 characters.

For example, if I search for the substring hello, it should not match hello1 or hellow, but it should match Hello and heLLo!@#$%. My sample code is below.

    std::string s = "1mySymbol1, /_mySymbol_ mysymbol";
    const std::string sub = "mysymbol";
    std::regex rgx("[^a-zA-Z0-9]*" + sub + "[^a-zA-Z0-9]*", std::regex::icase);
    std::smatch match;

    while (std::regex_search(s, match, rgx)) {
        std::cout << match.size() << "match: " << match[0] << '\n';
        s = match.suffix();
    }

The result is:

1match: mySymbol
1match: , /_mySymbol_
1match: mysymbol

But I don't understand why the first occurrence, 1mySymbol1, also matches my regex.

How do I write a proper regex that ignores such strings?

Update

If I change the code like this:

    std::string s = "mySymbol, /_mySymbol_ mysymbol";
    const std::string sub = "mysymbol";
    std::regex rgx("[^a-zA-Z0-9]+" + sub + "[^a-zA-Z0-9]+", std::regex::icase);

then I find only the substring in the middle:

1match: , /_mySymbol_

The substrings at the beginning and at the end are not found.
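
One possible direction, offered as a sketch rather than a definitive answer: with `*` the surrounding character classes may match zero characters, so nothing stops the pattern from matching inside 1mySymbol1, while with `+` a real character is required on both sides, which does not exist at the very start or end of the string. The default ECMAScript grammar of std::regex supports lookahead but not lookbehind, so the version below accepts either the start of the input or one non-alphanumeric character before the substring, and uses a negative lookahead after it. The helper name `find_standalone` is hypothetical, and the code assumes `sub` contains no regex metacharacters.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns every occurrence
// of `sub` in `s` that is not adjacent to an ASCII letter or digit.
// Assumption: `sub` contains no regex metacharacters, so it is used unescaped.
std::vector<std::string> find_standalone(const std::string& s,
                                         const std::string& sub) {
    // Group 1: start of input, or exactly one non-alphanumeric character.
    // Group 2: the substring itself (matched case-insensitively).
    // (?!...): negative lookahead; the next character is checked, not consumed.
    const std::regex rgx("(^|[^a-zA-Z0-9])(" + sub + ")(?![a-zA-Z0-9])",
                         std::regex::icase);

    std::vector<std::string> found;
    // Iterate over the whole string instead of re-searching match.suffix(),
    // so the text before each candidate match stays part of the input.
    for (std::sregex_iterator it(s.begin(), s.end(), rgx), end; it != end; ++it) {
        found.push_back((*it)[2].str());
    }
    return found;
}
```

Called on the first sample string with "mysymbol", this should return the occurrences inside /_mySymbol_ and the final mysymbol, and skip 1mySymbol1. Because the lookahead does not consume the character after a match, two occurrences separated by a single space are both still found.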
