Sunday, January 1, 2017

Understanding C++11 regex ranges

I'm trying to write a regex that matches particular symbols and ranges of symbols from the ASCII table. The final regex will be fairly complex, but I couldn't get it right in one go, so I decided to start with a much simpler one. Even that one fails, so my question is about this simple regex.

#include <iostream>
#include <string>
#include <regex>

int main ()
{
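    std::string s ("$");   // subject: a single '$' (0x24)
    std::smatch m;

    // Character class: '!' (0x21) plus the range '#'..'%' (0x23-0x25)
    std::regex e ("([\\x21\\x23-\\x25]+)");

    std::regex_search (s,m,e);

    // Print the capture groups; m is empty if nothing matched
    for (std::size_t i = 1; i < m.size(); i++)
    {
        std::cout << m[i] << " ";
    }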

    std::cout << std::endl;

    return 0;
}

Here are the ASCII codes used in the regex:

Decimal  Octal  Hex   Binary          Value
33       041    0x21  00100001        !    (exclamation mark)

35       043    0x23  00100011        #    (number sign)
36       044    0x24  00100100        $    (dollar sign)
37       045    0x25  00100101        %    (percent)

In the example above I try to match '$', but the search fails and I get an empty match result. However, if I use any of

std::regex e ("([\\x23-\\x25]+)");      // or
std::regex e ("([\\x23-\\x25\\x21]+)"); // or
std::regex e ("([\\x21\\x24-\\x25]+)"); // or
std::regex e ("([\\x21\\x23-\\x24]+)");

'$' is matched properly and I get a non-empty match result.

I really don't understand the logic here. Could you give me a hint as to what the problem is? As far as I know, the order of ranges (for example, a-z) and single symbols (for example, _) inside a bracket expression [ ] is not relevant.
