I need to parse a file using C++11 regex (not Boost, not external libraries, not calling bash or python routines, and so forth).
First of all, I need to discard the lines which do not contain any URL domain.
Then, from those kept lines, I need to extract the domains and return them in a container.
Example: from http://ift.tt/2xzMNms, I need to preserve only: www.xjr.com.
So I wrote the following regex:
std::regex urlRgx("((([[:print:]])?)*(https|http)://(www|ww2|web))\\.([[:alnum:]]{2,256}.[[:alpha:]]{2,4})((/([[:alnum:]])?)*(([[:print:]])?)*)");
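For reference, here is a minimal sketch of the filtering and extraction I am after. It assumes a deliberately simplified pattern rather than my regex above, and `extractDomains` is a hypothetical helper name chosen for illustration. It keeps the whole host (including any `www.` prefix) and does not try to handle ports or other URL details.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads lines from `in`, skips those without a URL,
// and returns the host part of the first URL found on each remaining line.
std::vector<std::string> extractDomains(std::istream& in) {
    // Simplified pattern (an assumption, not the regex from the question):
    // a scheme followed by a host made of alphanumerics, dots, and hyphens.
    static const std::regex urlRgx(R"(https?://([[:alnum:].\-]+))");

    std::vector<std::string> domains;
    std::string line;
    std::smatch match;
    while (std::getline(in, line)) {
        // regex_search looks for the pattern anywhere in the line, so no
        // leading or trailing "match anything" groups are needed.
        if (std::regex_search(line, match, urlRgx)) {
            domains.push_back(match[1].str());
        }
    }
    return domains;
}
```

The function can be fed a `std::ifstream` for a real file or a `std::istringstream` for a quick check.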
For various reasons (and headaches with regex incompatibilities), I ended up compiling and running my program, without issues, in CodeBlocks 16.01 under Windows 10 (using MinGW: g++-6.*).
But when I moved to my target environment, Ubuntu 14.04, the program compiled successfully and then hung at run time, producing no output.
Still on Ubuntu 14.04, I tried:
- Compiling in CodeBlocks 16.01, with g++-6.* and clang 3.8. Result: no output.
- Compiling from the command line, with the same compilers. Result: no output.
Note: the program starts, but it never terminates; it just hangs, with the cursor blinking.
I commented out the regex, tried other outputs, and determined that the issue is in the regex itself.
I wonder whether MinGW, on Windows, performs some "automatic" conversion that hides an error in the regex, so that the same regex then simply fails on a POSIX system.
Could you please provide any useful advice?