Below, I produce both broken code and a fixed version of the same. The problem is that I am not able to fully explain to myself why the former doesn't work but the latter does. I obviously need to review some very basic concept of the C++ language: could you provide pointers as to what I should review, and possibly also give an explanation as to why I get the results I get with the broken code.
In the '../docs/' directory referred in the code, I simply used the 'touch' command on linux to create a number of doc......html files of various length.
#include <iostream>
#include <regex>
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
int main() {
fs::path p("../docs/");
for (auto& dir_it : fs::directory_iterator(p)) {
std::regex re = std::regex("^(doc[a-z]+)\\.html$");
std::smatch matches;
// BROKEN HERE:
if (std::regex_match(dir_it.path().filename().string(), matches, re)) {
std::cout << "\t\t" <<dir_it.path().filename().string();
std::cout << "\t\t\t" << matches[1] << std::endl;
}
}
return 0;
}
Produces:
documentati.html ati
documentationt.html �}:ationt
document.html document
documenta.html documenta
docume.html docume
documentat.html documentat
docum.html docum
documentatio.html ��:atio
documen.html documen
docu.html docu
documentation.html ��:ation
documaeuaoeu.html ��:aoeu
Note 1: The bug above is triggered with filenames which are above a certain length. I only understand that's because the std::string object is resizing itself.
Note 2: The above code is very similar than the code used in the following question, but with boost::regex_match instead of std::regex_match: Can I use a mask to iterate files in a directory with Boost?
It used to work for me before as well, but now I use GCC 5.4 instead of GCC 4.6, std::regex instead of boost::regex, C++11, and a much newer version of boost::filesystem. Which change is relevant, that caused working code to get broken?
Fixed:
#include <iostream>
#include <regex>
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
int main() {
fs::path p("../docs/");
for (auto& dir_it : fs::directory_iterator(p)) {
std::regex re = std::regex("^(doc[a-z]+)\\.html$");
std::smatch matches;
std::string p = dir_it.path().filename().string();
if (std::regex_match(p, matches, re)) {
std::cout << "\t\t" <<dir_it.path().filename().string();
std::cout << "\t\t\t" << matches[1] << std::endl;
}
}
return 0;
}
produces:
documentati.html documentati
documentationt.html documentationt
document.html document
documenta.html documenta
docume.html docume
documentat.html documentat
docum.html docum
documentatio.html documentatio
documen.html documen
docu.html docu
documentation.html documentation
documaeuaoeu.html documaeuaoeu
Using boost 1.62.0-r1 and gcc (Gentoo 5.4.0-r3), the boost::filesystem documentation does not appear to provide any clear indication as to what path().filename().string() returns: http://ift.tt/2upPFE1
It appears that it depends:
Why does boost::filesystem::path::string() return by value on Windows and by reference on POSIX?
Aucun commentaire:
Enregistrer un commentaire