Friday, 26 February 2016

std::ctype::narrow segmentation fault with UTF-8 locale

I was trying to better understand the regular expression scanner implementation (std::__detail::_Scanner<>) in <regex> with g++ (GCC) 4.9.2 on Linux.

Running the scanner in the POSIX locale worked fine. However, when I switched to a UTF-8 locale, I started getting a segmentation fault in std::ctype::narrow.

Some sample code which reproduces what happens in the scanner is shown below.

#include <iostream>
#include <locale>

typedef std::ctype<char> CTYPE;

int
main ()
{
  const char* p;
  const char* s = "asdb";
  const CTYPE& ctype(std::use_facet<CTYPE>(std::locale("POSIX")));

  for (p = s; *p != '\0'; ++p)
  {
    char ch = ctype.narrow(*p,'\0');
    std::cout << *p << ": " << (int) ch << std::endl;
  }
}

Setting the locale to "en_GB.UTF-8" instead of "POSIX" (on Linux) causes a segmentation fault.

Is this a bug, or have I missed something fundamental? (My experience with C++ is limited, and my experience with anything related to locales in C++ is practically non-existent.)
