I am using C++11, and my input is a UTF-8 encoded text file. The program reads the file line by line and checks whether a given character is present in each line. Since UTF-8 is compatible with ASCII, the newline byte is the same in both encodings, so I'm not using wstring, just string. What I do is take the UTF-8 encoded bytes of the character and create a std::string from them. For example, I need to search for 值 (U+503C) in each line, and the UTF-8 encoding of this character is the byte sequence 0xE5 0x80 0xBC, so I have something like the code below. Does this look right, and will it work?
    ifstream input(utf8file);
    string line;
    const string t("\xe5\x80\xbc"); // UTF-8 bytes for 值 (U+503C)
    // Test getline itself, so a failed read at end of file is not processed as an extra line
    while (getline(input, line)) {
        if (line.find(t) != string::npos) {
            do_found();
        } else {
            not_found();
        }
    }