Using the utfcpp library, a UTF-8 encoded string such as '哈哈哈' can be split into a sequence of uint32_t code points (here 21704, 21704, 21704), which play the role that chars play in a std::string.
In this situation, what is the best way to store such a sequence of uint32_t 'characters' as a 'string'?
For example, putting (21704, 21704, 21704) into a vector<uint32_t> means that a 'string comparison' has to iterate over the vector, which seems slower than comparing real std::strings.
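To make the two candidates concrete, here is a minimal sketch using only the standard library (no utfcpp calls). It assumes the code points are already decoded; the helper names `sample_code_points` and `to_u32string` are made up for illustration. It stores the same three code points in a `std::vector<std::uint32_t>` and in a `std::u32string`, which is `std::basic_string<char32_t>`, the same class template that `std::string` instantiates for `char`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: the code points of "哈哈哈" from the question
// (U+54C8 is 21704 in decimal), assumed to be already decoded.
inline std::vector<std::uint32_t> sample_code_points() {
    return {21704, 21704, 21704};
}

// Hypothetical helper: copy the code points into a std::u32string.
// Each uint32_t is converted to a char32_t element.
inline std::u32string to_u32string(const std::vector<std::uint32_t>& cps) {
    return std::u32string(cps.begin(), cps.end());
}

// std::vector already provides operator==, so no hand-written loop is needed.
inline bool same_text(const std::vector<std::uint32_t>& a,
                      const std::vector<std::uint32_t>& b) {
    return a == b;
}

// std::u32string provides the same operators as std::string.
inline bool same_text(const std::u32string& a, const std::u32string& b) {
    return a == b;
}
```

Both overloads of `same_text` compare element by element, as `std::string` comparison does, so I would not expect the vector to be slower by design. The `std::u32string` version additionally offers the usual string members such as `find` and `substr`.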
Thanks in advance.