Friday, May 18, 2018

How to store multiple UTF-8 code points (uint32_ts) from utfcpp as a string?

Using the utfcpp library, one can split a UTF-8 encoded string ('哈哈哈') into several uint32_t values (code points, here 21704, 21704, 21704), which play the role that chars play in std::string.

In this situation, what's the best way to store the uint32_t ('character') sequence as a 'string'?

For example, putting (21704, 21704, 21704) into a vector<uint32_t> would require iterating over the vector for 'string comparison', which seems slower than comparing real std::string objects.
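To make the comparison concrete, here is a minimal sketch of the two storage options, assuming a C++11 compiler. It is not taken from utfcpp: `to_u32string`, `same_text`, and `demo` are hypothetical helper names, and the code points are written out by hand instead of being decoded by the library. One possible container is std::u32string, which is std::basic_string<char32_t> and so offers the usual string interface over 32-bit units.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of utfcpp): copy already-decoded
// code points into a std::u32string, one char32_t per code point.
std::u32string to_u32string(const std::vector<std::uint32_t>& code_points) {
    return std::u32string(code_points.begin(), code_points.end());
}

// Hypothetical helper: std::vector already provides operator==,
// so no hand-written loop is needed for an equality check.
bool same_text(const std::vector<std::uint32_t>& a,
               const std::vector<std::uint32_t>& b) {
    return a == b;
}

// Usage sketch with the code points from the question (21704 == U+54C8).
void demo() {
    const std::vector<std::uint32_t> v = {21704, 21704, 21704};
    const std::u32string s = to_u32string(v);

    assert(same_text(v, v));
    assert(s.size() == 3);                // three code points
    assert(s == U"\u54C8\u54C8\u54C8");   // compares like any std::basic_string
    assert(s.find(U'\u54C8') == 0);       // string member functions are available
}
```

Either way, an equality check still has to look at every element in the worst case, and the same is true of std::string. Whether one container is measurably faster than the other would need to be benchmarked rather than assumed.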

Thanks in advance.
