Using the utfcpp library, one can split a UTF-8 encoded string ('哈哈哈') into a sequence of uint32_t values (the code points 21704, 21704, 21704), which act like the chars of a std::string.

In this situation, what is the best way to store the uint32_t ('character') sequence as a 'string'?

For example, putting (21704, 21704, 21704) into a vector<uint32_t> will require iterating over the vector for 'string comparison', which seems slower than the equivalent comparison on a real std::string.

Thanks in advance.