I want to read a C++ std::string, pass it to a function that analyses it, and extract both Unicode symbols and plain ASCII symbols from it.
I went through many tutorials online, but all of them said that standard C++ does not fully support Unicode. Many of them recommended using ICU (ICU4C) instead.
This is my C++ program for understanding the basics of the above. It takes a raw string, converts it to an ICU UnicodeString, and prints it:
#include <iostream>
#include <string>
#include "unicode/unistr.h"

int main()
{
    std::string s = "Hello☺";
    // at this point s contains a line of text
    // which may be ANSI or UTF-8 encoded

    // convert std::string to ICU's UnicodeString
    icu::UnicodeString ucs = icu::UnicodeString::fromUTF8(icu::StringPiece(s.c_str()));

    // convert UnicodeString to std::wstring
    std::wstring ws;
    for (int i = 0; i < ucs.length(); ++i)
        ws += static_cast<wchar_t>(ucs[i]);

    std::wcout << ws << std::endl;
}
Expected Output:
Hello☺
Actual Output:
Hello?
What am I doing wrong? I would also welcome any alternative or simpler approaches.
Thanks
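For the extraction part of the task, here is a minimal sketch that avoids wide streams and ICU altogether. It assumes the std::string really holds UTF-8 bytes (that depends on how the source file or input was encoded), and the names decode_utf8 and is_ascii are hypothetical helpers made up for this illustration, not part of ICU or the standard library. It is not a full validator: overlong encodings and surrogate values are not rejected.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes a UTF-8 encoded std::string into Unicode
// code points. Stray, malformed, or truncated sequences become U+FFFD.
// Assumption: the input is meant to be UTF-8. Overlong forms are not checked.
std::vector<char32_t> decode_utf8(const std::string& s)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        ++i;  // consume the lead byte
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // A continuation byte must exist and look like 10xxxxxx.
            if (i >= s.size() ||
                (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
                ok = false;
                break;  // leave the offending byte to be re-read as a lead
            }
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
            ++i;
        }
        out.push_back(ok ? cp : char32_t(0xFFFD));
    }
    return out;
}

// Hypothetical helper: true for code points in the 7-bit ASCII range.
bool is_ascii(char32_t cp) { return cp < 0x80; }
```

With a UTF-8 encoded "Hello☺", decode_utf8 should return six code points: five for which is_ascii is true, followed by U+263A. Working on code points this way keeps the analysis independent of wchar_t width and of the locale attached to std::wcout.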