c++ icu

Looking for simple practical C++ examples of how to use ICU


I am looking for simple, practical C++ examples of how to use ICU.
The ICU home page is not helpful in this regard.
I am not interested in what Unicode is or why it exists.
The few demos are neither self-contained nor compilable (where are the includes?).
I am looking for a 'Hello, World'-style example of:
How to open and read a file encoded in UTF-8
How to use STL / Boost string functions to manipulate UTF-8 encoded strings, etc.


Solution

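  • There's no special way to read a UTF-8 file unless you need to process a byte order mark (BOM). UTF-8 is a byte-oriented encoding in which ASCII characters keep their single-byte values, so functions that read ANSI strings byte by byte can also read UTF-8 strings; they just pass the bytes through unchanged.

    The following code reads a file (ANSI or UTF-8) line by line, converts each line to an ICU UnicodeString, and then converts that to a std::wstring.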

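    #include <fstream>
    #include <string>
    
    #include <unicode/unistr.h>
    
    int main() {
        std::ifstream f("...");
        std::string s;
        while (std::getline(f, s)) {
            // at this point s contains one line of text as raw bytes,
            // which may be ANSI or UTF-8 encoded
    
            // convert std::string (treated as UTF-8) to ICU's UnicodeString;
            // byte sequences that are not valid UTF-8 become U+FFFD
            icu::UnicodeString ucs = icu::UnicodeString::fromUTF8(icu::StringPiece(s));
    
            // convert UnicodeString (UTF-16 internally) to std::wstring, one
            // code point at a time; this assumes a 32-bit wchar_t (typical on
            // Linux and macOS). Where wchar_t is 16 bits (Windows), copy the
            // UTF-16 code units instead.
            std::wstring ws;
            for (int32_t i = 0; i < ucs.length(); i = ucs.moveIndex32(i, 1))
                ws += static_cast<wchar_t>(ucs.char32At(i));
        }
    }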
    

    Take a look at the online API reference.

    If you want to use ICU through Boost, see Boost.Locale.