unicode rust

Convert a Unicode string to NFC in Rust


Let's say I have a String with unknown contents that, like "Mañana", may contain combining characters, and I want to convert it to Unicode NFC, à la String.prototype.normalize in JavaScript or unicodedata.normalize in Python.

I found this crate on crates.io, but it seems to contain only functions for working with individual characters. How would I convert an entire string? Would I convert it to bytes, iterate pairwise, and check for combining characters using the functions in that crate? What would that even look like in Rust?


Solution

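  • You can indeed use the unicode_normalization crate (published on crates.io as unicode-normalization). There is no need to iterate over bytes yourself: with its UnicodeNormalization trait in scope, string slices and char iterators gain normalization methods. More specifically, check out the nfc method, which returns an iterator over the normalized chars that you can collect into a String.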

    Example:

    use unicode_normalization::UnicodeNormalization; // brings .nfc() into scope
    let normalized = "Mañana".nfc().collect::<String>();
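To see why normalization matters here, the sketch below uses only the standard library to compare the two spellings of "Mañana": the precomposed one (ñ as U+00F1) and the decomposed one (n followed by the combining tilde U+0303). The helper names `has_combining_mark` and `code_points` are made up for illustration, and the check only covers the basic Combining Diacritical Marks block (U+0300 to U+036F), so treat it as a rough diagnostic rather than a replacement for the crate.

```rust
/// Hypothetical helper: true if `s` contains a character from the
/// Combining Diacritical Marks block (U+0300..=U+036F).
/// This is a deliberately narrow check, not a full Unicode test.
fn has_combining_mark(s: &str) -> bool {
    s.chars().any(|c| ('\u{0300}'..='\u{036F}').contains(&c))
}

/// Hypothetical helper: lists the code points of `s` as "U+XXXX" strings.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // Precomposed form: ñ is the single code point U+00F1.
    let composed = "Ma\u{00F1}ana";
    // Decomposed form: n followed by COMBINING TILDE (U+0303).
    let decomposed = "Man\u{0303}ana";

    // Both render as "Mañana", but they are different strings to Rust.
    println!("equal: {}", composed == decomposed);
    println!("composed:   {:?}", code_points(composed));
    println!("decomposed: {:?}", code_points(decomposed));
    println!(
        "combining marks? composed={}, decomposed={}",
        has_combining_mark(composed),
        has_combining_mark(decomposed)
    );
}
```

Running the decomposed string through nfc() from the crate and collecting the result should give the precomposed spelling, after which an ordinary == comparison between the two behaves as expected.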