How can I remove all diacritics from a given UTF-8 encoded string using Go? For example, transform the string "žůžo" => "zuzo". Is there a standard way?
You can use the libraries described in Text normalization in Go.
Here's an application of those libraries:
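// Example derived from: http://blog.golang.org/normalization
package main

import (
	"fmt"
	"log"
	"unicode"

	"golang.org/x/text/runes"
	"golang.org/x/text/transform"
	"golang.org/x/text/unicode/norm"
)

func main() {
	// NFD splits each accented letter into a base letter plus combining marks,
	// runes.Remove drops the nonspacing marks (Unicode category Mn), and NFC
	// recomposes whatever is left. runes.Remove replaces the deprecated
	// transform.RemoveFunc used in older versions of this example.
	t := transform.Chain(norm.NFD, runes.Remove(runes.In(unicode.Mn)), norm.NFC)
	result, _, err := transform.String(t, "žůžo")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result) // zuzo
}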
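
To see what the middle step of that chain does on its own, here is a minimal sketch using only the standard library. The helper name stripMarks is made up for illustration and is not part of any package. It assumes the input has already been decomposed (which is what norm.NFD is for), so the sample string is written out by hand as base letters followed by combining marks.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripMarks is a hypothetical helper: it drops every rune in Unicode
// category Mn (nonspacing mark) and keeps everything else unchanged.
// It does NOT decompose precomposed letters, so it only has an effect on
// input that is already in decomposed (NFD) form.
func stripMarks(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.Is(unicode.Mn, r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	// "žůžo" written out in decomposed form:
	// z + U+030C (combining caron), u + U+030A (combining ring above), ...
	decomposed := "z\u030Cu\u030Az\u030Co"
	fmt.Println(stripMarks(decomposed)) // zuzo

	// The precomposed spelling (U+017E, U+016F) passes through untouched,
	// because those runes are letters, not nonspacing marks.
	precomposed := "\u017E\u016F\u017Eo"
	fmt.Println(stripMarks(precomposed) == precomposed) // true
}
```

This is why the NFD step has to come first: without it, the filter never sees a separate mark to remove. For the same reason, letters that have no canonical decomposition, such as "ł" or "ø", come through the whole chain unchanged and would need an explicit replacement table if you want them mapped to "l" or "o".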