
How to get the decimal value for a rune in Go?


I need to parse some strings and clean them by converting any special character, such as 'â', to &#226;. This is a decimal numeric character reference (NCR). I already know how to parse it to Unicode, but I need the decimal code. The idea is to replace those special characters and return the whole string with the conversions applied, if it contains any. For example:

text := "chitâra"
text = parseNCRs(text) // either return the converted string
parseNCRs(&text)       // or pass a pointer and convert in place
fmt.Println(text)      // Outputs: "chit&#226;ra"

Solution

  • Range over the string to get the numeric values of the runes.

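    // escape replaces every non-ASCII rune in s with a decimal
    // numeric character reference such as &#226;.
    func escape(s string) string {
      var buf bytes.Buffer
      for _, r := range s { // range yields runes, not bytes
        if r > 127 {
            // %d prints the rune's code point in decimal
            fmt.Fprintf(&buf, "&#%d;", r)
        } else {
            buf.WriteRune(r)
        }
      }
      return buf.String()
    }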
    


    If you are escaping for HTML or XML, then you should also handle the other special characters:

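    // escape replaces non-ASCII runes and the HTML/XML special
    // characters <, >, &, " and ' with decimal character references.
    func escape(s string) string {
      var buf bytes.Buffer
      for _, r := range s {
        if r > 127 || r == '<' || r == '>' || r == '&' || r == '"' || r == '\'' {
            fmt.Fprintf(&buf, "&#%d;", r)
        } else {
            buf.WriteRune(r)
        }
      }
      return buf.String()
    }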
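
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same approach with the imports it needs. The name `escapeNCR` and the sample inputs in `main` are only illustrative assumptions, not part of the original answer. The logic is the same range-over-runes loop shown above.

```go
package main

import (
	"bytes"
	"fmt"
)

// escapeNCR (hypothetical name) replaces every non-ASCII rune and the
// HTML/XML special characters with a decimal numeric character reference.
func escapeNCR(s string) string {
	var buf bytes.Buffer
	for _, r := range s { // ranging over a string decodes UTF-8 into runes
		switch {
		case r > 127, r == '<', r == '>', r == '&', r == '"', r == '\'':
			// A rune is an int32 code point, so %d prints its decimal value.
			fmt.Fprintf(&buf, "&#%d;", r)
		default:
			buf.WriteRune(r)
		}
	}
	return buf.String()
}

func main() {
	fmt.Println(escapeNCR("chitâra"))      // prints: chit&#226;ra
	fmt.Println(escapeNCR(`<a href="x">`)) // prints: &#60;a href=&#34;x&#34;&#62;
}
```

Note that ranging over a string yields U+FFFD for any invalid UTF-8 byte, so malformed input would come out as `&#65533;` with this sketch.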