
string(int), string(int32) and string([]int32) are all valid but string([]int) is invalid - what's the rationale here?


(I'm using Go 1.14.6.)

The following statements all print the character a:

fmt.Println(string(int(97)))
fmt.Println(string(int32(97)))
fmt.Println(string([]int32{97}))

But

fmt.Println(string([]int{97}))

causes a compile error:

cannot convert []int literal (type []int) to type string

This behavior confuses me. If string(int) is handled the same as string(int32), why is string([]int) handled differently from string([]int32)?


Solution

  • rune, which represents a Unicode code point, is an alias for int32. So string([]int32{}) is effectively the same as string([]rune{}), which converts a slice of runes (roughly, the characters of a string) to a string. This is useful.

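    int is neither int32 nor rune; it is a distinct type whose size is platform-dependent (32 or 64 bits). It is not obvious what converting []int to string should mean, so the language spec does not allow it: only slices of bytes and slices of runes can be converted to string.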
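If the values in a []int really are Unicode code points, the conversion can be spelled out by hand. The following is a minimal sketch; intsToString is a hypothetical helper name, not something from the standard library, and it assumes every element fits in a rune.

```go
package main

import "fmt"

// intsToString is a hypothetical helper: it treats each int as a Unicode
// code point, converts it to a rune explicitly, and builds a string.
func intsToString(codePoints []int) string {
	runes := make([]rune, len(codePoints))
	for i, cp := range codePoints {
		// Explicit conversion; values that do not fit in int32 are truncated.
		runes[i] = rune(cp)
	}
	return string(runes) // []rune to string is allowed by the spec
}

func main() {
	fmt.Println(intsToString([]int{97}))      // prints a
	fmt.Println(intsToString([]int{72, 105})) // prints Hi
}
```

The explicit rune(cp) step makes the intent visible, which is exactly what the compiler refuses to guess for []int.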

    Converting an integer to string yields a string containing a single rune. Spec: Conversions:

    Conversions to and from a string type

    1. Converting a signed or unsigned integer value to a string type yields a string containing the UTF-8 representation of the integer. Values outside the range of valid Unicode code points are converted to "\uFFFD".
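The rule quoted above can be checked with a short program. This is only a sketch: codePointToString is a hypothetical name, and it takes a rune rather than an int so that go vet (Go 1.15 and later) stays quiet.

```go
package main

import "fmt"

// codePointToString is a hypothetical helper that applies the
// integer-to-string conversion described in the spec.
func codePointToString(cp rune) string {
	return string(cp)
}

func main() {
	fmt.Println(codePointToString(97))        // prints a
	fmt.Println(len(codePointToString(9786))) // prints 3: the UTF-8 encoding of U+263A is three bytes

	// Values outside the range of valid Unicode code points become U+FFFD.
	fmt.Println(codePointToString(-1) == "\uFFFD")       // prints true
	fmt.Println(codePointToString(0x110000) == "\uFFFD") // prints true
}
```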

    This confuses many people, who expect the conversion result to be the (decimal) representation of the integer as a string. The Go authors have recognized this and have taken steps to deprecate the conversion and remove it from the language in the future. As of Go 1.15, go vet warns about such conversions. Go 1.15 release notes: Vet:

    New warning for string(x)

    The vet tool now warns about conversions of the form string(x) where x has an integer type other than rune or byte. Experience with Go has shown that many conversions of this form erroneously assume that string(x) evaluates to the string representation of the integer x. It actually evaluates to a string containing the UTF-8 encoding of the value of x. For example, string(9786) does not evaluate to the string "9786"; it evaluates to the string "\xe2\x98\xba", or "☺".

    Code that is using string(x) correctly can be rewritten to string(rune(x)). Or, in some cases, calling utf8.EncodeRune(buf, x) with a suitable byte slice buf may be the right solution. Other code should most likely use strconv.Itoa or fmt.Sprint.

    This new vet check is enabled by default when using go test.

    We are considering prohibiting the conversion in a future release of Go. That is, the language would change to only permit string(x) for integer x when the type of x is rune or byte. Such a language change would not be backward compatible. We are using this vet check as a first trial step toward changing the language.
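The rewrites suggested in the release notes can be compared side by side. This is a sketch only; the three function names are hypothetical and were chosen for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf8"
)

// asCharacter returns the UTF-8 string for code point x; the explicit
// rune conversion states the intent and avoids the vet warning.
func asCharacter(x int) string {
	return string(rune(x))
}

// asDecimal returns the decimal text of x, which is what most callers
// of string(x) actually wanted.
func asDecimal(x int) string {
	return strconv.Itoa(x)
}

// encodeBytes writes the UTF-8 encoding of x into a byte slice and
// returns only the bytes that were written.
func encodeBytes(x int) []byte {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, rune(x))
	return buf[:n]
}

func main() {
	x := 9786
	fmt.Println(asCharacter(x)) // prints ☺
	fmt.Println(asDecimal(x))   // prints 9786
	fmt.Println(encodeBytes(x)) // prints [226 152 186]
}
```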