Given the string "hogemogehogemogehogemoge世界世界世界", which of the two approaches below is better for getting the last rune while avoiding memory allocation?
There is a similar question about getting the last X characters of a Go string.
I want to know which approach is preferred if I only need the last rune, without any additional operations.
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	// Which approach is better with respect to memory allocation?
	s := "hogemogehogemogehogemoge世界世界世界a"
	getLastRune(s, 3)
	getLastRune2(s, 3)
}

// getLastRune prints the last c runes of s by decoding backwards
// from the end with utf8.DecodeLastRuneInString.
func getLastRune(s string, c int) {
	j := len(s)
	for i := 0; i < c && j > 0; i++ {
		_, size := utf8.DecodeLastRuneInString(s[:j])
		j -= size
	}
	lastByRune := s[j:]
	fmt.Println(lastByRune)
}

// getLastRune2 prints the last c runes of s by converting
// string -> []rune and back. It panics if c exceeds the rune count.
func getLastRune2(s string, c int) {
	r := []rune(s)
	lastByRune := string(r[len(r)-c:])
	fmt.Println(lastByRune)
}
Output:
世界a
世界a
Whenever performance and allocations are the question, you should run benchmarks.
First, let's modify your functions so that they return the result instead of printing it:
func getLastRune(s string, c int) string {
	j := len(s)
	for i := 0; i < c && j > 0; i++ {
		_, size := utf8.DecodeLastRuneInString(s[:j])
		j -= size
	}
	return s[j:]
}

func getLastRune2(s string, c int) string {
	r := []rune(s)
	if c > len(r) {
		c = len(r)
	}
	return string(r[len(r)-c:])
}
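Before benchmarking, it is worth confirming that the two versions agree. The following is a minimal sketch, assuming the two functions exactly as shown above; the chosen values of c (including 0 and a value larger than the rune count) are illustrative assumptions, not part of the original question.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// getLastRune walks backwards from the end of s, one rune at a time.
func getLastRune(s string, c int) string {
	j := len(s)
	for i := 0; i < c && j > 0; i++ {
		_, size := utf8.DecodeLastRuneInString(s[:j])
		j -= size
	}
	return s[j:]
}

// getLastRune2 converts the whole string to []rune and back.
func getLastRune2(s string, c int) string {
	r := []rune(s)
	if c > len(r) {
		c = len(r)
	}
	return string(r[len(r)-c:])
}

func main() {
	s := "hogemogehogemogehogemoge世界世界世界a"
	// Edge cases chosen for illustration: nothing, one rune, a few, too many.
	for _, c := range []int{0, 1, 3, 100} {
		a, b := getLastRune(s, c), getLastRune2(s, c)
		fmt.Printf("c=%d %q %q equal=%t\n", c, a, b, a == b)
	}
}
```

For valid UTF-8 input the two functions return the same string. They can differ on invalid UTF-8, because the []rune round trip replaces invalid bytes with U+FFFD while slicing keeps the original bytes.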
And the benchmark functions (placed in a _test.go file that imports "testing"):
var s = "hogemogehogemogehogemoge世界世界世界a"

func BenchmarkGetLastRune(b *testing.B) {
	for i := 0; i < b.N; i++ {
		getLastRune(s, 3)
	}
}

func BenchmarkGetLastRune2(b *testing.B) {
	for i := 0; i < b.N; i++ {
		getLastRune2(s, 3)
	}
}
Running them:
go test -bench . -benchmem
Results:
BenchmarkGetLastRune-4     30000000     36.9 ns/op     0 B/op    0 allocs/op
BenchmarkGetLastRune2-4    10000000    165 ns/op       0 B/op    0 allocs/op
getLastRune() is more than 4 times faster. Neither function allocates here, but for getLastRune2() that is only thanks to a compiler optimization: converting a string to []rune and back generally requires allocation.
If we run the benchmarks with optimizations disabled:
go test -gcflags '-N -l' -bench . -benchmem
Results:
BenchmarkGetLastRune-4     30000000     46.2 ns/op     0 B/op    0 allocs/op
BenchmarkGetLastRune2-4    10000000    197 ns/op      16 B/op    1 allocs/op
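Allocation counts can also be checked without the benchmark harness by using testing.AllocsPerRun. The sketch below is not part of the original measurements. It rests on several assumptions: a much longer input (longInput), a package-level sink variable so the results escape, and a hypothetical helper named allocs. With a long input the []rune conversion is expected to allocate even with optimizations enabled, but the exact count may vary by Go version, so none is stated here.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
	"unicode/utf8"
)

// sink is a package-level variable (an assumption of this sketch) so that
// results escape and the calls cannot be discarded by the compiler.
var sink string

// longInput is an illustrative input, far longer than the one benchmarked above.
var longInput = strings.Repeat("世界", 1000) + "a"

func getLastRune(s string, c int) string {
	j := len(s)
	for i := 0; i < c && j > 0; i++ {
		_, size := utf8.DecodeLastRuneInString(s[:j])
		j -= size
	}
	return s[j:]
}

func getLastRune2(s string, c int) string {
	r := []rune(s)
	if c > len(r) {
		c = len(r)
	}
	return string(r[len(r)-c:])
}

// allocs is a hypothetical helper: it reports the average number of heap
// allocations per call of f(s, 3).
func allocs(f func(string, int) string, s string) float64 {
	return testing.AllocsPerRun(100, func() { sink = f(s, 3) })
}

func main() {
	fmt.Println("getLastRune  allocs/op:", allocs(getLastRune, longInput))
	fmt.Println("getLastRune2 allocs/op:", allocs(getLastRune2, longInput))
}
```

getLastRune() only slices the original string, so it has nothing to allocate regardless of input length. getLastRune2() has to materialize a []rune whose size grows with the input.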
Compiler optimizations or not, getLastRune() is the clear winner.