Does the conversion from string to rune slice make a copy?


I'm teaching myself Go from a C background. The code below works as I expect (the first two Printf() will access bytes, the last two Printf() will access codepoints).

What I am not clear about is whether this involves any copying of data.

package main

import "fmt"

var a string

func main() {
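    a = "èe"
    fmt.Printf("%d\n", a[0]) // 195: first byte of è in UTF-8 (0xC3)
    fmt.Printf("%d\n", a[1]) // 168: second byte of è in UTF-8 (0xA8)
    fmt.Println("")
    fmt.Printf("%d\n", []rune(a)[0]) // 232: code point U+00E8 (è)
    fmt.Printf("%d\n", []rune(a)[1]) // 101: code point U+0065 (e)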
}

In other words:

Does []rune("string") create an array of runes and fill it with the runes corresponding to "string", or does the compiler just figure out how to get runes from the string's bytes?


There are 2 answers

blackgreen (best answer)

It involves a copy because:

  • strings are immutable; if the conversion []rune(s) didn't make a copy, you would be able to index the rune slice and change the string contents
  • a string value is a "(possibly empty) sequence of bytes", where byte is an alias for uint8, whereas a rune is "an integer value identifying a Unicode code point" and rune is an alias for int32. The element types are not identical, and even the lengths may differ:
    a = "èe"
    r := []rune(a)
    fmt.Println(len(a)) // 3 (3 bytes)
    fmt.Println(len(r)) // 2 (2 Unicode code points)
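
One way to observe the copy directly is to write to the rune slice and check that the string is unaffected. This is only a minimal sketch: the helper name editFirstRune is made up for illustration, and the literal uses the escape \u00e8 so that è is guaranteed to be the single code point U+00E8.

```go
package main

import "fmt"

// editFirstRune is a hypothetical helper: it converts s to []rune,
// overwrites the first rune, and returns the original string together
// with the string rebuilt from the edited slice.
func editFirstRune(s string, repl rune) (original, edited string) {
	r := []rune(s) // decodes the UTF-8 bytes into a new backing array
	if len(r) > 0 {
		r[0] = repl // writes to the copy only; s is untouched
	}
	return s, string(r)
}

func main() {
	orig, edited := editFirstRune("\u00e8e", 'a')
	fmt.Println(orig)   // èe
	fmt.Println(edited) // ae
}
```

If the slice shared memory with the string, the first line printed would change as well.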

Peter

It is not possible to turn a string's bytes (effectively a sequence of uint8) into []int32 (which is what []rune is, since rune is an alias for int32) without allocating a new array.
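
If the goal is only to read code points, the slice is not required: a for range loop over a string decodes the UTF-8 bytes on the fly and yields each rune with its byte offset. A small sketch, where countRunes is a hypothetical helper name:

```go
package main

import "fmt"

// countRunes is a hypothetical helper that counts code points by
// ranging over the string; no []rune slice is built.
func countRunes(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

func main() {
	s := "\u00e8e" // "èe", with è as the single code point U+00E8

	// i is the byte offset where the rune starts, r is the decoded rune.
	for i, r := range s {
		fmt.Printf("byte offset %d: U+%04X\n", i, r)
	}
	// byte offset 0: U+00E8
	// byte offset 2: U+0065

	fmt.Println(countRunes(s)) // 2
}
```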

Also, strings are immutable in Go but slices are not, so the conversion to both []byte and []rune must copy the string's bytes in some way or another.
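
The same behaviour can be checked for []byte, in both directions. This is a rough sketch under the language's conversion rules; roundTrip is a name invented for the example.

```go
package main

import "fmt"

// roundTrip is a hypothetical helper: it converts s to []byte, edits the
// slice, converts back to a string, then edits the slice again.
// It returns the original string and the converted-back string.
func roundTrip(s string) (original, converted string) {
	b := []byte(s) // copy of the string's bytes
	if len(b) > 0 {
		b[0] = 'X' // changes the copy, not s
	}
	converted = string(b) // copies again, so later writes to b do not show up
	if len(b) > 0 {
		b[0] = 'Y'
	}
	return s, converted
}

func main() {
	orig, conv := roundTrip("hello")
	fmt.Println(orig) // hello
	fmt.Println(conv) // Xello
}
```

Neither string reflects the later writes to the slice, which is what immutability requires.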