One of the new concepts developers encounter when working with strings in Go is the rune. Indexing a string yields a byte: in the code below, s[0] is a byte whose value is 72, the ASCII code for 'H'.
package main

import "fmt"

func main() {
	s := "HELP!"
	var b byte
	b = s[0]
	fmt.Println(b) // prints 72
}
However, the code below fails to compile, because ch is of type rune, not byte. Go's builtin package declares rune as an alias for int32, so a rune occupies 4 bytes while a byte (an alias for uint8) occupies 1.
package main

import "fmt"

func main() {
	s := "HELP!"
	var b byte
	for _, ch := range s {
		b = ch // compile error: ch is a rune, b is a byte
		fmt.Println(b)
	}
}
error: cannot use ch (variable of type rune) as byte value in assignment
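One way to make the loop above compile is to keep each value as a rune, or to convert it to a byte explicitly. The sketch below shows both options. The helper names collectRunes and asciiBytes are made up for this illustration, and the byte conversion is only safe under the assumption that the input is pure ASCII.

```go
package main

import "fmt"

// collectRunes (hypothetical helper) keeps the values yielded by range
// as runes, so no conversion is needed.
func collectRunes(s string) []rune {
	var out []rune
	for _, ch := range s {
		out = append(out, ch) // ch already has type rune
	}
	return out
}

// asciiBytes (hypothetical helper) converts each rune to a byte explicitly.
// Assumption: the input is ASCII. byte(ch) silently truncates larger values.
func asciiBytes(s string) []byte {
	var out []byte
	for _, ch := range s {
		out = append(out, byte(ch))
	}
	return out
}

func main() {
	fmt.Println(collectRunes("HELP!")) // prints [72 69 76 80 33]
	fmt.Println(asciiBytes("HELP!"))   // prints [72 69 76 80 33]
}
```

If the string may contain non-ASCII characters, keeping the values as runes is the safer of the two choices, since the explicit conversion discards everything above the low 8 bits.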
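The issue is that UTF-8 is a variable-width encoding: not all characters have the same encoded length. Each Unicode code point takes between 1 and 4 bytes. ASCII characters take a single byte, while all other characters take two to four, which keeps mostly-ASCII text compact. Go is designed with UTF-8 in mind, so when you range over a string, Go decodes it and yields runes (Unicode code points), even though a string is actually a read-only sequence of bytes. That is also why indexing with s[0] yields a byte rather than a rune.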
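The difference between bytes and runes only becomes visible with non-ASCII text. The following is a minimal sketch using a sample string chosen for this illustration. byteAndRuneCounts is a hypothetical helper, while utf8.RuneCountInString and utf8.RuneLen come from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) returns the length of s in bytes
// and the number of runes (code points) it contains.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Hi, 世界" // four ASCII characters followed by two 3-byte characters

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // prints 10 6

	// range yields the byte offset where each rune starts, plus the rune itself.
	for i, ch := range s {
		fmt.Printf("offset %d: %c (%d bytes)\n", i, ch, utf8.RuneLen(ch))
	}
}
```

The loop reports offsets 0, 1, 2, 3, 4 and 7. The jump from 4 to 7 happens because 世 occupies three bytes, so the index from range counts bytes, not characters.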
Read more:
1. Strings, bytes, runes and characters in Go
2. The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)