- UTF-8 encoding: Go strings use UTF-8, 1-4 bytes per character
- `rune`: Alias for `int32`, represents a single Unicode code point
- `len()` vs character count: `len()` returns bytes, not character count
- Character count: Use `len([]rune(str))` to get actual character count
- Range iteration: Use `range` to iterate over runes, not bytes
- Immutable: Strings are read-only slices of bytes
- Internal structure: String = pointer + length (like slice header)
- Copy behavior: Copying strings copies pointer and length, not underlying bytes
In this note, I will just list the things I need to remember about strings in Go.
- Go uses UTF-8 encoding for strings. A single character takes 1 to 4 bytes in UTF-8.
- Go uses `rune` (an alias for `int32`) to represent a single character (a Unicode code point) in a string.
- `len()` returns the number of bytes in a string.
- To get the number of actual characters in a string, convert the string to a slice of runes with `[]rune(str)` and take its `len()`.
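```go
package main

import "fmt"

func main() {
	str := "안녕, World!"
	fmt.Println(len(str))         // 14 (bytes: 3 + 3 for the Hangul, 8 for ", World!")
	fmt.Println(len([]rune(str))) // 10 (characters)
}
```
- To iterate over the characters of a string, use `range` rather than an index loop up to `len()`: `range` yields runes, while indexing yields bytes.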
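```go
package main

import "fmt"

func main() {
	str := "안녕, World!"
	// i is the byte offset of each rune (0, 3, 6, 7, ...); r prints as its code point number.
	for i, r := range str {
		fmt.Println(i, r)
	}
}
```
- A string in Go is a read-only (immutable) slice of bytes.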
- "read-only" means the bytes cannot be modified.
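
To make the "read-only" point concrete, here is a minimal sketch. `replaceFirstByte` is a hypothetical helper written for this note, not a standard library function. It shows that changing a string really means copying its bytes into a `[]byte`, editing the copy, and building a new string.

```go
package main

import "fmt"

// replaceFirstByte is a hypothetical helper for this sketch: it returns a new
// string whose first byte is c and leaves the original string untouched.
func replaceFirstByte(s string, c byte) string {
	if len(s) == 0 {
		return s
	}
	// s[0] = c would not compile: the bytes of a string cannot be assigned to.
	b := []byte(s) // copies the bytes into a mutable slice
	b[0] = c
	return string(b) // builds a new string from the modified copy
}

func main() {
	s := "hello"
	t := replaceFirstByte(s, 'H')
	fmt.Println(s) // hello
	fmt.Println(t) // Hello
}
```

The helper works on bytes, so it is only safe when the first character is a single-byte (ASCII) character. For multi-byte characters, converting to `[]rune` would be the safer route.
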
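```go
// Same layout as reflect.StringHeader.
type StringHeader struct {
	Data uintptr // address of the first byte
	Len  int     // length in bytes
}
```
- A string is basically a pointer and a length.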
- This means that when a string is copied, only the pointer and the length are copied.
- The copy therefore points to the same underlying bytes as the original.