I wasn’t sure what the performance impact of preallocating a slice with make
vs.
just letting the slice grow via append
is, so I investigated a little.
Arrays
In order to understand what a slice is one needs to first understand how arrays work in go.
An array’s type definition specifies a length and an element type.
For example, the type [3]int
represents an array of three integers.
An array’s size is fixed, its length is part of the type, so [2]int
and [3]int
are incompatible,
e.g. a variable of type [2]int
cannot be assigned to a variable of type [3]int
.
a := [2]int{1, 2}
var b [3]int
b = a
// Output: cannot use a (type [2]int) as type [3]int in assignment
Arrays are initialized right away, the zero value of an array is an array whose elements have their zero value.
var b [2]int
fmt.Println(b[1])
// Output: 0
In-memory a variable of type [3]int
is just 3 integers laid out sequentially.
An array is a value, an [3]int
variable are just 3 integers, being passed around.
Slices
First of all what is a slice exactly, a slice can be thought of a struct with the following values:
type slice struct {
len int // the length of the slice
cap int // the capacity of the slice i.e. the max possible len without allocating new memory
ptr unsafe.Pointer // pointer to the first Element of the underlying array
}
append
append
needs to allocate a new underlying array, if the elements to append exceed the capacity of the underlying array,
in this case a new bigger array is allocated, all elements copied from the old to the new array,
and the new elements are copied after the last element of the copied slice.
cap
is set to the length of the newly allocated array,len
is to the index of the last element added viaappend
(+ 1)ptr
is set to point to the underlying array
For performance it’s important how often the values are copied around in memory.
This is what I came up with, to determine how often the slice is reallocated and copied, I also print some useful information about the allocated capacity of the slice.
package main
import (
"fmt"
)
func main() {
elmCount := 100_000
fmt.Printf("%8v | %8v | %8v | %v\n", "len", "prev cap", "new cap", "ratio added cap")
var foo []int
for i := 0; i < elmCount; i++ {
if i == 0 {
foo = append(foo, i)
continue
}
first := &foo[0]
beforeCap := cap(foo)
foo = append(foo, i)
firstNew := &foo[0]
// if the start of the underlying array changed
if firstNew != first {
afterCap, afterLen := cap(foo), len(foo)
capRatio := float64(afterCap) / float64(beforeCap)
fmt.Printf("%8v | %8v | %8v | %.2fx\n", afterLen, beforeCap, afterCap, capRatio)
}
}
}
In the above code a slice of integers with 100 000 elements is created, by using only append
.
The above code outputs:
len | prev cap | new cap | ratio added cap
2 | 1 | 2 | 2.00x
3 | 2 | 4 | 2.00x
5 | 4 | 8 | 2.00x
9 | 8 | 16 | 2.00x
17 | 16 | 32 | 2.00x
33 | 32 | 64 | 2.00x
65 | 64 | 128 | 2.00x
129 | 128 | 256 | 2.00x
257 | 256 | 512 | 2.00x
513 | 512 | 1024 | 2.00x
1025 | 1024 | 1280 | 1.25x
1281 | 1280 | 1696 | 1.32x
1697 | 1696 | 2304 | 1.36x
2305 | 2304 | 3072 | 1.33x
3073 | 3072 | 4096 | 1.33x
4097 | 4096 | 5120 | 1.25x
5121 | 5120 | 7168 | 1.40x
7169 | 7168 | 9216 | 1.29x
9217 | 9216 | 12288 | 1.33x
12289 | 12288 | 15360 | 1.25x
15361 | 15360 | 19456 | 1.27x
19457 | 19456 | 24576 | 1.26x
24577 | 24576 | 30720 | 1.25x
30721 | 30720 | 38912 | 1.27x
38913 | 38912 | 49152 | 1.26x
49153 | 49152 | 61440 | 1.25x
61441 | 61440 | 76800 | 1.25x
76801 | 76800 | 96256 | 1.25x
96257 | 96256 | 120832 | 1.26x
Interesting it looks like append is smart and doesn’t only grow the slice to the currently needed capacity, but doubles the capacity for smaller capacities (under 1024) and grows it by ~1.25x for larger capacities (over 1024), to reduce the overhead of reallocating arrays.
In the runtime this is implemented as follows (full implementation):
func growslice(et *_type, old slice, cap int) slice {
...
newcap := old.cap
doublecap := newcap + newcap
if cap > doublecap {
newcap = cap
} else {
if old.len < 1024 {
newcap = doublecap
} else {
// Check 0 < newcap to detect overflow
// and prevent an infinite loop.
for 0 < newcap && newcap < cap {
newcap += newcap / 4
}
// Set newcap to the requested cap when
// the newcap calculation overflowed.
if newcap <= 0 {
newcap = cap
}
}
}
...
}
Performance
I benchmarked what the performance impact of preallocating a slice with make
vs auto growing via append
is, the benchmark populates a slice with N integers them via a for loop.
func plainAppend(elms int) {
var foo []int
for i := 0; i < elms; i++ {
foo = append(foo, i)
}
}
func withMake(elms int) {
foo := make([]int, 0, elms)
for i := 0; i < elms; i++ {
foo = append(foo, i)
}
}
Full Benchmark: https://gist.github.com/lu4p/4906f5d1b88f23a1f740bb8895f25e40
$ go test -bench=. -count=10 -benchtime=100x > bench.out
# https://godoc.org/golang.org/x/perf/cmd/benchstat
$ benchstat bench.out
name time/op
10/append-8 729ns ±62%
10/with_make-8 119ns ±84%
100/append-8 1.46µs ±84%
100/with_make-8 454ns ±99%
1k/append-8 5.28µs ±33%
1k/with_make-8 2.61µs ±54%
10k/append-8 111µs ±23%
10k/with_make-8 31.4µs ±11%
100k/append-8 1.16ms ± 8%
100k/with_make-8 248µs ±11%
1Million/append-8 8.57ms ± 3%
1Million/with_make-8 2.83ms ± 1%
So in a simple use case preallocating the slice with make
seems to be about 2-4 times faster.
Although this is significant, in real-world scenarios it often doesn’t really matter because appending is fast enough regardless.
If you don’t know the number of elements a to which a slice will grow, it often makes little sense to allocate memory with make beforehand. Of course there are exceptions from this, sometimes you know at least the order of magnitude to which a slice will grow, then it can make sense to allocate a slice with a cap similar to the ballpark of expected elements.
Thanks to Hu1buerger, for helping with the investigation.