Commit 50957a7

Deleplace authored and eliben committed
blog: robust generic functions on slices
Change-Id: Ib18522588d620b3ce7b2b25ac477c65e50c9771d
Reviewed-on: https://go-review.googlesource.com/c/website/+/564456
Reviewed-by: Eli Bendersky <[email protected]>
Reviewed-by: David Chase <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
1 parent 824cca5 commit 50957a7

6 files changed: +2377 -0 lines changed

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
---
title: Robust generic functions on slices
date: 2024-02-22
by:
- Valentin Deleplace
summary: Avoiding memory leaks in the slices package.
---

The [slices](/pkg/slices) package provides functions that work for slices of any type.
In this blog post we'll discuss how you can use these functions more effectively by understanding how slices are represented in memory and how that affects the garbage collector, and we'll cover how we recently adjusted these functions to make them less surprising.

With [Type parameters](/blog/deconstructing-type-parameters) we can write functions like [slices.Index](/pkg/slices#Index) once for all types of slices of comparable elements:

```
// Index returns the index of the first occurrence of v in s,
// or -1 if not present.
func Index[S ~[]E, E comparable](s S, v E) int {
	for i := range s {
		if v == s[i] {
			return i
		}
	}
	return -1
}
```

It is no longer necessary to implement `Index` again for each different type of element.
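
As a rough illustration of the same idea, the sketch below writes one more generic function in the style of `Index` and calls it on slices of different element types. The names `lastIndex` and `Names` are made up for this example and are not part of the `slices` package.

```go
package main

import "fmt"

// Names is a hypothetical named slice type; the ~[]E constraint
// accepts it as well as a plain []string.
type Names []string

// lastIndex is a hypothetical companion to Index: it returns the index
// of the last occurrence of v in s, or -1 if not present.
func lastIndex[S ~[]E, E comparable](s S, v E) int {
	for i := len(s) - 1; i >= 0; i-- {
		if v == s[i] {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(lastIndex([]int{10, 20, 10}, 10))                  // 2
	fmt.Println(lastIndex(Names{"Bat", "Fox", "Owl", "Fox"}, "Fox")) // 3
	fmt.Println(lastIndex(Names{"Bat", "Fox"}, "Cat"))             // -1
}
```

The function body is written once; the compiler infers `S` and `E` at each call site.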

The [slices](/pkg/slices) package contains many such helpers to perform common operations on slices:

```
s := []string{"Bat", "Fox", "Owl", "Fox"}
s2 := slices.Clone(s)
slices.Sort(s2)
fmt.Println(s2) // [Bat Fox Fox Owl]
s2 = slices.Compact(s2)
fmt.Println(s2)                  // [Bat Fox Owl]
fmt.Println(slices.Equal(s, s2)) // false
```

Several new functions (`Insert`, `Replace`, `Delete`, etc.) modify the slice. To understand how they work, and how to properly use them, we need to examine the underlying structure of slices.

A slice is a view of a portion of an array. [Internally](/blog/slices-intro), a slice contains a pointer, a length, and a capacity. Two slices can have the same underlying array, and can view overlapping portions.

For example, this slice `s` is a view on 4 elements of an array of size 6:

{{image "generic-slice-functions/1_sample_slice_4_6.svg" 450}}

If a function changes the length of a slice passed as a parameter, then it needs to return a new slice to the caller. The underlying array may remain the same if it doesn't have to grow. This explains why [append](/blog/slices) and `slices.Compact` return a value, but `slices.Sort`, which merely reorders the elements, does not.
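
The following sketch contrasts a helper that changes the length without returning it with one that does return it. Both helpers (`appendLost`, `appendKept`) are hypothetical and exist only to illustrate the point.

```go
package main

import "fmt"

// appendLost is deliberately flawed: it appends to its parameter
// but never hands the updated slice header back to the caller.
func appendLost(s []int, v int) {
	s = append(s, v) // only the local copy of the header changes
}

// appendKept returns the updated slice header, like the built-in append.
func appendKept(s []int, v int) []int {
	return append(s, v)
}

func main() {
	s := make([]int, 2, 4) // length 2, capacity 4: room to grow in place

	appendLost(s, 7)
	fmt.Println(len(s)) // 2

	s = appendKept(s, 7)
	fmt.Println(len(s), cap(s)) // 3 4
}
```

In both calls the underlying array is reused, since the capacity is sufficient; only the second call lets the caller see the new length.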

Consider the task of deleting a portion of a slice. Prior to generics, the standard way to delete the portion `s[2:5]` from the slice `s` was to call the [append](/ref/spec#Appending_and_copying_slices) function to copy the end portion over the middle portion:

```
s = append(s[:2], s[5:]...)
```

The syntax was complex and error-prone, involving subslices and a variadic parameter. We added [slices.Delete](/pkg/slices#Delete) to make it easier to delete elements:

```
func Delete[S ~[]E, E any](s S, i, j int) S {
	return append(s[:i], s[j:]...)
}
```

The one-line function `Delete` more clearly expresses the programmer's intent. Let’s consider a slice `s` of length 6 and capacity 8, containing pointers:

{{image "generic-slice-functions/2_sample_slice_6_8.svg" 600}}

This call deletes the elements at `s[2]`, `s[3]`, `s[4]` from the slice `s`:

```
s = slices.Delete(s, 2, 5)
```

{{image "generic-slice-functions/3_delete_s_2_5.svg" 600}}

The gap at the indices 2, 3, 4 is filled by shifting the element `s[5]` to the left, and setting the new length to `3`.
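
A small runnable version of this call, using `int` values in place of the pointers `p0` to `p5` (the helper `deleteMiddle` is a name chosen for this sketch):

```go
package main

import (
	"fmt"
	"slices"
)

// deleteMiddle builds a slice of length 6 and capacity 8 and
// deletes s[2:5] from it.
func deleteMiddle() []int {
	s := make([]int, 6, 8)
	for i := range s {
		s[i] = i
	}
	return slices.Delete(s, 2, 5)
}

func main() {
	s := deleteMiddle()
	fmt.Println(s)              // [0 1 5]
	fmt.Println(len(s), cap(s)) // 3 8
}
```

The capacity is unchanged because the elements are shifted within the same underlying array.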

`Delete` need not allocate a new array, as it shifts the elements in place. Like `append`, it returns a new slice. Many other functions in the `slices` package follow this pattern, including `Compact`, `CompactFunc`, `DeleteFunc`, `Grow`, `Insert`, and `Replace`.

When calling these functions we must consider the original slice invalid, because the underlying array has been modified. It would be a mistake to call the function but ignore the return value:

```
slices.Delete(s, 2, 5) // incorrect!
// s still has the same length, but modified contents
```
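
To make the difference concrete, here is a sketch that compares discarding the result with assigning it back. The helpers `discard` and `keep` are hypothetical; the second one also checks that no new array was allocated.

```go
package main

import (
	"fmt"
	"slices"
)

// discard calls Delete but ignores its result, then returns the
// original slice so its state can be inspected.
func discard() []string {
	s := []string{"a", "b", "c", "d", "e", "f"}
	slices.Delete(s, 2, 5) // incorrect: the result is dropped
	return s
}

// keep assigns the result back and reports whether the first element
// still lives at the same address, i.e. whether the array was reused.
func keep() ([]string, bool) {
	s := []string{"a", "b", "c", "d", "e", "f"}
	first := &s[0]
	s = slices.Delete(s, 2, 5)
	return s, first == &s[0]
}

func main() {
	bad := discard()
	// The length is unchanged, but "f" was already shifted to index 2.
	fmt.Println(len(bad), bad[2]) // 6 f

	good, sameArray := keep()
	fmt.Println(good, sameArray) // [a b f] true
}
```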

## A problem of unwanted liveness

Before Go 1.22, `slices.Delete` didn't modify the elements between the new and original lengths of the slice. While the returned slice wouldn't include these elements, the "gap" created at the end of the original, now-invalidated slice continued to hold onto them. These elements could contain pointers to large objects (a 20 MB image, for example), and the garbage collector would not release the memory associated with these objects. This resulted in a memory leak that could lead to significant performance issues.

In the example above, we’re successfully deleting the pointers `p2`, `p3`, `p4` from `s[2:5]`, by shifting one element to the left. But `p3` and `p4` are still present in the underlying array, beyond the new length of `s`. The garbage collector won’t reclaim the objects they point to. Less obviously, `p5` is not one of the deleted elements, but its memory may still leak because of the `p5` pointer kept in the gray part of the array.

This could be confusing for developers, if they were not aware that "invisible" elements were still using memory.

So we had two options:

* Either keep the efficient implementation of `Delete`. Let users set obsolete pointers to `nil` themselves, if they want to make sure the values pointed to can be freed.
* Or change `Delete` to always set the obsolete elements to zero. This is extra work, making `Delete` slightly less efficient. Zeroing pointers (setting them to `nil`) enables the garbage collection of the objects, when they become otherwise unreachable.
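
For a sense of what the first option would ask of callers, here is one possible way to write the manual cleanup. The helpers `deleteAndNil` and `sample` are hypothetical and not taken from the standard library.

```go
package main

import "fmt"

// deleteAndNil deletes s[i:j], then sets the leftover tail of the
// underlying array to nil so the garbage collector can reclaim it.
func deleteAndNil(s []*int, i, j int) []*int {
	oldLen := len(s)
	s = append(s[:i], s[j:]...)
	// The tail lies beyond the new length but within the old one;
	// reslicing up to oldLen is what makes it reachable here.
	tail := s[len(s):oldLen]
	for k := range tail {
		tail[k] = nil
	}
	return s
}

// sample returns a slice of 6 pointers with capacity 8.
func sample() []*int {
	vals := []int{0, 1, 2, 3, 4, 5}
	s := make([]*int, 0, 8)
	for i := range vals {
		s = append(s, &vals[i])
	}
	return s
}

func main() {
	orig := sample() // old header, kept only to inspect the tail
	s := deleteAndNil(orig, 2, 5)
	fmt.Println(len(s), *s[2])                                  // 3 5
	fmt.Println(orig[3] == nil, orig[4] == nil, orig[5] == nil) // true true true
}
```

Remembering the old length and reslicing past the new one are easy steps to get wrong.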

It was not obvious which option was best. The first one provided performance by default, and the second one provided memory frugality by default.

## The fix

A key observation is that "setting the obsolete pointers to `nil`" is not as easy as it seems. In fact, this task is so error-prone that we should not put the burden on the user to write it. Out of pragmatism, we chose to modify the implementation of the five functions `Compact`, `CompactFunc`, `Delete`, `DeleteFunc`, `Replace` to "clear the tail". As a nice side effect, the cognitive load is reduced and users now don’t need to worry about these memory leaks.

In Go 1.22, this is what the memory looks like after calling `Delete`:

{{image "generic-slice-functions/4_delete_s_2_5_nil.svg" 600}}

The code changed in the five functions uses the new built-in function [clear](/pkg/builtin#clear) (Go 1.21) to set the obsolete elements to the zero value of the element type of `s`:

{{image "generic-slice-functions/5_Delete_diff.png" 800}}

The zero value of `E` is `nil` when `E` is a type of pointer, slice, map, chan, or interface.
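
The idea of "clearing the tail" can be sketched as follows. This is an illustration written for this post's example, not a copy of the standard library source, and the name `deleteClear` is invented.

```go
package main

import "fmt"

// deleteClear removes s[i:j] and zeroes the obsolete elements left
// between the new length and the old length.
func deleteClear[S ~[]E, E any](s S, i, j int) S {
	oldLen := len(s)
	s = append(s[:i], s[j:]...)
	clear(s[len(s):oldLen]) // set the tail to the zero value of E
	return s
}

func main() {
	words := []string{"a", "b", "c", "d", "e", "f"}
	orig := words // old header, kept only to inspect the tail

	words = deleteClear(words, 2, 5)
	fmt.Println(words)       // [a b f]
	fmt.Printf("%q\n", orig) // ["a" "b" "f" "" "" ""]
}
```

Here `E` is `string`, so the zero value is the empty string; with a pointer element type the tail would hold `nil`.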

## Tests failing

This change has led to some tests that passed in Go 1.21 now failing in Go 1.22, when the slices functions are used incorrectly. This is good news. When you have a bug, tests should let you know.

If you ignore the return value of `Delete`:

```
slices.Delete(s, 2, 3) // !! INCORRECT !!
```

then you may incorrectly assume that `s` does not contain any nil pointer. [Example in the Go Playground](/play/p/NDHuO8vINHv).

If you ignore the return value of `Compact`:

```
slices.Sort(s)    // correct
slices.Compact(s) // !! INCORRECT !!
```

then you may incorrectly assume that `s` is properly sorted and compacted. [Example](/play/p/eFQIekiwlnu).
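
For comparison, a correct sort-and-compact sequence might look like the sketch below. The helper `uniqueSorted` is a hypothetical name; it clones its input so the caller's slice is left untouched.

```go
package main

import (
	"fmt"
	"slices"
)

// uniqueSorted returns the distinct elements of s in sorted order.
func uniqueSorted(s []string) []string {
	s = slices.Clone(s) // work on a copy of the caller's data
	slices.Sort(s)      // Sort reorders in place; nothing to assign
	return slices.Compact(s)
}

func main() {
	in := []string{"Fox", "Bat", "Owl", "Fox"}
	fmt.Println(uniqueSorted(in)) // [Bat Fox Owl]
	fmt.Println(in)               // [Fox Bat Owl Fox]
}
```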

If you assign the return value of `Delete` to another variable, and keep using the original slice:

```
u := slices.Delete(s, 2, 3) // !! INCORRECT, if you keep using s !!
```

then you may incorrectly assume that `s` does not contain any nil pointer. [Example](/play/p/rDxWmJpLOVO).

If you accidentally shadow the slice variable, and keep using the original slice:

```
s := slices.Delete(s, 2, 3) // !! INCORRECT, using := instead of = !!
```

then you may incorrectly assume that `s` does not contain any nil pointer. [Example](/play/p/KSpVpkX8sOi).
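
When the original slice genuinely has to stay usable, one possible approach is to delete from a copy instead. The helper `withoutIndex` below is a hypothetical name, sketched under the assumption that the extra allocation is acceptable.

```go
package main

import (
	"fmt"
	"slices"
)

// withoutIndex returns a copy of s with element i removed,
// leaving s itself untouched.
func withoutIndex[S ~[]E, E any](s S, i int) S {
	return slices.Delete(slices.Clone(s), i, i+1)
}

func main() {
	a, b, c, d := 1, 2, 3, 4
	s := []*int{&a, &b, &c, &d}

	u := withoutIndex(s, 2)
	fmt.Println(len(s), len(u)) // 4 3
	fmt.Println(*s[3], *u[2])   // 4 4
	fmt.Println(s[3] != nil)    // true
}
```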

## Conclusion

The API of the `slices` package is a net improvement over the traditional pre-generics syntax to delete or insert elements.

We encourage developers to use the new functions, while avoiding the "gotchas" listed above.

Thanks to the recent changes in the implementation, a class of memory leaks is automatically avoided, without any change to the API, and with no extra work for the developers.

## Further reading

The signature of the functions in the `slices` package is heavily influenced by the specifics of the representation of slices in memory. We recommend reading:

* [Go Slices: usage and internals](/blog/slices-intro)
* [Arrays, slices: The mechanics of 'append'](/blog/slices)
* The [dynamic array](https://en.wikipedia.org/wiki/Dynamic_array) data structure
* The [documentation](/pkg/slices) of the package slices

The [original proposal](/issue/63393) about zeroing obsolete elements contains many details and comments.
