Commit 987f691
ksym: Introduce disk-based symbolizer (#1587)
Context
======
There are several use-cases, such as symbolization, that conceptually
boil down to a list of tuples, each formed of an identifier and a
string, where we want to efficiently find the string for a particular
identifier.
A possible approach to solve this problem is to create a large list
in memory, sorted by identifier, so we can binary-search over the ids
and find the entry for which entry_i.Id <= Id < entry_i+1.Id.
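As a rough illustration of that in-memory approach, here is a minimal Go
sketch. The `entry` type, its field names and the `lookup` helper are
hypothetical and are not taken from the code in this commit.

```go
package main

import "sort"

// entry pairs an identifier (for example a symbol's start address) with
// its string. Type and field names are illustrative only.
type entry struct {
	ID   uint64
	Name string
}

// lookup returns the name of the entry i for which
// entries[i].ID <= id < entries[i+1].ID.
// entries must be sorted by ID in ascending order.
func lookup(entries []entry, id uint64) (string, bool) {
	// sort.Search returns the first index whose ID is strictly greater
	// than id, so the entry we want is the one right before it.
	i := sort.Search(len(entries), func(i int) bool { return entries[i].ID > id })
	if i == 0 {
		return "", false // id is smaller than every known identifier
	}
	return entries[i-1].Name, true
}
```

Every `entry` and every string lives on the Go heap here, which is the
retained anonymous memory discussed next.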
While this works well and is easy to understand and maintain, a very
large list can become a significant source of retained memory.
This issue is particularly bad when memory offloading (swap/zswap) is
not enabled, as the "cold" anonymous memory has no way to be moved to
secondary storage.
This implementation produces a simple binary format that's easy to write
and read, but most importantly, it should be efficient to query.
```
┌─────────┬────────────────────────────┬────────────────────────────────────────────┐
│         │                            │                                            │
│ Header  │  Strings with nul endings  │  Sorted ids + meta information on strings  │
│         │                            │                                            │
└─────────┴────────────────────────────┴────────────────────────────────────────────┘
```
The strings aren't deduplicated or optimized in any way, to reduce the
memory usage during the write phase.
The file is read with `mmap(2)`, to avoid performing any read system
calls while binary searching over the identifiers, and to leverage the
caching layer of the filesystem. As the data is now backed by a file
rather than anonymous memory, the OS can evict cached pages when it
needs more memory.
Test Plan
=======
- ksym tests pass;
- ran the agent for 1h without issues, spot-checked several kernel
symbols and they looked good;
In terms of efficiency, it would be best to check on Demo / prod, but
this is the early data I got after running the Agent for 10 minutes.
### Memory
**before**
```
[javierhonduco@fedora parca-agent]$ cat /proc/`pidof parca-agent-debug`/status | grep -i rss
VmRSS: 102212 kB
RssAnon: 62148 kB
RssFile: 40064 kB
RssShmem: 0 kB
```
**after**
```
[javierhonduco@fedora parca-agent]$ cat /proc/`pidof parca-agent-debug`/status | grep -i rss
VmRSS: 83996 kB
RssAnon: 39324 kB
RssFile: 40192 kB
RssShmem: 4480 kB
```
### CPU
<details>
**before**

**after**

No significant difference in CPU usage
</details>
This is roughly an 18% decrease in RSS (102212 kB to 83996 kB), which is
expected for the number of symbols that my box has. The more symbols,
the bigger the savings! :)
5 files changed, +502 -80 lines changed:
- cmd/parca-agent
- pkg/ksym