Commit 2f80d82
fix: break string references to allow GC of DecisionsStreamResponse (#126)
* fix(memory): intern origin strings and clone country codes to allow GC
## Problem
When processing decisions from the stream, strings extracted from Decision
structs (e.g., `*decision.Origin`) shared their backing byte array with the
strings held by the original struct. This prevented the GC from reclaiming
the entire DecisionsStreamResponse until those strings were no longer
referenced, causing unnecessary memory retention.
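
The sharing described above can be observed directly. The following is a minimal sketch, not code from this commit: `sharesBacking` is a hypothetical helper, and it relies on `unsafe.StringData` (Go 1.20+) purely for illustration. It shows that a substring points into its parent's bytes, while `strings.Clone` yields an independent allocation.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sharesBacking reports whether sub's bytes lie inside parent's backing array.
// Hypothetical helper for illustration only.
func sharesBacking(parent, sub string) bool {
	if len(parent) == 0 || len(sub) == 0 {
		return false
	}
	p := uintptr(unsafe.Pointer(unsafe.StringData(parent)))
	s := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return s >= p && s+uintptr(len(sub)) <= p+uintptr(len(parent))
}

func main() {
	// A large payload with a short origin at the end.
	payload := strings.Repeat("x", 1<<20) + "crowdsec"
	origin := payload[len(payload)-8:]

	// The substring keeps the whole 1 MiB payload reachable.
	fmt.Println(sharesBacking(payload, origin)) // true

	// The clone has its own 8-byte backing array.
	fmt.Println(sharesBacking(payload, strings.Clone(origin))) // false
}
```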
## Solution
1. **String interning for origins** - Origins have low cardinality (crowdsec,
cscli, lists:scenario). Interning deduplicates them and breaks references
to Decision memory: 100k decisions with the same origin need only one
string allocation.
2. **Clone country codes** - Use `strings.Clone()` to break the reference to
Decision struct memory for country-scoped decisions.
3. **Clear decisions after processing** - Set `decisions.New` and
`decisions.Deleted` to nil after processing to help GC reclaim the
DecisionsStreamResponse immediately.
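
The three steps above can be sketched together as follows. This is a simplified illustration under stated assumptions, not the code changed by this commit: the `decision` and `streamResponse` types, the `process` function, and the body of `internString` are all hypothetical stand-ins. In particular, this sketch stores the clone as both key and value so that the pool itself holds no pointer into the caller's memory; the key choice in the actual change is discussed in the review notes further down.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// decision and streamResponse are hypothetical stand-ins for the real types.
type decision struct {
	Origin *string
	Value  *string
}

type streamResponse struct {
	New     []*decision
	Deleted []*decision
}

// pool maps string contents to one canonical, independently allocated copy.
var pool sync.Map

// internString returns the canonical copy of s, cloning it on first sight.
func internString(s string) string {
	if v, ok := pool.Load(s); ok {
		return v.(string) // fast path: already interned
	}
	cloned := strings.Clone(s)
	// Key and value are both the clone (an assumption of this sketch),
	// so the pool keeps no pointer into the memory that s came from.
	actual, _ := pool.LoadOrStore(cloned, cloned)
	return actual.(string)
}

// process copies what it needs out of resp, then drops the slices.
func process(resp *streamResponse) (origins, countries []string) {
	for _, d := range resp.New {
		if d.Origin != nil {
			origins = append(origins, internString(*d.Origin))
		}
		if d.Value != nil {
			countries = append(countries, strings.Clone(*d.Value))
		}
	}
	resp.New, resp.Deleted = nil, nil // let the GC reclaim the decisions
	return origins, countries
}

func main() {
	o, c := "crowdsec", "FR"
	resp := &streamResponse{New: []*decision{
		{Origin: &o, Value: &c},
		{Origin: &o, Value: &c},
	}}
	origins, countries := process(resp)
	fmt.Println(origins, countries, resp.New == nil)
	// prints: [crowdsec crowdsec] [FR FR] true
}
```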
## Performance
```
BenchmarkInternString/existing_string 81M ops 17.35 ns/op 0 B/op
BenchmarkInternString/new_string 33M ops 38.01 ns/op 0 B/op
```
The fast path (existing string lookup) is ~17ns with zero allocations.
New strings are cloned once and cached for future use.
* refactor: address Copilot review feedback
- Use original string as key in LoadOrStore to avoid unnecessary clone
allocation when another goroutine wins the race
- Add O(n) performance warning to InternedPoolSize documentation
- Simplify benchmark pre-population (only need one call per unique string)
- Pre-generate unique strings in new_string benchmark to isolate interning
cost from string construction overhead
- Restructure origin interning to avoid intermediate string when not needed
* refactor: address valid Copilot feedback (2 of 3)
- Fix benchmark timer ordering: clear pool before b.ResetTimer()
- Store *decision.Origin in variable to avoid double dereference
Note: Rejected Copilot's suggestion to revert LoadOrStore key from 's'
back to 'cloned'. Using 's' as key is correct because:
1. sync.Map compares keys by value (string contents), not pointer
2. The key is only used for lookup - the VALUE (cloned) is what's stored
3. Using 's' avoids wasting a clone allocation when another goroutine wins
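
Point 1 of the rationale above (keys are compared by contents, not by pointer) can be checked with a short sketch. `loadByContents` is a hypothetical helper written for this illustration; it stores an entry under one allocation of a string and looks it up with a separately allocated string that has the same contents.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// loadByContents stores "canonical" under one copy of contents, then looks
// it up using a second, separately allocated copy. Hypothetical helper.
func loadByContents(contents string) (string, bool) {
	var m sync.Map
	stored := strings.Clone(contents)
	m.Store(stored, "canonical")

	probe := strings.Clone(contents) // distinct allocation, same contents
	v, ok := m.Load(probe)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func main() {
	v, ok := loadByContents("crowdsec")
	fmt.Println(v, ok) // canonical true
}
```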
* Update pkg/dataset/root.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update pkg/dataset/root.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

1 parent d1049e5 commit 2f80d82
2 files changed: +21 -6 lines changed