@pashneal (Contributor) commented Jan 29, 2026

Description

This PR fixes an issue seen in customer environments where forwarded logs arrive corrupted or malformed.

Each call to bufio.Scanner.Scan() may overwrite the internal []byte buffer behind bufio.Scanner.Bytes(), so it is not safe to retain that slice. The forwarder did exactly that: when PII rules were not set, Log.Content held the slice directly. On large enough logs, bufio.Scanner overwrote the buffer, and Log.Content then contained junk data.

This affected 2-3% of forwarded logs for a number of customers and is the root cause of some additional issues.

Visual Explanation

bufio.Scanner.Bytes() returns a slice that points directly into the scanner's internal buffer, not a copy. The buffer is 5MB; once the total scanned data exceeds that size, the buffer wraps and earlier entries are overwritten.

Step 1: Scanning Fills Buffer (total data < 5MB)

Scanner Internal Buffer (5MB):

Offset | Content
0x1000 | {"data":"AAAA"}
0x1020 | {"data":"BBBB"}
...    | (unused)

logs[].Content:

Index | Pointer  | Value
0     | → 0x1000 | {"data":"AAAA"}
1     | → 0x1020 | {"data":"BBBB"}

Step 2: Buffer Wraps (total data > 5MB)

Once the buffer fills up, the scanner overwrites its earlier segments rather than allocating more memory.

Scanner Internal Buffer (5MB):

Offset | Content
0x1000 | {"data":"CCCCCCCCCCCCCCCCCC"}
...    | (used)

logs[].Content:

Index | Pointer  | Value           | Status
0     | → 0x1000 | {"data":"CCCCCC | WAS "AAAA"
1     | → 0x1020 | CCCCCCCCCC"}    | (!!!) GARBAGE (!!!)

All Log.Content slices point into the same 5MB buffer. Once the input exceeds the buffer's capacity, new data is written from the start of the buffer again, and a longer entry runs past the boundaries of the entries it replaces, corrupting several earlier entries at once.

Fix

To fix this issue, we clone the data for each log instead of reusing the direct slice from bufio.Scanner.Bytes().

Metadata

Jira issue:

This change affects:

  • Control Plane Tasks
  • Forwarder
  • ARM Templates (Bicep)
  • Uninstall Script
  • CI/Documentation

Testing

A regression test that verifies the fix has been added. All other tests pass.

Rollout

  • I have verified that this change is backwards compatible.

@pashneal pashneal changed the title from "[CLOUDS-7101] [CLOUDS-7233] [CLOUDS-7238] Fix Memory Corruption From Internal Buffer Aliasing" to "Fix Memory Corruption From Internal Buffer Aliasing" Jan 29, 2026
@pashneal pashneal requested a review from mattsp1290 January 29, 2026 17:04
@pashneal pashneal marked this pull request as ready for review January 29, 2026 17:04
@pashneal pashneal requested a review from a team as a code owner January 29, 2026 17:04
@cit-pr-commenter-54b7da

Coverage Report

Control Plane Coverage:

Filename | Stmts | Miss | Cover | Missing
control_plane/cache/resources_cache.py | 42 | 1 | 97.62% | 104
control_plane/cache/user_config.py | 18 | 3 | 83.33% | 27-29
control_plane/scripts/initial_run.py | 46 | 3 | 93.48% | 44, 95-97
control_plane/scripts/uninstall.py | 522 | 248 | 52.49% | 276-277, 291, 310, 339-347, 352-360, 364-373, 377-381, 390-419, 426-461, 491-494, 546-547, 581, 617-618, 643-646, 680-683, 691-696, 715-716, 729, 766-769, 773-784, 788-801, 805-813, 817-843, 857-874, 895, 908-912, 925-950, 962-971, 997, 1024-1078, 1103-1172, 1176-1177
control_plane/scripts/tests/test_uninstall.py | 308 | 3 | 99.03% | 107-109
control_plane/tasks/deployer_task.py | 115 | 6 | 94.78% | 127-128, 180, 188-189, 205
control_plane/tasks/diagnostic_settings_task.py | 155 | 25 | 83.87% | 168-207, 324, 327, 329-332, 371
control_plane/tasks/resources_task.py | 70 | 1 | 98.57% | 128
control_plane/tasks/scaling_task.py | 296 | 23 | 92.23% | 219, 235, 259, 323-329, 342-343, 380-383, 412-429, 479, 547-550, 698
control_plane/tasks/task.py | 120 | 24 | 80.00% | 75-79, 105-110, 117-118, 144-145, 200-209
control_plane/tasks/telemetry.py | 11 | 7 | 36.36% | 14-21
control_plane/tasks/client/log_forwarder_client.py | 234 | 1 | 99.57% | 255
control_plane/tasks/client/resource_client.py | 108 | 4 | 96.30% | 81, 194-196
TOTAL | 4549 | 349 | 92.33% |

Forwarder Coverage:

Filename | Stmts | Miss | Cover | Missing
cmd/forwarder/forwarder.go | 332 | 128 | 61.45% | 49-55, 63-67, 81-83, 95-96, 107-108, 114-115, 175-177, 199-202, 208-210, 301-303, 323-328, 348-350, 355-450
internal/collections/funcslices.go | 15 | 0 | 100.00% |
internal/collections/iterator.go | 37 | 37 | 0.00% | 25-75
internal/cursor/cursor.go | 67 | 16 | 76.12% | 106, 114-117, 40-42, 54-58, 77-79
internal/deadletterqueue/dead_letter_queue.go | 59 | 27 | 54.24% | 44, 53-55, 85-88, 91-93, 95-97, 102-117
internal/environment/variables.go | 9 | 3 | 66.67% | 34-36
internal/logs/client.go | 53 | 17 | 67.92% | 60-62, 70-72, 79-82, 84-87, 103-105
internal/logs/datadog.go | 58 | 31 | 46.55% | 47-83
internal/logs/hooks.go | 23 | 0 | 100.00% |
internal/logs/javascriptobjects.go | 56 | 17 | 69.64% | 23-25, 33-35, 43-44, 55-57, 66-68, 79-81
internal/logs/models.go | 246 | 51 | 79.27% | 37-39, 51, 80, 100-102, 118-120, 147, 228-230, 236-238, 242-244, 247-249, 254-256, 261-263, 273-279, 286-288, 339-341, 344-346, 396-398, 401-403
internal/logs/parse.go | 162 | 65 | 59.88% | 52-56, 62-64, 269, 82-86, 90-94, 96-99, 127-135, 140-151, 155-157, 191-195, 198-202, 208-210, 228-232, 234-236
internal/logs/piiscrubber.go | 14 | 0 | 100.00% |
internal/logs/tags.go | 14 | 3 | 78.57% | 32-34
internal/logs/mocks/mock_datadog.go | 24 | 24 | 0.00% | 33-63
internal/logs/mocks/mock_piiscrubber.go | 18 | 18 | 0.00% | 29-52
internal/metrics/metrics.go | 23 | 23 | 0.00% | 26-57, 46-51
internal/pointer/pointer.go | 3 | 3 | 0.00% | 8-10
internal/storage/blobs.go | 94 | 19 | 79.79% | 76-86, 121-123, 142-144, 147-149
internal/storage/client.go | 5 | 0 | 100.00% |
internal/storage/containers.go | 21 | 8 | 61.90% | 50-57
internal/storage/errors.go | 3 | 3 | 0.00% | 13-15
internal/storage/segments.go | 20 | 0 | 100.00% |
internal/storage/mocks/mock_client.go | 61 | 61 | 0.00% | 33-109, 54-115
TOTAL | 1417 | 554 | 60.90% |

@pashneal pashneal changed the title from "Fix Memory Corruption From Internal Buffer Aliasing" to "Fix Data Corruption From Internal Buffer Aliasing" Jan 29, 2026
Member

@mattsp1290 mattsp1290 left a comment

lgtm! nice catch!

@pashneal pashneal merged commit 9bf7dfb into main Feb 2, 2026
20 checks passed
@pashneal pashneal deleted the neal.powell/CLOUDS-7101/CLOUDS-7233/CLOUDS-7238/fix.memory.corruption.from.internal.buffer branch February 2, 2026 13:52