Commit 3730a2e

Remove base64 encoding

codetracer-python-recorder/Cargo.lock:
codetracer-python-recorder/Cargo.toml:
codetracer-python-recorder/src/runtime/mod.rs:
design-docs/adr/0008-line-aware-io-capture.md:
design-docs/io-capture-line-proxy-implementation-plan.md:

Signed-off-by: Tzanko Matev <[email protected]>
1 parent 2b10d70 commit 3730a2e

5 files changed: 4 additions & 11 deletions

codetracer-python-recorder/Cargo.lock

Lines changed: 0 additions & 1 deletion
Some generated files are not rendered by default.

codetracer-python-recorder/Cargo.toml

Lines changed: 0 additions & 1 deletion
@@ -31,7 +31,6 @@ serde = { version = "1.0", features = ["derive"] }
 serde_json = "1.0"
 uuid = { version = "1.10", features = ["v4"] }
 recorder-errors = { version = "0.1.0", path = "crates/recorder-errors" }
-base64 = "0.22"

 [dev-dependencies]
 pyo3 = { version = "0.25.1", features = ["auto-initialize"] }

codetracer-python-recorder/src/runtime/mod.rs

Lines changed: 2 additions & 8 deletions
@@ -47,9 +47,6 @@ use crate::runtime::io_capture::{
     IoChunk, IoChunkConsumer, IoChunkFlags, IoEventSink, IoStream, IoStreamProxies, ProxySink,
     ScopedMuteIoCapture,
 };
-
-use base64::engine::general_purpose::STANDARD as BASE64;
-use base64::Engine;
 use serde::Serialize;
 use serde_json;

@@ -389,7 +386,7 @@ impl RuntimeTracer {
 };

 let metadata = self.build_io_metadata(&chunk);
-let content = BASE64.encode(&chunk.payload);
+let content = String::from_utf8_lossy(&chunk.payload).into_owned();

 TraceWriter::add_event(
     &mut self.writer,
@@ -814,8 +811,6 @@ mod tests {
 use super::*;
 use crate::monitoring::CallbackOutcome;
 use crate::policy;
-use base64::engine::general_purpose::STANDARD as BASE64;
-use base64::Engine;
 use pyo3::types::{PyAny, PyCode, PyModule};
 use pyo3::wrap_pyfunction;
 use runtime_tracing::{FullValueRecord, StepRecord, TraceLowLevelEvent, ValueRecord};
@@ -1112,8 +1107,7 @@ result = compute()\n"
 .filter_map(|event| match event {
     TraceLowLevelEvent::Event(record) => {
         let metadata: IoMetadata = serde_json::from_str(&record.metadata).ok()?;
-        let payload = BASE64.decode(&record.content).ok()?;
-        Some((metadata, payload))
+        Some((metadata, record.content.as_bytes().to_vec()))
     }
     _ => None,
 })
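
Replacing base64 with `String::from_utf8_lossy` changes which payloads round-trip. The following is a minimal standard-library sketch of that behaviour; `encode_payload` is a hypothetical helper name that mirrors the new `content` expression and is not a function from the recorder.

```rust
/// Hypothetical helper mirroring the new `content` expression in the diff.
fn encode_payload(payload: &[u8]) -> String {
    String::from_utf8_lossy(payload).into_owned()
}

fn main() {
    // Valid UTF-8 survives unchanged, so `as_bytes()` returns the original bytes.
    let valid = "héllo\n".as_bytes();
    assert_eq!(encode_payload(valid).as_bytes(), valid);

    // Invalid UTF-8 does not round-trip: the invalid byte becomes U+FFFD.
    let invalid = [b'o', b'k', 0xFF];
    let content = encode_payload(&invalid);
    assert_eq!(content, "ok\u{FFFD}");
    assert_ne!(content.as_bytes(), &invalid[..]);
}
```

Under this sketch, a test that compares `record.content.as_bytes()` against the raw payload holds only when the payload is valid UTF-8.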

design-docs/adr/0008-line-aware-io-capture.md

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@
 7. Guard our own logging with a `ScopedMuteIoCapture` RAII helper. It sets a thread-local flag so proxy callbacks short-circuit when the recorder writes to stderr.
 8. Add an optional best-effort FD mirror for `stdout`/`stderr`. When enabled it duplicates the file descriptors and spawns a reader that only handles writes not seen by the proxies. We track bytes seen by proxies in a per-stream ledger that keeps a FIFO byte buffer plus a monotonic sequence ID. The mirror removes ledger bytes from every read chunk with a streaming diff: scan left to right, skip native bytes, and peel ledger entries whenever their bytes appear, even when native writes arrive first in the chunk. Whatever bytes remain become mirror-only output. The GIL keeps Python `write` calls serial, so proxy order matches the order the OS sees. When native code writes directly to the FD it appears in the diff as leftover bytes we record with a `FdMirror` source tag.
 9. Expose a policy flag `policy.io_capture.line_proxies` defaulting to `true`. The FD mirror stays off by default and hides behind `policy.io_capture.fd_fallback`.
+10. Encode captured payloads as raw UTF-8 strings when forwarding to `runtime_tracing`. We trim the old manual base64 layer so downstream tooling, including the Codetracer UI, can consume the bytes without a second decode pass. Non UTF-8 input falls back to lossless bytes from the mirror or replacement characters when proxies surface decoded text.

 ## Consequences
 - **Pros:** We align IO chunks with the current Python frame, match C extensions that honour `sys.stdout`, and keep console behaviour untouched. The design lives inside the existing lifecycle code.
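
Item 7 of the ADR describes an RAII guard that sets a thread-local flag. A rough sketch of that pattern follows. The names `IO_CAPTURE_MUTED`, `ScopedMute`, and `is_muted` are assumptions chosen for illustration and may not match the recorder's actual `ScopedMuteIoCapture` implementation.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical flag: true while the recorder itself is writing.
    static IO_CAPTURE_MUTED: Cell<bool> = Cell::new(false);
}

/// Sketch of an RAII guard: mutes capture on creation, restores the
/// previous state on drop so nested guards behave correctly.
struct ScopedMute {
    previous: bool,
}

impl ScopedMute {
    fn new() -> Self {
        let previous = IO_CAPTURE_MUTED.with(|flag| flag.replace(true));
        ScopedMute { previous }
    }
}

impl Drop for ScopedMute {
    fn drop(&mut self) {
        IO_CAPTURE_MUTED.with(|flag| flag.set(self.previous));
    }
}

/// A proxy callback would check this and return early when muted.
fn is_muted() -> bool {
    IO_CAPTURE_MUTED.with(|flag| flag.get())
}

fn main() {
    assert!(!is_muted());
    {
        let _guard = ScopedMute::new();
        assert!(is_muted()); // recorder logging would go here
    }
    assert!(!is_muted()); // flag restored when the guard drops
}
```

Storing the previous value instead of unconditionally clearing the flag keeps an outer guard in effect when an inner guard drops.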

design-docs/io-capture-line-proxy-implementation-plan.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ This plan replaces the old pipe-based capture plan. Sentences stay short for eas
 - Add `IoChunk` struct holding `{stream, payload, thread_id, snapshot, timestamp, flags}`.
 - `IoEventSink` groups writes by thread and stream. Batches flush when we hit newline, explicit flush, a Step boundary, or a 5 ms timer. Use `parking_lot::Mutex` and a `once_cell::sync::Lazy` timer wheel to keep locking simple.
 - Provide `flush_before_step(thread_id)` API. The monitoring callbacks call it right before they emit a Step event, then record the Step, then update the snapshot store. This enforces the `Step -> IO -> next Step` ordering.
-- Convert chunks into runtime trace events right after batching. Reuse existing encoders.
+- Convert chunks into runtime trace events right after batching. Feed raw payload bytes into `runtime_tracing` so consumers do not need to undo an extra base64 layer.
 - Integrate with the recorder error macros for faults (`usage!`, `ioerr!`).
 - Tests: unit tests for batching rules, timer flush, newline handling, guard on recursion, and the Step ordering API.
 - Exit: sink drops zero events during stress tests that flood stdout with short writes.
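
The batching rules in the plan can be sketched as follows. This is a simplified illustration, not the recorder's `IoEventSink`: it omits the 5 ms timer and all locking, emits one chunk per completed line, and the type `LineBatcher` and its fields are hypothetical names. Only `flush_before_step(thread_id)` is taken from the plan.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Stream {
    Stdout,
    Stderr,
}

/// Hypothetical, simplified sink: groups writes by (thread, stream).
#[derive(Default)]
struct LineBatcher {
    pending: HashMap<(u64, Stream), Vec<u8>>,
    emitted: Vec<(u64, Stream, Vec<u8>)>,
}

impl LineBatcher {
    /// Buffer a write and emit one chunk for every completed line.
    fn write(&mut self, thread_id: u64, stream: Stream, data: &[u8]) {
        let buf = self.pending.entry((thread_id, stream)).or_default();
        buf.extend_from_slice(data);
        while let Some(pos) = buf.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = buf.drain(..=pos).collect();
            self.emitted.push((thread_id, stream, line));
        }
    }

    /// Emit whatever is still buffered for this thread, so IO recorded
    /// so far is ordered before the Step event that follows.
    fn flush_before_step(&mut self, thread_id: u64) {
        for stream in [Stream::Stdout, Stream::Stderr] {
            if let Some(buf) = self.pending.remove(&(thread_id, stream)) {
                if !buf.is_empty() {
                    self.emitted.push((thread_id, stream, buf));
                }
            }
        }
    }
}

fn main() {
    let mut sink = LineBatcher::default();
    sink.write(1, Stream::Stdout, b"hel");
    sink.write(1, Stream::Stdout, b"lo\nwor");
    assert_eq!(sink.emitted.len(), 1); // only the completed line so far
    assert_eq!(sink.emitted[0].2, b"hello\n".to_vec());

    sink.flush_before_step(1);
    assert_eq!(sink.emitted[1].2, b"wor".to_vec()); // partial line flushed
}
```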
