feat: request-time data encryption#349

Open
infiniteregrets wants to merge 45 commits into main from m/encryption

Conversation

@infiniteregrets
Member

No description provided.

@infiniteregrets infiniteregrets marked this pull request as draft March 23, 2026 14:46
@infiniteregrets infiniteregrets changed the title feat: request-time data encryption [WIP] feat: request-time data encryption Mar 23, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 23, 2026

Greptile Summary

This WIP PR adds request-time data encryption to the S2 platform, primarily targeting S2 Lite. Clients supply an encryption header containing an algorithm identifier and a 32-byte key encoded as hex; the server encrypts records at append time and decrypts at read time using the per-request key, with the basin and stream name as AEAD additional authenticated data (AAD). An attest mode lets clients declare they will handle their own encryption without sending a key.

Key highlights:

- Ciphertext format is well-designed: a version byte followed by an algorithm byte, a random nonce, ciphertext, and an authentication tag. The format comment and comprehensive tests (round-trips, wrong key, wrong AAD, truncated ciphertext, bit-flip detection) are good engineering practice.
- SDK: The encryption header is stored separately from default_headers and injected only at append and read call sites, correctly scoping key material to data-plane requests.
- Command-line interface: EncryptionArgs and DecryptionArgs cleanly separate the write path (algorithm required) from the read path (algorithm inferred from stored ciphertext); early hex-key validation gives users a clear error.
- Minor security hygiene (P2): In parse_encryption_key, the Vec holding decoded key bytes is correctly zeroized, but try_into() copies those bytes into an intermediate stack-allocated array that is never zeroized. Decoding directly into a heap-allocated struct would eliminate this copy.
- One test has a misleading name; the actual failure is a missing key parameter, not a semicolon issue.
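To make the ciphertext layout described above concrete, here is a minimal sketch of framing and unframing such an envelope. It does not perform any encryption and is not the PR's code: the `AlgId` byte values, the second algorithm (`Aes256Gcm`), the nonce and tag lengths, and every function name are assumptions chosen for illustration.

```rust
/// Hypothetical algorithm identifiers. The real byte values, and the name of
/// the second supported algorithm, are not stated in the review summary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AlgId {
    Aegis256 = 1,
    Aes256Gcm = 2,
}

impl AlgId {
    fn from_byte(b: u8) -> Option<Self> {
        match b {
            1 => Some(AlgId::Aegis256),
            2 => Some(AlgId::Aes256Gcm),
            _ => None,
        }
    }

    /// (nonce length, tag length) in bytes. Assumed sizes for this sketch.
    fn sizes(self) -> (usize, usize) {
        match self {
            AlgId::Aegis256 => (32, 16),
            AlgId::Aes256Gcm => (12, 16),
        }
    }
}

/// Assumed value of the leading version byte.
const VERSION_V1: u8 = 1;

#[derive(Debug, PartialEq, Eq)]
struct Envelope<'a> {
    alg: AlgId,
    nonce: &'a [u8],
    ciphertext: &'a [u8],
    tag: &'a [u8],
}

#[derive(Debug, PartialEq, Eq)]
enum EnvelopeError {
    Truncated,
    UnknownVersion(u8),
    UnknownAlgorithm(u8),
}

/// Lay out: version byte, algorithm byte, nonce, ciphertext, tag.
fn encode_envelope(alg: AlgId, nonce: &[u8], ciphertext: &[u8], tag: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(2 + nonce.len() + ciphertext.len() + tag.len());
    out.push(VERSION_V1);
    out.push(alg as u8);
    out.extend_from_slice(nonce);
    out.extend_from_slice(ciphertext);
    out.extend_from_slice(tag);
    out
}

/// Split a stored record back into its parts; the algorithm comes from the
/// envelope itself, so a reader only needs to supply the key.
fn decode_envelope(bytes: &[u8]) -> Result<Envelope<'_>, EnvelopeError> {
    let (&version, rest) = bytes.split_first().ok_or(EnvelopeError::Truncated)?;
    if version != VERSION_V1 {
        return Err(EnvelopeError::UnknownVersion(version));
    }
    let (&alg_byte, rest) = rest.split_first().ok_or(EnvelopeError::Truncated)?;
    let alg = AlgId::from_byte(alg_byte).ok_or(EnvelopeError::UnknownAlgorithm(alg_byte))?;
    let (nonce_len, tag_len) = alg.sizes();
    if rest.len() < nonce_len + tag_len {
        return Err(EnvelopeError::Truncated);
    }
    let (nonce, rest) = rest.split_at(nonce_len);
    let (ciphertext, tag) = rest.split_at(rest.len() - tag_len);
    Ok(Envelope { alg, nonce, ciphertext, tag })
}
```

Because the algorithm byte travels with each record, `decode_envelope` can reject truncated or unknown input before any key is touched, which is the property the truncated-ciphertext tests mentioned above would exercise.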

Confidence Score: 5/5

Safe to merge as a WIP; all remaining findings are P2 style/hygiene items that do not block correctness.

The encryption logic is sound: authenticated encryption with stream-bound AAD, version and algorithm bytes embedded in ciphertext, and a comprehensive test suite covering wrong-key, wrong-AAD, truncated ciphertext, and bit-flip scenarios. The only unaddressed findings are a P2 zeroization hygiene issue and a misleading test name, neither of which affects runtime correctness or security in a meaningful way for a WIP branch.

common/src/encryption.rs — zeroization of the intermediate stack copy of key bytes in parse_encryption_key.

Important Files Changed

| Filename | Overview |
| --- | --- |
| common/src/encryption.rs | Core encryption module: well-structured AEAD ciphertext format with version + alg bytes, correct use of random nonces, strong test coverage. Minor: intermediate stack copy of key bytes in parse_encryption_key is not zeroized; misleading test name parse_header_malformed_no_semicolon. |
| sdk/src/api.rs | Encryption header is stored separately from default_headers and injected only at append/read call sites via set_encryption_header, correctly scoping it to data-plane requests only. |
| lite/src/handlers/v1/records.rs | Server-side encryption/decryption wired cleanly via EncryptionContext; algorithm required for append, derived from ciphertext for read; all three read modes (unary, SSE, S2s) handled correctly. |
| lite/src/backend/error.rs | Adds ReadError::Encryption variant backed by s2_common::encryption::EncryptionError; straightforward error propagation addition. |
| lite/src/handlers/v1/error.rs | Maps new ReadError::Encryption to HTTP 400 ErrorCode::Invalid; reasonable mapping for a malformed-key or decryption-failure scenario. |
| sdk/src/types.rs | Adds EncryptionConfig enum and S2Config::with_encryption; Debug impl redacts the key value. |
| cli/src/cli.rs | Adds EncryptionArgs (for append) and DecryptionArgs (for read/tail) with proper clap conflicts/groups; algorithm flag correctly requires a key source. |
| cli/src/main.rs | Correctly dispatches encryption config to append and decryption config to read/tail; hex validation is consistent with server-side checks. |
| cli/src/error.rs | Adds InvalidEncryptionKey error variant with an appropriate diagnostic help message. |
| common/src/types/config.rs | Adds EncryptionAlgorithm enum with strum serialization for the two supported algorithms; straightforward addition. |
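The sdk/src/types.rs entry mentions a Debug impl that redacts the key. A minimal sketch of that pattern follows; the variant shapes and field names (`key_hex`, a string-typed `alg`) are assumptions for illustration and may not match the SDK's actual `EncryptionConfig`.

```rust
use std::fmt;

/// Hypothetical shape of the SDK config; field names are illustrative only.
enum EncryptionConfig {
    Key { alg: Option<String>, key_hex: String },
    Attest,
}

// Debug is written by hand rather than derived so that the key never reaches
// logs or panic messages through `{:?}` formatting.
impl fmt::Debug for EncryptionConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EncryptionConfig::Key { alg, .. } => f
                .debug_struct("Key")
                .field("alg", alg)
                .field("key", &"[REDACTED]")
                .finish(),
            EncryptionConfig::Attest => f.write_str("Attest"),
        }
    }
}
```

Formatting a `Key` value with `{:?}` then shows the algorithm but only a placeholder where the key would be.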

Sequence Diagram

sequenceDiagram
    participant Client
    participant LiteServer as S2 Lite Server
    participant Storage

    Note over Client,Storage: Append with server-side encryption
    Client->>LiteServer: POST /records (encryption header with alg + key)
    LiteServer->>LiteServer: encrypt_append_input()
    LiteServer->>Storage: Persist ciphertext records
    LiteServer-->>Client: AppendAck

    Note over Client,Storage: Read with server-side decryption
    Client->>LiteServer: GET /records (encryption header with key)
    LiteServer->>Storage: Fetch ciphertext records
    LiteServer->>LiteServer: decrypt_read_batch()
    LiteServer-->>Client: Plaintext records

    Note over Client,Storage: Attest mode
    Client->>LiteServer: POST /records (attest header, pre-encrypted body)
    LiteServer->>Storage: Persist as-is
    LiteServer-->>Client: AppendAck
Prompt To Fix All With AI
This is a comment left during a code review.
Path: common/src/encryption.rs
Line: 177-192

Comment:
**Key bytes copied to unzeroized stack array**

`hex::decode` writes into `key_bytes` (a `Vec`), which is correctly zeroized before returning. However, the `try_into()` call at line 180 **copies** those bytes into `key_array: [u8; 32]`, a plain stack-allocated array. This copy is never zeroized — a Rust move does not clear the source location, so the raw key material can linger on the stack until the frame is overwritten.

A cleaner approach is to decode directly into a heap-allocated `KeyBytes` struct, avoiding the intermediate stack copy entirely:

```rust
let mut key_box = Box::new(KeyBytes([0u8; 32]));
hex::decode_to_slice(key_hex, &mut key_box.0)
    .map_err(|e| EncryptionError::MalformedHeader(format!("key is not valid hex: {e}")))?;
Ok(SecretBox::new(key_box))
```

This keeps the key bytes only in the heap-allocated box, which `KeyBytes`'s `Zeroize` impl will clear on drop.
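For readers without the `hex`, `zeroize`, or `secrecy` crates at hand, the same idea can be sketched with only the standard library. This is an illustration under stated assumptions, not the PR's implementation: `KeyBytes` here is a stand-in type, `wipe` uses volatile writes in place of the `Zeroize` impl, and `parse_key_hex` and `hex_val` are hypothetical helpers.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Stand-in for the PR's `KeyBytes`; wiped explicitly or on drop.
struct KeyBytes([u8; 32]);

impl KeyBytes {
    /// Overwrite the key with zeros using volatile writes so the compiler
    /// cannot elide the stores as dead.
    fn wipe(&mut self) {
        for byte in self.0.iter_mut() {
            // SAFETY: `byte` is a valid, aligned, exclusive reference.
            unsafe { ptr::write_volatile(byte, 0) };
        }
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for KeyBytes {
    fn drop(&mut self) {
        self.wipe();
    }
}

/// Convert one ASCII hex digit to its value.
fn hex_val(c: u8) -> Result<u8, String> {
    match c {
        b'0'..=b'9' => Ok(c - b'0'),
        b'a'..=b'f' => Ok(c - b'a' + 10),
        b'A'..=b'F' => Ok(c - b'A' + 10),
        _ => Err("key is not valid hex".to_string()),
    }
}

/// Decode 64 hex characters straight into a heap-allocated `KeyBytes`,
/// so no full 32-byte copy of the key is placed in a stack array.
/// Single nibbles still pass through locals; this sketch does not address that.
fn parse_key_hex(key_hex: &str) -> Result<Box<KeyBytes>, String> {
    let hex = key_hex.as_bytes();
    if hex.len() != 64 {
        return Err(format!("expected 64 hex characters, got {}", hex.len()));
    }
    let mut key = Box::new(KeyBytes([0u8; 32]));
    for (i, pair) in hex.chunks_exact(2).enumerate() {
        // On an early return, `key` is dropped and therefore wiped.
        let hi = hex_val(pair[0])?;
        let lo = hex_val(pair[1])?;
        key.0[i] = (hi << 4) | lo;
    }
    Ok(key)
}
```

The buffer is allocated on the heap before any key material exists, so a partially decoded key is also cleared if decoding fails midway.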

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: common/src/encryption.rs
Line: 654-663

Comment:
**Misleading test name**

The test is named `parse_header_malformed_no_semicolon`, but the header value `"alg=aegis-256"` does not fail due to a missing semicolon. Splitting by `;` yields one part — `"alg=aegis-256"` — which is parsed successfully as `alg=Some(Aegis256)`. The error is actually `MalformedHeader("missing 'key=' parameter")`. Consider renaming to something like `parse_header_alg_only_missing_key` to accurately reflect which validation fails.
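The parsing behaviour described in this comment can be sketched as follows. This is a hedged reconstruction from the comment and the commit notes, not the code in common/src/encryption.rs: `parse_header`, `ParsedHeader`, `HeaderError`, the whitespace handling, and the rejection of unknown parameters are all assumptions, and only `aegis-256` is recognized here because the other algorithm name does not appear on this page.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Algorithm {
    Aegis256,
}

#[derive(Debug, PartialEq, Eq)]
enum HeaderError {
    MalformedHeader(String),
}

/// Debug is intentionally not derived so the key cannot be logged by accident.
struct ParsedHeader {
    alg: Option<Algorithm>,
    key_hex: String,
}

fn malformed(msg: String) -> HeaderError {
    HeaderError::MalformedHeader(msg)
}

fn parse_alg(name: &str) -> Result<Algorithm, HeaderError> {
    match name {
        "aegis-256" => Ok(Algorithm::Aegis256),
        other => Err(malformed(format!("unsupported algorithm '{}'", other))),
    }
}

/// Parse a header value such as "alg=aegis-256; key=<64 hex chars>".
/// `alg` is optional (key-only headers are allowed); `key` is required.
fn parse_header(value: &str) -> Result<ParsedHeader, HeaderError> {
    let mut alg = None;
    let mut key_hex = None;
    for part in value.split(';') {
        let part = part.trim();
        if part.is_empty() {
            continue;
        }
        let (name, val) = part
            .split_once('=')
            .ok_or_else(|| malformed(format!("expected name=value, got '{}'", part)))?;
        match name.trim() {
            "alg" => alg = Some(parse_alg(val.trim())?),
            "key" => key_hex = Some(val.trim().to_string()),
            other => return Err(malformed(format!("unknown parameter '{}'", other))),
        }
    }
    // A header with no semicolon is a single valid part, so "alg=aegis-256"
    // gets this far and fails here, on the missing key.
    let key_hex = key_hex.ok_or_else(|| malformed("missing 'key=' parameter".to_string()))?;
    if key_hex.len() != 64 || !key_hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(malformed("key must be 64 hex characters".to_string()));
    }
    Ok(ParsedHeader { alg, key_hex })
}
```

Under these assumptions, `parse_header("alg=aegis-256")` returns `MalformedHeader("missing 'key=' parameter")`, which is why a name like `parse_header_alg_only_missing_key` describes the test more accurately.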

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "chore: deduplicate resolve_encryption/re..."

infiniteregrets and others added 12 commits March 23, 2026 21:43
…r layer

Drop seq_num from AAD per team decision. AAD is now stream_id (BLAKE3
hash of basin+stream), matching StreamId::new in lite. This allows
encryption to happen in the HTTP handler before the backend, removing
all encryption plumbing from the streamer pipeline.

- Replace effective_aad_v1(base, seq_num) with stream_id_aad(basin, stream)
- Replace encrypt_sequenced_records with encrypt_append_input (pre-sequencing)
- Remove AppendEncryption struct from streamer/backend
- Encrypt in handler before backend.append, decrypt after backend.read

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

These tests manually encrypted before backend.append() and decrypted
after backend.read(), but encryption now happens in the handler layer.
The roundtrip logic is already covered by unit tests in common/src/encryption.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Compute stream_id AAD lazily (only when encryption header present)
- Remove unused secrecy dev-dependency from lite

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

infiniteregrets and others added 11 commits March 26, 2026 14:05
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Previously, --encryption-algorithm was silently ignored when no key was
provided, which could lead users to believe encryption was enabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The algorithm is auto-detected from the alg_id byte in the ciphertext
envelope during decryption, so specifying it on reads is unnecessary
and confusing. Split EncryptionArgs into EncryptionArgs (append, with
algorithm) and DecryptionArgs (read/tail, key only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…roup

Replace runtime check with clap's `requires = "encryption_key_source"`
group constraint, so clap rejects the invalid combination at parse time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Algorithm is only needed for encryption (appends). For decryption
(reads/tail), the alg_id byte in the ciphertext envelope is
authoritative. Making alg Optional removes the need to hardcode a
dummy algorithm on the decrypt path.

- EncryptionDirective::Key.alg: Option<EncryptionAlgorithm>
- SDK EncryptionConfig::Key.alg: Option<EncryptionAlgorithm>
- Header format supports key-only: "key=<hex>" (no alg)
- Lite handler requires alg on append, ignores on read
- CLI resolve_decryption passes alg: None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
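
The rule in the commit above (algorithm required on append, ignored on read) can be sketched as a pair of small decision functions. Only `EncryptionDirective::Key.alg: Option<EncryptionAlgorithm>` comes from the commit message; the `Attest` variant shape, the `key` field, `AppendPlan`, `ReadPlan`, and the pass-through handling of attest and absent headers are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncryptionAlgorithm {
    Aegis256,
}

/// Hypothetical shape of the parsed header directive.
enum EncryptionDirective {
    Key { alg: Option<EncryptionAlgorithm>, key: [u8; 32] },
    Attest,
}

#[derive(Debug, PartialEq, Eq)]
enum AppendPlan {
    /// No header, or the client attests it already encrypted the body.
    StoreAsIs,
    Encrypt(EncryptionAlgorithm),
}

#[derive(Debug, PartialEq, Eq)]
enum ReadPlan {
    ReturnAsIs,
    /// The alg_id byte inside each stored record decides the algorithm.
    DecryptUsingEnvelopeAlg,
}

/// Appends must name an algorithm, because there is no ciphertext yet to infer it from.
fn plan_append(directive: Option<&EncryptionDirective>) -> Result<AppendPlan, String> {
    match directive {
        None | Some(EncryptionDirective::Attest) => Ok(AppendPlan::StoreAsIs),
        Some(EncryptionDirective::Key { alg: Some(alg), .. }) => Ok(AppendPlan::Encrypt(*alg)),
        Some(EncryptionDirective::Key { alg: None, .. }) => {
            Err("append requires an encryption algorithm".to_string())
        }
    }
}

/// Reads ignore any `alg` in the header; only the key matters.
fn plan_read(directive: Option<&EncryptionDirective>) -> ReadPlan {
    match directive {
        Some(EncryptionDirective::Key { .. }) => ReadPlan::DecryptUsingEnvelopeAlg,
        None | Some(EncryptionDirective::Attest) => ReadPlan::ReturnAsIs,
    }
}
```

Making `alg` an `Option` lets the read path pass `alg: None` instead of a placeholder value, while the append path turns a missing algorithm into an explicit error.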
… helpers

- Extract shared resolve_encryption_config to eliminate duplication
- Rename make_key_fn/make_wrong_key_fn to test_key/wrong_test_key

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@infiniteregrets infiniteregrets marked this pull request as ready for review March 27, 2026 20:43
@infiniteregrets infiniteregrets changed the title [WIP] feat: request-time data encryption feat: request-time data encryption Mar 27, 2026
2 participants