fix: skip oversized JSONL lines instead of failing the session by wesm · Pull Request #20 · wesm/agentsview

wesm · 2026-02-24T04:18:44Z

Summary

Replace bufio.Scanner with a custom lineReader that uses bufio.Reader.ReadLine() to skip lines exceeding 64MB instead of aborting the entire session parse
Propagate I/O errors through lineReader.Err() (matching bufio.Scanner pattern) rather than silently swallowing them
Replace all three scan sites (Claude parser, Claude hints extractor, Codex parser)

Closes #12

Test plan

TestLineReader — normal lines, oversized skip, all oversized, empty input, blank lines, no trailing newline, exact limit, one over limit
TestLineReaderIOError — custom io.Reader that yields data then returns a non-EOF error; verifies lines are returned and Err() exposes the I/O error
TestParseCodexSessionOversizedLineSkipped — integration test placing an oversized line between normal JSONL entries
All Go tests pass (CGO_ENABLED=1 go test -tags fts5 ./...)

🤖 Generated with Claude Code

Replace bufio.Scanner with a lineReader that silently skips lines exceeding 64MB and continues parsing. Previously, a single oversized tool_result line (e.g. 22MB command output) caused the entire session to be dropped with "bufio.Scanner: token too long". The lineReader starts with a 64KB buffer and grows on demand. Oversized lines are skipped — the session appears with a gap but the remaining messages are preserved. Closes #12 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T04:22:33Z

roborev: Combined Review (`608c5f4`)

Summary: This commit introduces a custom lineReader to handle oversized JSONL lines, but introduces significant regressions related to silent I/O error swallowing and untracked data loss
.

High

1. Silent swallowing of non-EOF read errors

Locations: internal/parser/linereader.go:30-33, internal/parser/claude.go:47-48, internal/parser/codex.go:5 09-510
Description: The next() function in lineReader silently ignores all errors returned by reading operations, mapping any failure to a clean end-of-file condition (return "", false). This means underlying I/O errors (such as disk failures or dropped network shares) are
completely hidden, causing partial or truncated data to be ingested as if the file was parsed successfully. This is a regression from the previous bufio.Scanner implementation, which allowed checking scanner.Err().
Remediation: Update next() to propagate errors to the caller (e.g.,
func (lr *lineReader) next() (string, bool, error)), or add an Err() error method to lineReader to capture and surface non-EOF errors. Ensure that the caller parsers check for this error and fail the parsing process instead of silently returning partial results.

Medium

2. Silent discarding of oversized lines

Locations: internal/parser/linereader.go:26, internal/parser/linereader.go:65-71, internal/parser/parser_test.go:1432
Description: Lines exceeding the maxLineSize limit are silently discarded. Dropping data without telemetry or logging makes it difficult to debug missing session data or identify systemic issues with log generation. Furthermore, if the system relies on these parsed logs for tracking agent behavior or security auditing, this creates an evasion vector:
an attacker could intentionally pad sensitive actions to exceed the size limit, effectively hiding the action from the system.
Remediation: Expose visibility into dropped lines by tracking skipped-line count, offset, and size. At a minimum, log a warning or emit a metric when an oversized line is skipped. For security-
critical ingestion, consider adding a configurable strict mode that fails the parse entirely rather than quietly dropping data.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

lineReader.next() was treating all read errors as EOF, silently discarding I/O failures. Add err field and Err() method (matching bufio.Scanner pattern), restore post-loop error checks in both parsers, and add TestLineReaderIOError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T04:33:42Z

roborev: Combined Review (`fdf516d`)

Summary Verdict: The review identified 2 medium severity issues related to error handling and silent data skipping, while the core line-reader logic appears secure against memory exhaustion.

Medium

Oversized lines are silently skipped
- Files: internal/parser/linereader.go :77, internal/parser/linereader.go:80, internal/parser/codex.go:507, internal/parser/claude.go:46
- Issue: Oversized lines are silently skipped and parsing continues successfully.

Risk: If session JSONL files are attacker-controlled (or used for audit/forensics), a malicious actor can place critical records in lines exceeding maxLineSize to evade detection or alter downstream security analysis without triggering failure.
- Remediation: Fail closed by default on oversized lines,
  or return structured warnings/errors (line numbers + skipped count) that callers must handle explicitly. Consider a strict mode for security-sensitive consumers.

I/O errors can be silently dropped in Claude project-hint extraction
- Files: internal/parser/claude.go: 169
- Issue: The ExtractClaudeProjectHints function uses lineReader but does not check lr.Err() after the lr.next() loop. If a read fails, it can return partial/empty hints without signaling an I/O failure.

Remediation: Add a post-loop lr.Err() check and propagate or log it. If the function currently cannot return an error, update its signature or add explicit error reporting.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

The hints extractor used lineReader but didn't check lr.Err() after the loop, silently dropping I/O failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-02-24T04:39:00Z

roborev: Combined Review (`b8180c0`)

The PR introduces a medium-severity issue where oversized lines are silently dropped, potentially masking data loss or malicious activity.

Medium

Silent data loss when oversized lines are skipped
Files:

[internal/parser/linereader.go:67-75](/home/roborev/.roborev/clones/wesm/agentsview/internal/parser/linereader.
go#L67)
internal/parser/claude.go:46
internal/parser/codex. go:506-510

Issue: lineReader intentionally drops oversized JSONL lines (fail-open) with
no signal to callers. Sessions can appear "successful" while missing messages. If an attacker can influence session files, they can hide specific records by making them exceed maxLineSize, causing downstream analysis, audit, and security checks to miss those events without an explicit parse failure.

Remediation: Track the
skipped-line count in lineReader and surface it as an error or warning to callers (e.g., log once per file, or return in parser metadata). Consider adding a strict mode that fails on any skipped line for security/audit use cases, and optionally log sanitized line numbers for skipped records.

Synthesized from 4 reviews (agents: codex, gemini | types: default, security)

fix: log I/O errors in ExtractClaudeProjectHints

b8180c0

The hints extractor used lineReader but didn't check lr.Err() after the loop, silently dropping I/O failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

wesm merged commit 0af2152 into main Feb 24, 2026
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: skip oversized JSONL lines instead of failing the session#20

fix: skip oversized JSONL lines instead of failing the session#20
wesm merged 3 commits intomainfrom
large-buffers

wesm commented Feb 24, 2026 •

edited

Loading

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

roborev-ci bot commented Feb 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wesm commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (608c5f4)

High

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (fdf516d)

Medium

Uh oh!

roborev-ci bot commented Feb 24, 2026

roborev: Combined Review (b8180c0)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

wesm commented Feb 24, 2026 •

edited

Loading

roborev: Combined Review (`608c5f4`)

roborev: Combined Review (`fdf516d`)

roborev: Combined Review (`b8180c0`)