[TASK] Refactor Read Tool #59

@edenreich

Description

Summary

Make Read deterministic and robust for long files and PDFs.

Scope

  1. Enforce absolute file_path. Reject relative paths.
  2. Defaults: read from line 1, up to 2000 lines.
  3. Truncate any line longer than 2000 characters to 2000 characters.
  4. Output format: cat -n style (6-wide right-justified number + tab), numbering starts at 1.
  5. Support offset and limit for paging.
  6. PDFs: extract text page by page. Insert page headers. Basic image handling optional.
  7. If file exists but is empty, return a clear reminder line instead of empty output.
  8. Return consistent error codes for every failure mode (for example NOT_FOUND or PDF_PARSE_ERROR).
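
A minimal sketch of scope items 2 to 5 (defaults, truncation, cat -n formatting, paging), written in Python only for illustration. The function name `format_page` and the constants are hypothetical. It assumes that line numbers always reflect the position in the file, so a page starting at offset 2001 is numbered from 2001, and that long lines are cut silently without a marker.

```python
from typing import Iterable, List

DEFAULT_LIMIT = 2000    # max lines returned per call (from the scope list)
MAX_LINE_CHARS = 2000   # per-line character cap (from the scope list)


def format_page(lines: Iterable[str], offset: int = 1,
                limit: int = DEFAULT_LIMIT) -> List[str]:
    """Return cat -n style lines for the window [offset, offset + limit)."""
    if offset < 1 or limit < 1:
        raise ValueError("offset and limit must be >= 1")
    out = []
    for number, line in enumerate(lines, start=1):
        if number < offset:
            continue                      # skip lines before the page
        if number >= offset + limit:
            break                         # stop once the page is full
        # Assumption: truncate silently, without an ellipsis marker.
        text = line.rstrip("\n")[:MAX_LINE_CHARS]
        out.append("%6d\t%s" % (number, text))
    return out
```

Because the input is any iterable of lines, an open file handle can be passed directly and the loop stops reading as soon as the page is full.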

Input schema

{
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "file_path": { "type": "string", "description": "Absolute path" },
    "limit": { "type": "integer", "minimum": 1 },
    "offset": { "type": "integer", "minimum": 1 }
  },
  "required": ["file_path"]
}
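
For illustration, the schema above could be enforced by hand roughly as follows. This is a sketch, not the project's validator: `validate_args` and the `INVALID_ARGUMENT` code are hypothetical (the issue only names NOT_ABSOLUTE_PATH for this stage), and the absolute-path check uses the host platform's rules via `os.path.isabs`.

```python
import os.path
from typing import Any, Dict, Optional

ALLOWED_KEYS = {"file_path", "limit", "offset"}


def validate_args(args: Dict[str, Any]) -> Optional[str]:
    """Return an error code string, or None when the arguments are valid."""
    if set(args) - ALLOWED_KEYS:           # additionalProperties: false
        return "INVALID_ARGUMENT"          # hypothetical code, not in the issue
    path = args.get("file_path")
    if not isinstance(path, str):          # required, type string
        return "INVALID_ARGUMENT"
    for key in ("limit", "offset"):
        if key in args:
            value = args[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                return "INVALID_ARGUMENT"
    if not os.path.isabs(path):
        return "NOT_ABSOLUTE_PATH"         # code named in the issue
    return None
```

Checking the path last means malformed input is reported as a schema problem before the path rule is applied; the opposite order would be equally defensible.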

Acceptance criteria

  • Relative paths return NOT_ABSOLUTE_PATH.
  • Nonexistent files return NOT_FOUND.
  • Exact cat -n prefix: "%6d\t%s".
  • Defaults applied: offset=1, limit=2000, per-line cap 2000 chars.
  • Empty file returns a single reminder line and a FILE_EMPTY warning.
  • PDFs render text per page with page headers; parse failures return PDF_PARSE_ERROR.
  • Binary or undecodable text returns UNREADABLE_BINARY.
  • Unit tests cover large files, long lines, paging edges, and PDFs, with golden tests for line numbering at 1, 9, 10, 99, 100, and 1000.
  • Whitelisted Bash tool commands that read files are removed; Read is used in their place.
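
One possible way to map the non-PDF criteria above onto codes, sketched in Python. The helper `load_text` and the reminder wording are hypothetical. The sketch assumes that a NUL byte or a UTF-8 decode failure marks a file as binary, that FILE_EMPTY is a warning returned alongside the reminder line, and that a directory counts as NOT_FOUND; the issue does not specify any of these details.

```python
import os
from typing import Optional, Tuple

EMPTY_REMINDER = "<reminder: file exists but is empty>"  # hypothetical wording


def load_text(file_path: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, code); code is an error or warning code, or None."""
    if not os.path.isabs(file_path):
        return None, "NOT_ABSOLUTE_PATH"
    if not os.path.isfile(file_path):          # assumption: directories too
        return None, "NOT_FOUND"
    with open(file_path, "rb") as handle:
        raw = handle.read()
    if not raw:
        return EMPTY_REMINDER, "FILE_EMPTY"    # warning, output still returned
    if b"\x00" in raw:                         # assumption: NUL means binary
        return None, "UNREADABLE_BINARY"
    try:
        return raw.decode("utf-8"), None       # assumption: UTF-8 only
    except UnicodeDecodeError:
        return None, "UNREADABLE_BINARY"
```

Reading the whole file into memory keeps the sketch short; a real implementation for long files would more likely sniff a leading chunk and then stream.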
