Feature Request: Verbose output mode to show file sizes and token counts #65

@danshapiro

Problem

When processing large directories, it's hard to know which files and subdirectories are contributing the most to the output size. This makes it difficult to optimize what to include/exclude for LLM context windows.

Proposed Solution

Add a --verbose or -v flag that shows a summary after processing, including:

  • Number of files processed vs ignored
  • Total size in tokens (estimated)
  • Top N largest files by token count
  • Token counts by subdirectory
  • Total lines processed

Example

files-to-prompt . --verbose

Would output the normal content, followed by:

Top 20 file size token count breakdown:
17,563  engine/analysis.py
15,036  ui/run_benchmark_tab.py
11,975  tests/test_benchmark.py
10,719  engine/call_llm.py
10,014  docs/future_work/litellm.txt
9,254   tests/test_cli_batch.py
9,109   engine/io_xlsx_export.py
9,081   ui/app.py
8,513   engine/rate_limiter.py
6,063   tests/test_analysis.py
5,921   cli/generate_report.py
5,855   tests/test_rate_limiter.py
5,836   docs/ui_refactor_plan.md
5,296   tests/test_integration.py
5,185   tests_nodeids.txt
4,459   tests/test_llm_integration.py
4,011   cli/cli.py
3,851   aibo_mcp_server.py
3,754   tests/fixtures.py
3,688   tests/test_ui.py

First-level subdirectories:
tests: 62,835 tokens
engine: 62,082 tokens
ui: 25,627 tokens
docs: 15,850 tokens
cli: 15,701 tokens
root: 11,708 tokens
cosinesim: 8,361 tokens
.streamlit: 155 tokens
.claude: 95 tokens
.cursor: 63 tokens

Total number of files processed: 69
Total tokens in all files: 202,477
Total lines: 21,363
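
For illustration, the breakdown above could be computed roughly like this. This is only a sketch: `summarize` and its input, a dict mapping relative paths to token counts, are hypothetical names and not part of files-to-prompt today. Files at the top level are grouped under "root", as in the example output.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize(token_counts, top_n=20):
    """Return (top files, per-subdirectory totals, grand total).

    token_counts: dict of relative POSIX-style path -> token count
    (hypothetical input shape, assumed for this sketch).
    """
    # Largest files first, truncated to the top N.
    top_files = sorted(
        token_counts.items(), key=lambda item: item[1], reverse=True
    )[:top_n]

    # Sum tokens per first-level subdirectory; top-level files go to "root".
    by_dir = Counter()
    for path, tokens in token_counts.items():
        parts = PurePosixPath(path).parts
        key = parts[0] if len(parts) > 1 else "root"
        by_dir[key] += tokens

    return top_files, by_dir.most_common(), sum(token_counts.values())


def format_summary(token_counts, top_n=20):
    """Render the summary as a list of text lines."""
    top_files, by_dir, total = summarize(token_counts, top_n)
    lines = [f"Top {top_n} file size token count breakdown:"]
    lines += [f"{tokens:,}  {path}" for path, tokens in top_files]
    lines += ["", "First-level subdirectories:"]
    lines += [f"{name}: {tokens:,} tokens" for name, tokens in by_dir]
    lines += ["", f"Total number of files processed: {len(token_counts)}"]
    lines += [f"Total tokens in all files: {total:,}"]
    return lines
```

Keeping the aggregation separate from the formatting would make it easy to change the layout later without touching the counting logic.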

Implementation Notes

  • Token counting would use tiktoken, or fall back on a simple approximation (chars/3)
  • Should the summary go to stderr so it doesn't interfere with piped output?
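
A minimal sketch of both notes, to make the idea concrete. The helper names (`approx_tokens`, `make_token_counter`, `emit_summary`) are hypothetical, and the choice of the `cl100k_base` encoding is an assumption; the right encoding depends on the target model.

```python
import sys


def approx_tokens(text):
    """Fallback estimate: roughly one token per three characters."""
    return len(text) // 3


def make_token_counter():
    """Return a function str -> int that counts tokens.

    Uses tiktoken when it can be imported and loaded, otherwise the
    chars/3 approximation.
    """
    try:
        import tiktoken

        # Assumption: cl100k_base is a reasonable default encoding.
        enc = tiktoken.get_encoding("cl100k_base")
    except Exception:
        # tiktoken missing, or its encoding data could not be loaded.
        return approx_tokens
    # disallowed_special=() makes special-token strings that appear in
    # source files get encoded as ordinary text instead of raising.
    return lambda text: len(enc.encode(text, disallowed_special=()))


def emit_summary(lines, stream=None):
    """Write summary lines to stderr by default, keeping stdout clean."""
    stream = sys.stderr if stream is None else stream
    for line in lines:
        print(line, file=stream)
```

With the summary on stderr, something like `files-to-prompt . --verbose > prompt.txt` would still write only the file contents to `prompt.txt` while the breakdown shows up in the terminal.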

Use Case

  • When the output is too big, quickly identify what to trim
  • Spot surprisingly large files and subdirectories to omit
