
feat: add Claude Code CLI as VLM provider #115

Merged
dippatel1994 merged 11 commits into llmsresearch:main from biecho:feat/claude-code-vlm-provider
Apr 7, 2026

Conversation

@biecho
Contributor

@biecho biecho commented Mar 24, 2026

Summary

  • Adds a claude_code VLM provider that uses the locally installed claude CLI as the backend for planner, stylist, and critic agents
  • No API key needed, uses the user's existing Claude Code subscription
  • Maintains conversation context across pipeline steps via --resume, so the critic knows what the planner intended

Usage

paperbanana generate \
  --vlm-provider claude_code \
  --vlm-model sonnet \
  --image-model gemini-2.5-flash-image \
  -i input.txt -o output.png

Changes since review

Addressed all feedback from @dippatel1994:

  1. Secure temp files — replaced tempfile.mktemp() with mkstemp(); temp images cleaned up in try/finally (even on subprocess or OSError)
  2. Fixed image ordering — preamble built in-order via list, prepended once
  3. Concurrency safety — asyncio.Lock serialises generate() calls to prevent _session_id races
  4. CLI validation — is_available() check in ProviderRegistry.create_vlm() with a clear error message
  5. Tests — 24 tests covering registry, JSON parsing, session chaining, prompt construction, image ordering, temp cleanup, error handling, unsupported-param warning, and concurrency
  6. temperature/max_tokens — warning logged when non-default values are passed (CLI has no flags for these)
  7. Lint — all checks pass
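The temp-file handling from point 1 above can be sketched standalone: mkstemp creates the file securely, and a try/finally guarantees cleanup. The write_temp_image helper and byte payload are illustrative, not the provider's code.

```python
import os
import tempfile
from pathlib import Path

def write_temp_image(data: bytes, index: int) -> Path:
    # mkstemp creates the file securely (no mktemp TOCTOU window)
    fd, name = tempfile.mkstemp(suffix=f"_pb_img_{index}.png")
    os.close(fd)                   # close the fd; the image library reopens by path
    path = Path(name)
    path.write_bytes(data)         # stand-in for img.save(path, format="PNG")
    return path

temp_files: list[Path] = []
try:
    temp_files.append(write_temp_image(b"\x89PNG\r\n", 0))
    # ... the subprocess call would run here and may raise ...
finally:
    for tmp in temp_files:         # runs even if the subprocess step raises
        tmp.unlink(missing_ok=True)

leaked = [t for t in temp_files if t.exists()]
print(leaked)  # []
```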

Test plan

  • Verified claude -p --output-format json returns structured output with session_id
  • Verified --resume <session_id> maintains conversation context
  • End-to-end test: full paperbanana pipeline (planner + stylist + 2 critic iterations) with claude_code VLM + Gemini image gen
  • 24 unit tests pass (pytest tests/test_providers/test_claude_code_vlm.py)

Copilot AI review requested due to automatic review settings March 24, 2026 08:39
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new VLM provider that routes PaperBanana agent calls through the locally installed claude CLI (“Claude Code”), wiring it into the provider registry so it can be selected via --vlm-provider claude_code.

Changes:

  • Introduces ClaudeCodeVLM, a VLMProvider implementation that shells out to claude -p --output-format json and tracks session_id for --resume.
  • Updates ProviderRegistry.create_vlm() to support vlm_provider == "claude_code".

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
paperbanana/providers/vlm/claude_code.py New Claude Code CLI-backed VLM provider with session resumption and image handling via temp files.
paperbanana/providers/registry.py Registers the new claude_code provider option in the VLM factory.

Comment on lines +130 to +133
elif provider == "claude_code":
    from paperbanana.providers.vlm.claude_code import ClaudeCodeVLM

    return ClaudeCodeVLM(model=settings.vlm_model)

Copilot AI Mar 24, 2026


ProviderRegistry.create_vlm() creates ClaudeCodeVLM without validating that the claude executable is installed. If it’s missing, the first call will fail with a low-level FileNotFoundError. Consider checking shutil.which("claude") (or provider.is_available()) here and raising a helpful ValueError with installation instructions.
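The suggested guard can be sketched as follows; the error-message wording here is an assumption, not the text that landed in ProviderRegistry.create_vlm().

```python
import shutil
from unittest import mock

def ensure_claude_available() -> None:
    # Fail fast with a clear error instead of a later FileNotFoundError
    if shutil.which("claude") is None:
        raise ValueError(
            "claude CLI not found on PATH; install Claude Code to use "
            "--vlm-provider claude_code"
        )

# Simulate a machine without the CLI installed
with mock.patch("shutil.which", return_value=None):
    try:
        ensure_claude_available()
        raised = False
    except ValueError:
        raised = True

print(raised)  # True
```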

Copilot uses AI. Check for mistakes.
Comment on lines +130 to +133
elif provider == "claude_code":
    from paperbanana.providers.vlm.claude_code import ClaudeCodeVLM

    return ClaudeCodeVLM(model=settings.vlm_model)

Copilot AI Mar 24, 2026


There are existing registry/provider creation tests (e.g., tests/test_providers/test_registry.py) but no coverage for the new claude_code branch. Add tests that (a) assert ProviderRegistry.create_vlm() returns ClaudeCodeVLM when shutil.which('claude') is patched to a path, and (b) asserts a helpful error when it’s missing (if you add the validation).

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +83 to +84
tmp = Path(tempfile.mktemp(suffix=f"_pb_img_{i}.png"))
img.save(tmp, format="PNG")

Copilot AI Mar 24, 2026


tempfile.mktemp() is insecure and can be raced (TOCTOU/symlink) before img.save() writes the file. Use tempfile.NamedTemporaryFile(delete=False, suffix=...) or tempfile.mkstemp() to create the file securely, and write via the returned handle/path.

Suggested change
-tmp = Path(tempfile.mktemp(suffix=f"_pb_img_{i}.png"))
-img.save(tmp, format="PNG")
+# Use a securely created temporary file instead of mktemp
+with tempfile.NamedTemporaryFile(
+    delete=False, suffix=f"_pb_img_{i}.png"
+) as tmp_file:
+    img.save(tmp_file, format="PNG")
+tmp = Path(tmp_file.name)

Comment on lines +82 to +90
for i, img in enumerate(images):
    tmp = Path(tempfile.mktemp(suffix=f"_pb_img_{i}.png"))
    img.save(tmp, format="PNG")
    temp_files.append(tmp)
    full_prompt = (
        f"[Image {i + 1}: see file {tmp}]\n"
        f"Please read the image at {tmp} before responding.\n\n"
        + full_prompt
    )

Copilot AI Mar 24, 2026


The image preamble is prepended to full_prompt inside the loop, which reverses the image ordering (Image 2 header ends up before Image 1, etc.). Build the image preamble separately (or append in-order) so the numbering and prompt order stay consistent.

Suggested change
-for i, img in enumerate(images):
-    tmp = Path(tempfile.mktemp(suffix=f"_pb_img_{i}.png"))
-    img.save(tmp, format="PNG")
-    temp_files.append(tmp)
-    full_prompt = (
-        f"[Image {i + 1}: see file {tmp}]\n"
-        f"Please read the image at {tmp} before responding.\n\n"
-        + full_prompt
-    )
+image_preamble_parts: list[str] = []
+for i, img in enumerate(images):
+    tmp = Path(tempfile.mktemp(suffix=f"_pb_img_{i}.png"))
+    img.save(tmp, format="PNG")
+    temp_files.append(tmp)
+    image_preamble_parts.append(
+        f"[Image {i + 1}: see file {tmp}]\n"
+        f"Please read the image at {tmp} before responding.\n\n"
+    )
+image_preamble = "".join(image_preamble_parts)
+full_prompt = image_preamble + full_prompt

Comment on lines +31 to +66
def __init__(self, model: str = "sonnet"):
    self._model = model
    self._session_id: Optional[str] = None

@property
def name(self) -> str:
    return "claude_code"

@property
def model_name(self) -> str:
    return f"claude-code ({self._model})"

def is_available(self) -> bool:
    import shutil

    return shutil.which("claude") is not None

async def generate(
    self,
    prompt: str,
    images: Optional[list[Image.Image]] = None,
    system_prompt: Optional[str] = None,
    temperature: float = 1.0,
    max_tokens: int = 4096,
    response_format: Optional[str] = None,
) -> str:
    cmd = [
        "claude",
        "-p",
        "--output-format", "json",
        "--model", self._model,
    ]

    if self._session_id:
        cmd += ["--resume", self._session_id]


Copilot AI Mar 24, 2026


ClaudeCodeVLM maintains a mutable _session_id across calls, but the pipeline can invoke vlm.generate() concurrently (e.g., InputOptimizerAgent uses asyncio.gather). Concurrent calls can race on _session_id updates and mix conversation state. Consider serializing calls with an asyncio.Lock (or using separate sessions per task / disabling session resume for parallel calls).
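The locking option the comment asks for can be modeled in a few lines. This is a toy stand-in, not the merged provider: asyncio.sleep(0) stands in for the subprocess await, and SessionedStub is an invented name.

```python
import asyncio

class SessionedStub:
    """Toy model of serialising generate() with an asyncio.Lock."""

    def __init__(self) -> None:
        self._session_id: str | None = None
        self._lock = asyncio.Lock()
        self.observed: list[str | None] = []

    async def generate(self, prompt: str) -> str:
        async with self._lock:            # one call at a time: no _session_id races
            self.observed.append(self._session_id)
            await asyncio.sleep(0)        # stand-in for awaiting the claude subprocess
            self._session_id = f"session-after-{prompt}"
            return prompt

async def main() -> SessionedStub:
    stub = SessionedStub()
    # Concurrent callers (as in asyncio.gather pipelines) are serialised by the lock
    await asyncio.gather(stub.generate("a"), stub.generate("b"))
    return stub

stub = asyncio.run(main())
print(stub.observed)  # each call saw a session value read consistently under the lock
```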

Comment on lines +102 to +112
proc = await asyncio.create_subprocess_exec(
    *cmd,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE,
)
stdout, stderr = await proc.communicate()

# Clean up temp image files
for tmp in temp_files:
    tmp.unlink(missing_ok=True)


Copilot AI Mar 24, 2026


Temporary image files are only cleaned up after proc.communicate(). If create_subprocess_exec fails (e.g., claude not found) or an exception is raised before cleanup, the temp files will leak. Wrap the subprocess section in try/finally so cleanup always runs.

Suggested change
-proc = await asyncio.create_subprocess_exec(
-    *cmd,
-    stdout=asyncio.subprocess.PIPE,
-    stderr=asyncio.subprocess.PIPE,
-)
-stdout, stderr = await proc.communicate()
-# Clean up temp image files
-for tmp in temp_files:
-    tmp.unlink(missing_ok=True)
+proc = None
+stdout = None
+stderr = None
+try:
+    proc = await asyncio.create_subprocess_exec(
+        *cmd,
+        stdout=asyncio.subprocess.PIPE,
+        stderr=asyncio.subprocess.PIPE,
+    )
+    stdout, stderr = await proc.communicate()
+finally:
+    # Clean up temp image files
+    for tmp in temp_files:
+        tmp.unlink(missing_ok=True)

    temperature: float = 1.0,
    max_tokens: int = 4096,
    response_format: Optional[str] = None,
) -> str:

Copilot AI Mar 24, 2026


temperature and max_tokens parameters are accepted by the interface but currently unused in this provider, which can surprise callers (agents set these expecting effect). If the claude CLI supports these controls, pass them through; otherwise consider documenting the limitation and/or raising when non-default values are provided.

Suggested change
-) -> str:
+) -> str:
+    # The local `claude` CLI backend does not currently support explicit
+    # temperature or max_tokens controls. To avoid surprising callers,
+    # reject non-default values instead of silently ignoring them.
+    if temperature != 1.0 or max_tokens != 4096:
+        logger.warning(
+            "ClaudeCodeVLM does not support temperature/max_tokens controls; "
+            "received non-default values that will be rejected",
+            temperature=temperature,
+            max_tokens=max_tokens,
+        )
+        raise ValueError(
+            "ClaudeCodeVLM (claude CLI backend) does not currently support "
+            "non-default 'temperature' or 'max_tokens' values."
+        )

Comment on lines +24 to +29
class ClaudeCodeVLM(VLMProvider):
    """VLM provider that shells out to the `claude` CLI.

    Maintains a single conversation session across planner, stylist,
    and critic calls so each step has full context of prior steps.
    """

Copilot AI Mar 24, 2026


The class/docstring (and PR description) says session continuity is maintained across planner/stylist/critic, but the pipeline shares the same VLM provider instance across all agents (optimizer/retriever/visualizer too). If you only want continuity for specific steps, add a way to reset/disable --resume for other agents or clarify the behavior in docs/description.
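One way to scope continuity to specific agents, as the comment hints, is a small reset hook. This is a hypothetical sketch: SessionHolder, reset_session, and resume_args are invented names, not anything in the PR.

```python
from typing import Optional

class SessionHolder:
    """Hypothetical holder for the --resume state discussed above."""

    def __init__(self) -> None:
        self._session_id: Optional[str] = None

    def remember(self, session_id: str) -> None:
        self._session_id = session_id

    def reset_session(self) -> None:
        # Agents that must start a fresh conversation call this first
        self._session_id = None

    def resume_args(self) -> list[str]:
        return ["--resume", self._session_id] if self._session_id else []

h = SessionHolder()
h.remember("abc-123")
print(h.resume_args())  # ['--resume', 'abc-123']
h.reset_session()
print(h.resume_args())  # []
```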

@dippatel1994
Member

Thanks for this @biecho, a local claude CLI backend with JSON parsing and --resume is a nice option for users on Claude Code.

Before merge I’d like to see:

  1. Replace tempfile.mktemp() with a safe temp file API (NamedTemporaryFile(delete=False) / mkstemp) + finally cleanup so temp images are always removed on errors.
  2. Fix multi-image prompt construction — the current loop prepends each image block to full_prompt, which reverses the logical order vs image index; build a preamble then append the task prompt once.
  3. Concurrency / session — _session_id is shared mutable state; if the pipeline can call generate() concurrently on the same provider, we need a lock or explicit behavior. Please confirm sequential use or serialize.
  4. Optional: validate claude in PATH in create_vlm (or __init__) for a clearer error than FileNotFoundError.
  5. Tests — at least registry + a mocked subprocess happy path for JSON parsing / session_id.

Nit: temperature / max_tokens are ignored; document the limitation or wire them through if the CLI supports them. Also, please check the failed lint task in the workflow run.

Happy to re-review after the temp-file + prompt-order + concurrency story is addressed.

@biecho
Contributor Author

biecho commented Mar 25, 2026

Thanks for the review @dippatel1994! All points addressed in the latest push. Let me know if anything else needs attention.

Member

@dippatel1994 dippatel1994 left a comment


Good structure — secure temp files, concurrency lock, clean test suite. Three must-fixes:

  1. Missing @retry decorator — Every other VLM provider uses tenacity with exponential backoff. This one has none. A transient CLI failure will abort the pipeline immediately.

  2. No cost_tracker integration — The JSON response includes total_cost_usd and usage fields. Every other provider records cost via self.cost_tracker.record_vlm_call(...). This one ignores it entirely.

  3. Use --system-prompt CLI flag — Currently system instructions are embedded in the user prompt text. The claude CLI has a --system-prompt flag — use it so the model actually treats it as a system prompt.

Non-blocking: The capsys-based warning test may be fragile since structlog doesn't necessarily write to capsys-captured stdout. Consider mocking the logger instead.

biecho added 6 commits April 3, 2026 11:27
- Replace insecure tempfile.mktemp() with mkstemp(); close fd
  immediately before writing to avoid resource leak
- Build image preamble in-order via list then prepend once, fixing
  reversed Image 2 / Image 1 ordering
- Wrap subprocess call in try/finally so temp image files are always
  cleaned up, even when create_subprocess_exec raises
- Add asyncio.Lock so concurrent callers don't race on _session_id
- Log a warning when non-default temperature or max_tokens are passed,
  since the CLI has no flags for these
- Move shutil import to module level
Check is_available() in ProviderRegistry.create_vlm() and raise a
clear ValueError instead of letting the first subprocess call fail
with a cryptic FileNotFoundError.
24 tests covering: registry creation and missing-CLI error, basic text
generation, session resumption and chaining, non-JSON fallback, prompt
construction (system prompt + JSON mode + images), image ordering,
temp file cleanup on success / failure / OSError, model passthrough,
empty images list, session preservation, missing result-key fallback,
error stderr/stdout precedence, error truncation, unsupported-param
warning, and concurrent-call serialisation.
- Add @retry with exponential backoff (3 attempts, 2–30s) matching all
  other VLM providers so transient CLI failures don't abort the pipeline
- Pass system_prompt via --system-prompt CLI flag instead of embedding
  it as text in the user prompt
- Add usage= to debug log for consistency with other providers
- Replace fragile capsys-based warning test with a logger mock
@biecho biecho force-pushed the feat/claude-code-vlm-provider branch from c29682a to a09ad94 on April 3, 2026 03:41
@biecho
Contributor Author

biecho commented Apr 3, 2026

Thanks for the review! Addressed in a09ad94:

  1. @retry — Added @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=2, max=30), reraise=True) on generate(), matching the pattern used by all other providers.

  2. cost_tracker — I looked into this but there's no cost_tracker class or record_vlm_call method anywhere in the codebase. All existing providers log usage via logger.debug(... usage=...) — none of them call self.cost_tracker.record_vlm_call(...). I've added usage=data.get("usage") to the debug log for consistency. Happy to integrate with a cost tracking system if one gets added, but there's nothing to wire into today.

  3. --system-prompt — switched from embedding in prompt text to the --system-prompt CLI flag.

Non-blocking: Replaced the capsys warning test with a logger mock.
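The PR uses tenacity's @retry for point 1; the same stop-after-3 / exponential-wait / reraise behavior can be sketched with only the standard library. retry_async and its _sleep hook are illustrative stand-ins, not the merged code.

```python
import asyncio

def retry_async(attempts: int = 3, min_wait: float = 2.0, max_wait: float = 30.0,
                _sleep=asyncio.sleep):
    """Stdlib stand-in for @retry(stop=stop_after_attempt(3),
    wait=wait_exponential(min=2, max=30), reraise=True)."""
    def decorator(fn):
        async def wrapper(*args, **kwargs):
            delay = min_wait
            for attempt in range(1, attempts + 1):
                try:
                    return await fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise              # reraise=True: surface the final error
                    await _sleep(delay)
                    delay = min(delay * 2, max_wait)  # exponential backoff, capped
        return wrapper
    return decorator

calls = {"n": 0}

@retry_async(_sleep=lambda d: asyncio.sleep(0))  # skip real waits for this demo
async def flaky_generate() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient CLI failure")
    return "ok"

print(asyncio.run(flaky_generate()))  # survives two transient failures
```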

Member

@dippatel1994 dippatel1994 left a comment


Retry and --system-prompt are fixed, nice work. Two remaining things:

  1. Lint fails - import sorting error at tests/test_providers/test_claude_code_vlm.py:3. Run ruff check --fix tests/test_providers/test_claude_code_vlm.py and push.

  2. cost_tracker - still missing. If you plan to add it after PR #111 merges (like #120 is doing), that's fine, just mention it so we can track it.

Fix import sorting (ruff I001) and apply ruff format to both provider
and test files so CI passes all three checks.
@biecho
Contributor Author

biecho commented Apr 3, 2026

Lint — Fixed import sorting and also ran ruff format on both files (there were formatting issues too that would've failed CI). Verified all three CI checks pass locally: ruff check, ruff format --check, and pytest (24/24).

cost_tracker — Confirmed there's no cost_tracker / record_vlm_call in the codebase today — all existing providers just log usage= via logger.debug, which this provider already does (including cost_usd). Happy to wire it in once the cost tracking system lands (tracking with #120).

Member

@dippatel1994 dippatel1994 left a comment


Lint fixed, CI fully green. Retry and --system-prompt both addressed. Cost tracker can be added in a follow-up after #111 merges. LGTM.

@dippatel1994 dippatel1994 merged commit ae01fc9 into llmsresearch:main Apr 7, 2026
11 checks passed


3 participants