
feat: add document input support for text modality#249

Merged
Kamilbenkirane merged 1 commit into main from feat/docs-input on Mar 28, 2026

@Kamilbenkirane
Member

Summary

  • Add DocumentArtifact + DocumentMimeType for PDF, DOCX, XLSX, PPTX, CSV, TXT, HTML, MD
  • Add Domain.DOCUMENTS with celeste.documents.analyze() namespace (async/sync/streaming)
  • Add document= parameter to TextClient.analyze() and all namespace variants
  • Wire provider-specific document content blocks for OpenAI, Anthropic, Google, and Mistral
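
As a rough illustration of the format list above, the supported types can be modelled as an enum keyed by file extension. This is a minimal sketch, not the code from this PR: `DocMime` and `guess_document_mime_type` are hypothetical names, and only the MIME strings themselves are the standard registered values.

```python
from enum import Enum
from pathlib import Path


class DocMime(str, Enum):
    """Hypothetical stand-in for DocumentMimeType (not the real class)."""

    PDF = "application/pdf"
    DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
    PPTX = "application/vnd.openxmlformats-officedocument.presentationml.presentation"
    CSV = "text/csv"
    TXT = "text/plain"
    HTML = "text/html"
    MD = "text/markdown"


# Map ".pdf" -> DocMime.PDF, ".md" -> DocMime.MD, and so on.
_BY_SUFFIX = {f".{member.name.lower()}": member for member in DocMime}


def guess_document_mime_type(path: str) -> DocMime:
    """Pick a MIME type from the file extension, case-insensitively."""
    suffix = Path(path).suffix.lower()
    try:
        return _BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"Unsupported document type: {suffix or path!r}") from None
```

Whether the real DocumentArtifact infers the type from the path or expects it to be passed explicitly is not stated in this description.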

Usage

import celeste
from celeste.artifacts import DocumentArtifact

pdf = DocumentArtifact(path="report.pdf")

# Via documents namespace
output = await celeste.documents.analyze(
    pdf,
    prompt="Summarize this document",
    model="gpt-4o",
)

# Via text client
client = celeste.create_client(modality="text", model="claude-sonnet-4-5")
output = await client.analyze("Summarize", document=pdf)

# Streaming
async for chunk in celeste.documents.stream.analyze(
    pdf, prompt="Extract key findings", model="gemini-2.5-flash"
):
    print(chunk.content, end="")

Provider support

| Provider | Format | Content block |
| --- | --- | --- |
| OpenAI (Responses API) | base64, URL | input_file |
| Anthropic | base64, URL | document |
| Google Gemini | base64, URL (via build_media_part) | inline_data / file_data |
| Mistral (Chat Completions) | base64, URL | document_url |
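
For readers unfamiliar with the block names in the table, the sketch below shows roughly what each provider's document content block looks like as a plain dict. It is an illustration under assumptions, not the wiring from this PR: `document_block` is a hypothetical helper, the field names follow my reading of each provider's public API documentation, and the data-URI form used for Mistral base64 input is an assumption.

```python
import base64
from typing import Optional


def document_block(
    provider: str,
    *,
    mime_type: str,
    data: Optional[bytes] = None,
    url: Optional[str] = None,
    filename: str = "document",
) -> dict:
    """Build one provider-specific content block (hypothetical helper)."""
    if (data is None) == (url is None):
        raise ValueError("Pass exactly one of data= or url=")
    b64 = base64.b64encode(data).decode("ascii") if data is not None else None
    data_uri = f"data:{mime_type};base64,{b64}" if b64 is not None else None

    if provider == "openai":  # Responses API: input_file
        if url is not None:
            return {"type": "input_file", "file_url": url}
        return {"type": "input_file", "filename": filename, "file_data": data_uri}
    if provider == "anthropic":  # Messages API: document
        if url is not None:
            source = {"type": "url", "url": url}
        else:
            source = {"type": "base64", "media_type": mime_type, "data": b64}
        return {"type": "document", "source": source}
    if provider == "google":  # Gemini: inline_data or file_data part
        if url is not None:
            return {"file_data": {"mime_type": mime_type, "file_uri": url}}
        return {"inline_data": {"mime_type": mime_type, "data": b64}}
    if provider == "mistral":  # Chat Completions: document_url
        # Assumption: base64 input is passed as a data URI in the same field.
        return {"type": "document_url", "document_url": url or data_uri}
    raise ValueError(f"Unknown provider: {provider!r}")
```

Keeping the per-provider shapes in one place like this is only for illustration; the PR description says each provider formats its own blocks.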

Architecture

Follows the exact same pattern as image/video/audio input:

  • DocumentArtifact(Artifact) with DocumentMimeType, in src/celeste/artifacts.py
  • DocumentConstraint / DocumentsConstraint, in src/celeste/constraints.py
  • InputType.DOCUMENT + Domain.DOCUMENTS, in src/celeste/core.py
  • TextParameter.DOCUMENT on 36 models across 4 providers
  • DocumentsNamespace with analyze() (async/sync/streaming), in src/celeste/namespaces/domains.py
  • _check_media_support() validates document capability at runtime
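
The runtime capability check in the last bullet can be pictured as follows. This is a minimal sketch of the general idea only: `ModelSpec`, `UnsupportedMediaError`, and this `check_media_support` signature are hypothetical and are not taken from the celeste source.

```python
from dataclasses import dataclass, field


class UnsupportedMediaError(ValueError):
    """Raised when a model is given an input kind it does not declare."""


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical model record listing the media kinds it accepts."""

    name: str
    supported_inputs: frozenset[str] = field(default_factory=frozenset)


def check_media_support(model: ModelSpec, **media: object) -> None:
    """Reject any media argument that was provided but is not supported."""
    for kind, value in media.items():
        if value is not None and kind not in model.supported_inputs:
            raise UnsupportedMediaError(
                f"{model.name} does not accept {kind} input"
            )
```

Failing before the request is built means an unsupported `document=` argument surfaces as a local error instead of a provider-side rejection.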

Test plan

  • 553 unit tests pass (7 new: content block formatting, media support validation, signature checks)
  • mypy: 0 errors across 331 source files
  • ruff: 0 errors
  • Integration tests: 36/36 models pass (Anthropic 8, Google 7, Mistral 7, OpenAI 14)
  • All pre-commit hooks pass

Closes #207

Add DocumentArtifact, DocumentMimeType, and Domain.DOCUMENTS with
celeste.documents.analyze() for PDF, DOCX, XLSX, and other document
formats across OpenAI, Anthropic, Google, and Mistral providers.

Follows the same architecture as image/video/audio input — typed
artifacts, constraint-based model validation, per-provider content
block formatting, and full async/sync/streaming support.
@claude
claude bot commented Mar 28, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@Kamilbenkirane Kamilbenkirane merged commit 9ea6aac into main Mar 28, 2026
11 checks passed
Development

Successfully merging this pull request may close these issues.

Add support for docs input (pdf, xlsx, etc..) for text modality