RAG-based VTK Python code generation with prompt clarification, task decomposition, and sequential code generation.

This project turns a user prompt into runnable VTK Python code via three stages:

- Prompt clarification (`ClarificationSession`)
- Task decomposition (`DecompositionSession`)
- Code generation (`GenerationSession`)
- Python 3.10+
- uv (fast Python package manager)
```shell
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh
```

```shell
./setup.sh
```

This creates a `.venv` virtual environment using uv and installs dependencies interactively.
Or install manually:
```shell
# Create virtual environment
uv venv .venv
source .venv/bin/activate

# Install package with dev dependencies
uv pip install -e ".[dev]"

# Optional extras
uv pip install -e ".[llm]"  # LLM providers
uv pip install -e ".[mcp]"  # VTK API tooling
uv pip install -e ".[rag]"  # RAG (requires Qdrant)
uv pip install -e ".[vtk]"  # VTK runtime

# All extras
uv pip install -e ".[dev,llm,mcp,rag,vtk]"
```

Copy the environment template and add your LLM API key:

```shell
cp .env.example .env
# Edit .env with your LLM API key
```

Start a Qdrant instance for RAG:

```shell
docker run -d -p 6333:6333 qdrant/qdrant
```

You'll need to index your VTK documentation. The data files are:

- `data/vtk-python-docs.jsonl` (61 MB) - API documentation
- `data/raw/vtk-python-examples.jsonl` (5.4 MB) - code examples
- `data/raw/vtk-python-tests.jsonl` (4.8 MB) - test cases

Note: Indexing tools are in the parent vtk-rag repository. You need to build the Qdrant index before querying.
Activate the environment and explore the CLI:

```shell
source .venv/bin/activate
vtk-st --help

# Evaluate prompt clarity
vtk-st evaluate "Read a VTK file and visualize it"

# Clarify a prompt (interactive by default)
vtk-st query "Read a VTK file and visualize it"

# Decompose into tasks
vtk-st decompose "Read volume.vti and create an isosurface at value 135"

# Full pipeline
vtk-st pipeline "Read volume.vti and create an isosurface at value 135"
```

Repository layout:

```
vtk-sequential-thinking/
├── pyproject.toml
├── README.md
├── setup.sh
├── examples/
├── tests/
└── vtk_sequential_thinking/
```
The pipeline flow:

```
User Prompt
  -> ClarificationSession (optional, interactive)
  -> DecompositionSession (LLM + MCP tooling)
  -> GenerationSession (LLM + MCP + RAG)
  -> Python code output
```
Package layout:

```
vtk_sequential_thinking/
├── __init__.py
├── cli.py
├── config.py
├── llm/
│   ├── __init__.py
│   ├── client.py
│   └── json_protocol.py
├── mcp/
│   ├── __init__.py
│   ├── client.py
│   └── persistent_client.py
├── prompt_clarification/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── clarifier.py
│   └── session.py
├── task_decomposition/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── decomposer.py
│   └── session.py
├── sequential_generation/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── generator.py
│   ├── code_assembler.py
│   └── session.py
└── rag/
    ├── __init__.py
    ├── client.py
    ├── models.py
    └── ranking.py
```
The library exports three “session” entry points:

- `ClarificationSession` (prompt -> synthesized prompt)
- `DecompositionSession` (prompt -> tasks)
- `GenerationSession` (tasks -> code)
They are exported from `vtk_sequential_thinking/__init__.py` as aliases of the internal `Session` classes in each subpackage.
Key data:

- `SessionResponse.status`: one of `clear`, `needs_clarification`, `ready_to_synthesize`, `synthesized`, `restart`, `skipped`
- `SessionResponse.prompt`: the original prompt
- `SessionResponse.questions`: pending questions (if any)
- `SessionResponse.synthesized_prompt`: only set after synthesis
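The `status` field drives a simple clarification loop. Below is a minimal sketch of such a loop; the stub stands in for `ClarificationSession`, and its `answer()` method (for returning answers to pending questions) is a hypothetical name, not confirmed project API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SessionResponse:
    # Mirrors the fields documented above; stub for illustration only.
    status: str
    prompt: str
    questions: list = field(default_factory=list)
    synthesized_prompt: Optional[str] = None

class StubSession:
    """Stand-in for ClarificationSession; answer() is a hypothetical method."""
    def submit_prompt(self, prompt):
        self._prompt = prompt
        return SessionResponse("needs_clarification", prompt,
                               questions=["Which file format?"])
    def answer(self, answers):
        return SessionResponse("ready_to_synthesize", self._prompt)
    def synthesize(self):
        return SessionResponse("synthesized", self._prompt,
                               synthesized_prompt=f"{self._prompt} (format: .vti)")

def clarify(session, prompt, ask):
    """Drive a session until a usable prompt is available."""
    resp = session.submit_prompt(prompt)
    if resp.status == "clear":
        return resp.prompt
    while resp.status == "needs_clarification":
        resp = session.answer([ask(q) for q in resp.questions])
    resp = session.synthesize()
    return resp.synthesized_prompt or resp.prompt

result = clarify(StubSession(), "Read a VTK file and visualize it",
                 ask=lambda q: "VTK XML image data")
print(result)  # -> "Read a VTK file and visualize it (format: .vti)"
```

In a real application, `ask` would prompt the user; here it returns a canned answer so the loop terminates.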
Key files:

- `vtk_sequential_thinking/prompt_clarification/models.py`
- `vtk_sequential_thinking/prompt_clarification/clarifier.py`
- `vtk_sequential_thinking/prompt_clarification/session.py`
Key data:
Task:{id, task_type, description, search_query, depends_on, vtk_classes, from_prompt}DecompositionResult:{tasks, output_type, reasoning}
The decomposition session supports:

- `decompose(prompt)`
- `refine(modifications, additions)`
- `finalize()`
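To illustrate the `Task` shape, here is a hypothetical decomposition of the isosurface prompt as plain dicts, plus a small helper that orders tasks by `depends_on`. The field names come from the model above; the concrete values and the helper are illustrative, not project code:

```python
# Hypothetical tasks in the Task shape described above; task_type names
# and vtk_classes values are assumptions for illustration.
tasks = [
    {"id": "t1", "task_type": "reader", "description": "Read volume.vti",
     "search_query": "read vti image data", "depends_on": [],
     "vtk_classes": ["vtkXMLImageDataReader"], "from_prompt": "Read volume.vti"},
    {"id": "t3", "task_type": "render", "description": "Render the isosurface",
     "search_query": "render polydata", "depends_on": ["t2"],
     "vtk_classes": ["vtkPolyDataMapper", "vtkActor"], "from_prompt": "visualize"},
    {"id": "t2", "task_type": "filter", "description": "Isosurface at value 135",
     "search_query": "contour filter isosurface", "depends_on": ["t1"],
     "vtk_classes": ["vtkContourFilter"], "from_prompt": "isosurface at 135"},
]

def order_by_dependencies(tasks):
    """Topologically sort tasks so each runs after its depends_on entries."""
    by_id = {t["id"]: t for t in tasks}
    ordered, seen = [], set()
    def visit(t):
        if t["id"] in seen:
            return
        seen.add(t["id"])
        for dep in t["depends_on"]:
            visit(by_id[dep])
        ordered.append(t)
    for t in tasks:
        visit(t)
    return ordered

print([t["id"] for t in order_by_dependencies(tasks)])  # -> ['t1', 't2', 't3']
```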
Key files:

- `vtk_sequential_thinking/task_decomposition/models.py`
- `vtk_sequential_thinking/task_decomposition/prompts.py`
- `vtk_sequential_thinking/task_decomposition/decomposer.py`
- `vtk_sequential_thinking/task_decomposition/session.py`
Flow:

```
tasks[]
  -> Generator.generate(task)
     - (optional) retrieve examples via RAG
     - tool loop via MCP (VTK API grounding)
     - JSONProtocol decoding into TaskResult
  -> CodeAssembler.add_snippet(...)
  -> CodeAssembler.assemble() -> final_code
```
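The assembly step can be pictured with a toy stand-in for `CodeAssembler` that deduplicates import lines and concatenates per-task snippet bodies. This is a sketch of the idea only, not the project's actual implementation:

```python
class MiniAssembler:
    """Toy stand-in for CodeAssembler: collects per-task snippets,
    hoists and deduplicates import lines, then joins the bodies."""
    def __init__(self):
        self.imports = []  # ordered, deduplicated import lines
        self.bodies = []   # non-import code, one entry per task

    def add_snippet(self, task_id, code):
        body = []
        for line in code.strip().splitlines():
            if line.startswith(("import ", "from ")):
                if line not in self.imports:
                    self.imports.append(line)
            else:
                body.append(line)
        self.bodies.append(f"# --- {task_id} ---\n" + "\n".join(body))

    def assemble(self):
        return "\n".join(self.imports) + "\n\n" + "\n\n".join(self.bodies)

asm = MiniAssembler()
asm.add_snippet("t1", "import vtk\nreader = vtk.vtkXMLImageDataReader()")
asm.add_snippet("t2", "import vtk\ncontour = vtk.vtkContourFilter()")
print(asm.assemble())  # "import vtk" appears once, followed by both bodies
```

The real assembler additionally has task dependency and variable-wiring concerns to handle; this sketch only shows why assembly is a distinct step from per-task generation.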
Output:

- `PipelineResult.code`: final assembled code
- `PipelineResult.task_results`: per-task outputs
Key files:

- `vtk_sequential_thinking/sequential_generation/session.py`
- `vtk_sequential_thinking/sequential_generation/generator.py`
- `vtk_sequential_thinking/sequential_generation/code_assembler.py`
- `vtk_sequential_thinking/sequential_generation/models.py`
The CLI is implemented in `vtk_sequential_thinking/cli.py` using Typer.

- `vtk-st evaluate`: clarity evaluation only
- `vtk-st query`: interactive clarification (outputs the synthesized prompt)
- `vtk-st decompose`: prompt -> tasks JSON
- `vtk-st generate`: tasks JSON -> code
- `vtk-st pipeline`: clarify -> decompose -> generate
Tests are split into offline-safe unit tests and CLI-level integration tests:

- `tests/unit/`
- `tests/integration/`
Many integration tests monkeypatch external clients so they can run without live services.
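The offline-testing pattern looks roughly like this: swap the LLM client for a stub that returns canned responses, so the session logic can be exercised without network access or API keys. The `complete()` method on the stub and the `decompose_with()` helper are hypothetical names for illustration, not the project's actual interfaces:

```python
import json

class StubLLMClient:
    """Canned-response stand-in for the real LLM client (hypothetical
    complete() interface), so tests need no network or API key."""
    def __init__(self, canned):
        self.canned = canned
        self.calls = []  # record prompts for assertions

    def complete(self, prompt):
        self.calls.append(prompt)
        return self.canned

def decompose_with(client, prompt):
    # Hypothetical session logic: ask the client, parse JSON task list.
    raw = client.complete(f"Decompose into VTK tasks: {prompt}")
    return json.loads(raw)["tasks"]

def test_decompose_offline():
    canned = json.dumps({"tasks": [{"id": "t1", "task_type": "reader"}]})
    client = StubLLMClient(canned)
    tasks = decompose_with(client, "Read volume.vti")
    assert tasks[0]["id"] == "t1"
    assert client.calls  # the stub was actually consulted

test_decompose_offline()
```

With pytest, the same stub would typically be injected via `monkeypatch.setattr` on the module under test.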
- `examples/clarification_example.py`: clarification only
- `examples/decomposition_example.py`: decomposition/refinement only
- `examples/generation_example.py`: generation only
- `examples/pipeline_example.py`: full pipeline demonstration
The LLM client supports multiple providers:

- OpenAI
- Anthropic
- Google
- Local
Configuration is via `.env`:

```shell
# LLM Provider (choose one)
LLM_PROVIDER=anthropic  # anthropic, openai, google, local

# API Keys
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
GOOGLE_API_KEY=...

# Model Selection
ANTHROPIC_MODEL=...
OPENAI_MODEL=...
GOOGLE_MODEL=...

# VTK API docs (used by vtkapi-mcp tooling)
VTK_API_DOCS_PATH=data/vtk-python-docs.jsonl

# Qdrant (RAG)
QDRANT_URL=http://localhost:6333
QDRANT_CODE_COLLECTION=vtk_code
```

Programmatic usage of the three sessions:

```python
from vtk_sequential_thinking import (
    ClarificationSession,
    DecompositionSession,
    GenerationSession,
    LLMClient,
    MCPClient,
    load_config,
)

config = load_config()
llm_client = LLMClient(app_config=config)
mcp_client = MCPClient(app_config=config)

# 1) Clarify
clarify = ClarificationSession.from_config(config, llm_client=llm_client)
resp = clarify.submit_prompt("Read a VTK file and visualize it")
if resp.status != "clear":
    # In a real app, you'd iterate questions and then call synthesize()
    resp = clarify.synthesize()
synthesized_prompt = resp.prompt if resp.status == "clear" else (resp.synthesized_prompt or "")

# 2) Decompose
decomposer = DecompositionSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
decomp = decomposer.decompose(synthesized_prompt)

# 3) Generate
generator = GenerationSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
result = generator.generate(tasks=decomp.tasks, original_prompt=synthesized_prompt)
print(result.code)
```

Run tests, lint, and coverage:

```shell
uv run pytest tests
uv run ruff check vtk_sequential_thinking/ tests/
uv run pytest tests --cov=vtk_sequential_thinking --cov-report=term-missing
```

Dependencies:

- `pydantic` - data validation
- `python-dotenv` - environment configuration
- `typer` / `rich` - CLI
- `anthropic` / `openai` / `google-generativeai` - LLM providers
- `mcp` - MCP client for VTK API validation
- `vtkapi-mcp` - VTK API MCP server
- Indexing tools: use the parent vtk-rag repository.
- RAG requires Qdrant: `vtk_sequential_thinking.rag.client` expects a live Qdrant server.
- VTK API tooling: configure `VTK_API_DOCS_PATH` for MCP-based API grounding/validation.
- vtk-rag - Related RAG/indexing tooling
- vtkapi-mcp - VTK API validation MCP server
MIT