RAG-based VTK Python code generation with prompt clarification, task decomposition, and sequential code generation.

This project turns a user prompt into runnable VTK Python code via three stages:

- Prompt clarification (`ClarificationSession`)
- Task decomposition (`DecompositionSession`)
- Code generation (`GenerationSession`)
- Python 3.10+
- uv (fast Python package manager)
```shell
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh
```

```shell
./setup.sh
```

This creates a `.venv` virtual environment using uv and installs dependencies interactively.
Or install manually:
```shell
# Create virtual environment
uv venv .venv
source .venv/bin/activate

# Install package with dev dependencies
uv pip install -e ".[dev]"

# Optional extras
uv pip install -e ".[llm]"  # LLM providers
uv pip install -e ".[mcp]"  # VTK API tooling
uv pip install -e ".[rag]"  # RAG (requires Qdrant)
uv pip install -e ".[vtk]"  # VTK runtime

# All extras
uv pip install -e ".[dev,llm,mcp,rag,vtk]"
```

Copy the environment template and add your LLM API key:

```shell
cp .env.example .env
# Edit .env with your LLM API key
```

Start a Qdrant instance for RAG:

```shell
docker run -d -p 6333:6333 qdrant/qdrant
```

You'll need to index your VTK documentation. The data files are:

- `data/vtk-python-docs.jsonl` (61 MB) - API documentation
- `data/raw/vtk-python-examples.jsonl` (5.4 MB) - code examples
- `data/raw/vtk-python-tests.jsonl` (4.8 MB) - test cases

Note: Indexing tools are in the parent vtk-rag repository. You need to build the Qdrant index before querying.
Activate the environment and explore the CLI:

```shell
source .venv/bin/activate
vtk-st --help

# Evaluate prompt clarity
vtk-st evaluate "Read a VTK file and visualize it"

# Clarify a prompt (interactive by default)
vtk-st query "Read a VTK file and visualize it"

# Decompose into tasks
vtk-st decompose "Read volume.vti and create an isosurface at value 135"

# Full pipeline
vtk-st pipeline "Read volume.vti and create an isosurface at value 135"
```

Repository layout:

```
vtk-sequential-thinking/
├── pyproject.toml
├── README.md
├── setup.sh
├── examples/
├── tests/
└── vtk_sequential_thinking/
```
The pipeline flow:

```
User Prompt
  -> ClarificationSession (optional, interactive)
  -> DecompositionSession (LLM + MCP tooling)
  -> GenerationSession (LLM + MCP + RAG)
  -> Python code output
```
Package layout:

```
vtk_sequential_thinking/
├── __init__.py
├── cli.py
├── config.py
├── llm/
│   ├── __init__.py
│   ├── client.py
│   └── json_protocol.py
├── mcp/
│   ├── __init__.py
│   ├── client.py
│   └── persistent_client.py
├── prompt_clarification/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── clarifier.py
│   └── session.py
├── task_decomposition/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── decomposer.py
│   └── session.py
├── sequential_generation/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── generator.py
│   ├── code_assembler.py
│   └── session.py
└── rag/
    ├── __init__.py
    ├── client.py
    ├── models.py
    └── ranking.py
```
The library exports three “session” entry points:

- `ClarificationSession` (prompt -> synthesized prompt)
- `DecompositionSession` (prompt -> tasks)
- `GenerationSession` (tasks -> code)
They are exported from `vtk_sequential_thinking/__init__.py` as aliases of the internal `Session` classes in each subpackage.
Key data:

- `SessionResponse.status`: one of `clear`, `needs_clarification`, `ready_to_synthesize`, `synthesized`, `restart`, `skipped`
- `SessionResponse.prompt`: the original prompt
- `SessionResponse.questions`: pending questions (if any)
- `SessionResponse.synthesized_prompt`: only set after synthesis
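The `status` field drives a simple clarification loop. Below is a minimal sketch of such a loop; the stub stands in for `ClarificationSession`, and its `answer()` method (for returning answers to pending questions) is a hypothetical name, not confirmed project API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SessionResponse:
    # Mirrors the fields documented above; stub for illustration only.
    status: str
    prompt: str
    questions: list = field(default_factory=list)
    synthesized_prompt: Optional[str] = None

class StubSession:
    """Stand-in for ClarificationSession; answer() is a hypothetical method."""
    def submit_prompt(self, prompt):
        self._prompt = prompt
        return SessionResponse("needs_clarification", prompt,
                               questions=["Which file format?"])
    def answer(self, answers):
        return SessionResponse("ready_to_synthesize", self._prompt)
    def synthesize(self):
        return SessionResponse("synthesized", self._prompt,
                               synthesized_prompt=f"{self._prompt} (format: .vti)")

def clarify(session, prompt, ask):
    """Drive a session until a usable prompt is available."""
    resp = session.submit_prompt(prompt)
    if resp.status == "clear":
        return resp.prompt
    while resp.status == "needs_clarification":
        resp = session.answer([ask(q) for q in resp.questions])
    resp = session.synthesize()
    return resp.synthesized_prompt or resp.prompt

result = clarify(StubSession(), "Read a VTK file and visualize it",
                 ask=lambda q: "VTK XML image data")
print(result)  # -> "Read a VTK file and visualize it (format: .vti)"
```

In a real application, `ask` would prompt the user; here it returns a canned answer so the loop terminates.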
Key files:

- `vtk_sequential_thinking/prompt_clarification/models.py`
- `vtk_sequential_thinking/prompt_clarification/clarifier.py`
- `vtk_sequential_thinking/prompt_clarification/session.py`
Key data:
Task:{id, task_type, description, search_query, depends_on, vtk_classes, from_prompt}DecompositionResult:{tasks, output_type, reasoning}
The decomposition session supports:

- `decompose(prompt)`
- `refine(modifications, additions)`
- `finalize()`
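To illustrate the `Task` shape, here is a hypothetical decomposition of the isosurface prompt as plain dicts, plus a small helper that orders tasks by `depends_on`. The field names come from the model above; the concrete values and the helper are illustrative, not project code:

```python
# Hypothetical tasks in the Task shape described above; task_type names
# and vtk_classes values are assumptions for illustration.
tasks = [
    {"id": "t1", "task_type": "reader", "description": "Read volume.vti",
     "search_query": "read vti image data", "depends_on": [],
     "vtk_classes": ["vtkXMLImageDataReader"], "from_prompt": "Read volume.vti"},
    {"id": "t3", "task_type": "render", "description": "Render the isosurface",
     "search_query": "render polydata", "depends_on": ["t2"],
     "vtk_classes": ["vtkPolyDataMapper", "vtkActor"], "from_prompt": "visualize"},
    {"id": "t2", "task_type": "filter", "description": "Isosurface at value 135",
     "search_query": "contour filter isosurface", "depends_on": ["t1"],
     "vtk_classes": ["vtkContourFilter"], "from_prompt": "isosurface at 135"},
]

def order_by_dependencies(tasks):
    """Topologically sort tasks so each runs after its depends_on entries."""
    by_id = {t["id"]: t for t in tasks}
    ordered, seen = [], set()
    def visit(t):
        if t["id"] in seen:
            return
        seen.add(t["id"])
        for dep in t["depends_on"]:
            visit(by_id[dep])
        ordered.append(t)
    for t in tasks:
        visit(t)
    return ordered

print([t["id"] for t in order_by_dependencies(tasks)])  # -> ['t1', 't2', 't3']
```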
Key files:

- `vtk_sequential_thinking/task_decomposition/models.py`
- `vtk_sequential_thinking/task_decomposition/prompts.py`
- `vtk_sequential_thinking/task_decomposition/decomposer.py`
- `vtk_sequential_thinking/task_decomposition/session.py`
Flow:

```
tasks[]
  -> Generator.generate(task)
     - (optional) retrieve examples via RAG
     - tool loop via MCP (VTK API grounding)
     - JSONProtocol decoding into TaskResult
  -> CodeAssembler.add_snippet(...)
  -> CodeAssembler.assemble() -> final_code
```
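The assembly step can be pictured with a toy stand-in for `CodeAssembler` that deduplicates import lines and concatenates per-task snippet bodies. This is a sketch of the idea only, not the project's actual implementation:

```python
class MiniAssembler:
    """Toy stand-in for CodeAssembler: collects per-task snippets,
    hoists and deduplicates import lines, then joins the bodies."""
    def __init__(self):
        self.imports = []  # ordered, deduplicated import lines
        self.bodies = []   # non-import code, one entry per task

    def add_snippet(self, task_id, code):
        body = []
        for line in code.strip().splitlines():
            if line.startswith(("import ", "from ")):
                if line not in self.imports:
                    self.imports.append(line)
            else:
                body.append(line)
        self.bodies.append(f"# --- {task_id} ---\n" + "\n".join(body))

    def assemble(self):
        return "\n".join(self.imports) + "\n\n" + "\n\n".join(self.bodies)

asm = MiniAssembler()
asm.add_snippet("t1", "import vtk\nreader = vtk.vtkXMLImageDataReader()")
asm.add_snippet("t2", "import vtk\ncontour = vtk.vtkContourFilter()")
print(asm.assemble())  # "import vtk" appears once, followed by both bodies
```

The real assembler additionally has task dependency and variable-wiring concerns to handle; this sketch only shows why assembly is a distinct step from per-task generation.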
Output:

- `PipelineResult.code`: final assembled code
- `PipelineResult.task_results`: per-task outputs
Key files:

- `vtk_sequential_thinking/sequential_generation/session.py`
- `vtk_sequential_thinking/sequential_generation/generator.py`
- `vtk_sequential_thinking/sequential_generation/code_assembler.py`
- `vtk_sequential_thinking/sequential_generation/models.py`
The CLI is implemented in `vtk_sequential_thinking/cli.py` using Typer.

- `vtk-st evaluate`: clarity evaluation only
- `vtk-st query`: interactive clarification (outputs the synthesized prompt)
- `vtk-st decompose`: prompt -> tasks JSON
- `vtk-st generate`: tasks JSON -> code
- `vtk-st pipeline`: clarify -> decompose -> generate
Tests are split into offline-safe unit tests and CLI-level integration tests:

- `tests/unit/`
- `tests/integration/`
Many integration tests monkeypatch external clients so they can run without live services.
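The offline-testing pattern looks roughly like this: swap the LLM client for a stub that returns canned responses, so the session logic can be exercised without network access or API keys. The `complete()` method on the stub and the `decompose_with()` helper are hypothetical names for illustration, not the project's actual interfaces:

```python
import json

class StubLLMClient:
    """Canned-response stand-in for the real LLM client (hypothetical
    complete() interface), so tests need no network or API key."""
    def __init__(self, canned):
        self.canned = canned
        self.calls = []  # record prompts for assertions

    def complete(self, prompt):
        self.calls.append(prompt)
        return self.canned

def decompose_with(client, prompt):
    # Hypothetical session logic: ask the client, parse JSON task list.
    raw = client.complete(f"Decompose into VTK tasks: {prompt}")
    return json.loads(raw)["tasks"]

def test_decompose_offline():
    canned = json.dumps({"tasks": [{"id": "t1", "task_type": "reader"}]})
    client = StubLLMClient(canned)
    tasks = decompose_with(client, "Read volume.vti")
    assert tasks[0]["id"] == "t1"
    assert client.calls  # the stub was actually consulted

test_decompose_offline()
```

With pytest, the same stub would typically be injected via `monkeypatch.setattr` on the module under test.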
- `examples/clarification_example.py`: clarification only
- `examples/decomposition_example.py`: decomposition/refinement only
- `examples/generation_example.py`: generation only
- `examples/pipeline_example.py`: full pipeline demonstration
The LLM client supports multiple providers:

- OpenAI
- Anthropic
- Google
- Local
Configuration is via `.env`:

```shell
# LLM Provider (choose one)
LLM_PROVIDER=anthropic  # anthropic, openai, google, local

# API Keys
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
GOOGLE_API_KEY=...

# Model Selection
ANTHROPIC_MODEL=...
OPENAI_MODEL=...
GOOGLE_MODEL=...

# VTK API docs (used by vtkapi-mcp tooling)
VTK_API_DOCS_PATH=data/vtk-python-docs.jsonl

# Qdrant (RAG)
QDRANT_URL=http://localhost:6333
QDRANT_CODE_COLLECTION=vtk_code
```

Programmatic usage of the three sessions:

```python
from vtk_sequential_thinking import (
    ClarificationSession,
    DecompositionSession,
    GenerationSession,
    LLMClient,
    MCPClient,
    load_config,
)

config = load_config()
llm_client = LLMClient(app_config=config)
mcp_client = MCPClient(app_config=config)

# 1) Clarify
clarify = ClarificationSession.from_config(config, llm_client=llm_client)
resp = clarify.submit_prompt("Read a VTK file and visualize it")
if resp.status != "clear":
    # In a real app, you'd iterate questions and then call synthesize()
    resp = clarify.synthesize()
synthesized_prompt = resp.prompt if resp.status == "clear" else (resp.synthesized_prompt or "")

# 2) Decompose
decomposer = DecompositionSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
decomp = decomposer.decompose(synthesized_prompt)

# 3) Generate
generator = GenerationSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
result = generator.generate(tasks=decomp.tasks, original_prompt=synthesized_prompt)
print(result.code)
```

Run tests, lint, and coverage:

```shell
uv run pytest tests
uv run ruff check vtk_sequential_thinking/ tests/
uv run pytest tests --cov=vtk_sequential_thinking --cov-report=term-missing
```

Dependencies:

- `pydantic` - data validation
- `python-dotenv` - environment configuration
- `typer` / `rich` - CLI
- `anthropic` / `openai` / `google-generativeai` - LLM providers
- `mcp` - MCP client for VTK API validation
- `vtkapi-mcp` - VTK API MCP server
- Indexing tools: use the parent vtk-rag repository.
- RAG requires Qdrant: `vtk_sequential_thinking.rag.client` expects a live Qdrant server.
- VTK API tooling: configure `VTK_API_DOCS_PATH` for MCP-based API grounding/validation.
- vtk-rag - Related RAG/indexing tooling
- vtkapi-mcp - VTK API validation MCP server
MIT