patrickoleary/vtk-sequential-thinking

VTK Sequential Thinking

RAG-based VTK Python code generation with prompt clarification, task decomposition, and sequential code generation.

Overview

This project turns a user prompt into runnable VTK Python code via three stages:

  • Prompt clarification (ClarificationSession)
  • Task decomposition (DecompositionSession)
  • Code generation (GenerationSession)

Quick Start

Prerequisites

  • Python 3.10+
  • uv - Fast Python package manager

# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

1. Run Setup

./setup.sh

This creates a .venv virtual environment using uv and installs dependencies interactively.

Or install manually:

# Create virtual environment
uv venv .venv
source .venv/bin/activate

# Install package with dev dependencies
uv pip install -e ".[dev]"

# Optional extras
uv pip install -e ".[llm]"   # LLM providers
uv pip install -e ".[mcp]"   # VTK API tooling
uv pip install -e ".[rag]"   # RAG (requires Qdrant)
uv pip install -e ".[vtk]"   # VTK runtime

# All extras
uv pip install -e ".[dev,llm,mcp,rag,vtk]"

2. Configure Environment

cp .env.example .env
# Edit .env with your LLM API key

3. Start Qdrant

docker run -d -p 6333:6333 qdrant/qdrant

4. Index Your Data

Before querying, index the VTK documentation into Qdrant. The data files are:

  • data/vtk-python-docs.jsonl (61 MB) - API documentation
  • data/raw/vtk-python-examples.jsonl (5.4 MB) - Code examples
  • data/raw/vtk-python-tests.jsonl (4.8 MB) - Test cases

Note: Indexing tools live in the parent vtk-rag repository; you need to build the Qdrant index before querying.
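A quick way to tell whether indexing is still needed is to check for the code collection by name. The helper below is an illustrative sketch, not part of this project's API; feed it the collection names reported by your Qdrant client:

```python
def needs_indexing(existing_collections, required="vtk_code"):
    """Return True if the required Qdrant collection is absent.

    `existing_collections` is any iterable of collection names, e.g. the
    names a Qdrant client reports when listing collections.
    """
    return required not in set(existing_collections)


# Only a docs collection exists, so the code collection still needs indexing
print(needs_indexing(["vtk_docs"]))              # True
print(needs_indexing(["vtk_docs", "vtk_code"]))  # False
```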

5. Use the CLI

source .venv/bin/activate
vtk-st --help

# Evaluate prompt clarity
vtk-st evaluate "Read a VTK file and visualize it"

# Clarify a prompt (interactive by default)
vtk-st query "Read a VTK file and visualize it"

# Decompose into tasks
vtk-st decompose "Read volume.vti and create an isosurface at value 135"

# Full pipeline
vtk-st pipeline "Read volume.vti and create an isosurface at value 135"

Repository Structure

vtk-sequential-thinking/
├── pyproject.toml
├── README.md
├── setup.sh
├── examples/
├── tests/
└── vtk_sequential_thinking/

Architecture

High-level pipeline

User Prompt
  -> ClarificationSession (optional, interactive)
  -> DecompositionSession (LLM + MCP tooling)
  -> GenerationSession (LLM + MCP + RAG)
  -> Python code output

Project structure (current)

vtk_sequential_thinking/
├── __init__.py
├── cli.py
├── config.py
├── llm/
│   ├── __init__.py
│   ├── client.py
│   └── json_protocol.py
├── mcp/
│   ├── __init__.py
│   ├── client.py
│   └── persistent_client.py
├── prompt_clarification/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── clarifier.py
│   └── session.py
├── task_decomposition/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── decomposer.py
│   └── session.py
├── sequential_generation/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── generator.py
│   ├── code_assembler.py
│   └── session.py
└── rag/
    ├── __init__.py
    ├── client.py
    ├── models.py
    └── ranking.py

Public API (library)

The library exports three “session” entry points:

  • ClarificationSession (prompt -> synthesized prompt)
  • DecompositionSession (prompt -> tasks)
  • GenerationSession (tasks -> code)

They are exported from vtk_sequential_thinking/__init__.py as aliases of the internal Session classes in each subpackage.


Stage 1: Prompt clarification

Key data:

  • SessionResponse.status: one of clear, needs_clarification, ready_to_synthesize, synthesized, restart, skipped
  • SessionResponse.prompt: the original prompt
  • SessionResponse.questions: pending questions (if any)
  • SessionResponse.synthesized_prompt: only set after synthesis
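The fields above can be pictured as a small data model. The project uses pydantic; the stdlib dataclass below is only an illustrative sketch of the same shape:

```python
from dataclasses import dataclass, field
from typing import Optional

# The status values listed above
STATUSES = {"clear", "needs_clarification", "ready_to_synthesize",
            "synthesized", "restart", "skipped"}

@dataclass
class SessionResponse:
    status: str                                     # one of STATUSES
    prompt: str                                     # the original prompt
    questions: list = field(default_factory=list)   # pending questions, if any
    synthesized_prompt: Optional[str] = None        # set only after synthesis

resp = SessionResponse(status="needs_clarification",
                       prompt="Read a VTK file and visualize it",
                       questions=["Which VTK file format?"])
```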

Key files:

  • vtk_sequential_thinking/prompt_clarification/models.py
  • vtk_sequential_thinking/prompt_clarification/clarifier.py
  • vtk_sequential_thinking/prompt_clarification/session.py

Stage 2: Task decomposition

Key data:

  • Task: {id, task_type, description, search_query, depends_on, vtk_classes, from_prompt}
  • DecompositionResult: {tasks, output_type, reasoning}
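Because each Task carries depends_on, downstream stages must visit tasks in dependency order. A hedged sketch of such an ordering (tasks shown as plain dicts; the generation session's actual scheduling may differ):

```python
def order_tasks(tasks):
    """Return task ids so that every task follows its dependencies.

    `tasks` is a list of dicts with `id` and `depends_on` keys, mirroring the
    Task fields above. Raises ValueError on a dependency cycle.
    """
    remaining = {t["id"]: set(t["depends_on"]) for t in tasks}
    ordered = []
    while remaining:
        # Tasks whose dependencies have all been emitted already
        ready = [tid for tid, deps in remaining.items() if deps <= set(ordered)]
        if not ready:
            raise ValueError("dependency cycle among tasks")
        for tid in sorted(ready):
            ordered.append(tid)
            del remaining[tid]
    return ordered

tasks = [
    {"id": "t2", "depends_on": ["t1"]},
    {"id": "t1", "depends_on": []},
    {"id": "t3", "depends_on": ["t1", "t2"]},
]
print(order_tasks(tasks))  # ['t1', 't2', 't3']
```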

The decomposition session supports:

  • decompose(prompt)
  • refine(modifications, additions)
  • finalize()

Key files:

  • vtk_sequential_thinking/task_decomposition/models.py
  • vtk_sequential_thinking/task_decomposition/prompts.py
  • vtk_sequential_thinking/task_decomposition/decomposer.py
  • vtk_sequential_thinking/task_decomposition/session.py

Stage 3: Sequential code generation

Flow:

tasks[]
  -> Generator.generate(task)
     - (optional) retrieve examples via RAG
     - tool loop via MCP (VTK API grounding)
     - JSONProtocol decoding into TaskResult
  -> CodeAssembler.add_snippet(...)
  -> CodeAssembler.assemble() -> final_code
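The assembler's role can be sketched as collecting per-task snippets and joining them into one script, with import lines hoisted and de-duplicated. This is an illustrative stand-in, not the project's actual CodeAssembler:

```python
class CodeAssembler:
    """Sketch: collect snippets, hoist `import` lines, emit one script."""

    def __init__(self):
        self.imports = []   # unique import lines, in first-seen order
        self.bodies = []    # non-import code, one entry per task

    def add_snippet(self, code):
        body = []
        for line in code.splitlines():
            if line.startswith(("import ", "from ")):
                if line not in self.imports:
                    self.imports.append(line)
            else:
                body.append(line)
        self.bodies.append("\n".join(body).strip())

    def assemble(self):
        return "\n".join(self.imports) + "\n\n" + "\n\n".join(b for b in self.bodies if b)

asm = CodeAssembler()
asm.add_snippet("import vtk\nreader = vtk.vtkXMLImageDataReader()")
asm.add_snippet("import vtk\ncontour = vtk.vtkContourFilter()")
print(asm.assemble())
```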

Output:

  • PipelineResult.code: final assembled code
  • PipelineResult.task_results: per-task outputs

Key files:

  • vtk_sequential_thinking/sequential_generation/session.py
  • vtk_sequential_thinking/sequential_generation/generator.py
  • vtk_sequential_thinking/sequential_generation/code_assembler.py
  • vtk_sequential_thinking/sequential_generation/models.py

CLI mapping

The CLI is implemented in vtk_sequential_thinking/cli.py using Typer.

  • vtk-st evaluate: clarity evaluation only
  • vtk-st query: interactive clarification (outputs synthesized prompt)
  • vtk-st decompose: prompt -> tasks JSON
  • vtk-st generate: tasks JSON -> code
  • vtk-st pipeline: clarify -> decompose -> generate

Tests

Tests are split into offline-safe unit tests and CLI-level integration tests:

  • tests/unit/
  • tests/integration/

Many integration tests monkeypatch external clients so they can run without live services.
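The same pattern can be sketched with the standard library: inject a stub in place of the live client so a test exercises the surrounding logic offline. The class and method names below are illustrative, not this project's actual test fixtures:

```python
from unittest.mock import MagicMock

# Stand-in client: `complete` is an assumed method name, not the real LLMClient API
fake_llm = MagicMock()
fake_llm.complete.return_value = '{"status": "clear"}'

def evaluate_clarity(client, prompt):
    """Toy pipeline step that consults whatever client is injected."""
    return client.complete(prompt)

# Runs with no API key and no network
result = evaluate_clarity(fake_llm, "Read a VTK file")
print(result)  # {"status": "clear"}
```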


Examples

  • examples/clarification_example.py: clarification only
  • examples/decomposition_example.py: decomposition/refinement only
  • examples/generation_example.py: generation only
  • examples/pipeline_example.py: full pipeline demonstration

LLM providers

The LLM client supports multiple providers:

  • OpenAI
  • Anthropic
  • Google
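Provider selection keys off the LLM_PROVIDER environment variable (see Configuration below). A minimal dispatch sketch; the mapping and helper name are illustrative, not the real client's internals:

```python
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}

def required_key(provider):
    """Return the env var holding the API key for a provider, or raise."""
    try:
        return PROVIDER_KEYS[provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider!r}") from None

print(required_key("anthropic"))  # ANTHROPIC_API_KEY
```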

Configuration

Environment Variables (.env)

# LLM Provider (choose one)
LLM_PROVIDER=anthropic          # anthropic, openai, google, local

# API Keys
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
GOOGLE_API_KEY=...

# Model Selection
ANTHROPIC_MODEL=...
OPENAI_MODEL=...
GOOGLE_MODEL=...

# VTK API docs (used by vtkapi-mcp tooling)
VTK_API_DOCS_PATH=data/vtk-python-docs.jsonl

# Qdrant (RAG)
QDRANT_URL=http://localhost:6333
QDRANT_CODE_COLLECTION=vtk_code
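The project loads these values with python-dotenv; the stdlib sketch below shows the same idea in simplified form (no quoting or inline-comment handling):

```python
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a dotenv-style file into os.environ.

    Skips blanks and `#` comments. Existing variables are not overwritten,
    matching python-dotenv's default behaviour (override=False).
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```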

Usage Examples

Programmatic usage

from vtk_sequential_thinking import (
    ClarificationSession,
    DecompositionSession,
    GenerationSession,
    LLMClient,
    MCPClient,
    load_config,
)

config = load_config()
llm_client = LLMClient(app_config=config)
mcp_client = MCPClient(app_config=config)

# 1) Clarify
clarify = ClarificationSession.from_config(config, llm_client=llm_client)
resp = clarify.submit_prompt("Read a VTK file and visualize it")
if resp.status != "clear":
    # In a real app, you'd iterate questions and then call synthesize()
    resp = clarify.synthesize()
synthesized_prompt = resp.prompt if resp.status == "clear" else (resp.synthesized_prompt or "")

# 2) Decompose
decomposer = DecompositionSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
decomp = decomposer.decompose(synthesized_prompt)

# 3) Generate
generator = GenerationSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
result = generator.generate(tasks=decomp.tasks, original_prompt=synthesized_prompt)
print(result.code)

Development

Tests

uv run pytest tests

Lint

uv run ruff check vtk_sequential_thinking/ tests/

Coverage (terminal)

uv run pytest tests --cov=vtk_sequential_thinking --cov-report=term-missing

Core Dependencies

  • pydantic - Data validation
  • python-dotenv - Environment configuration
  • typer / rich - CLI
  • anthropic / openai / google-generativeai - LLM providers
  • mcp - MCP client for VTK API validation
  • vtkapi-mcp - VTK API MCP server

Not Included

  • Indexing Tools - Use parent vtk-rag repository

Notes

  • RAG requires Qdrant: vtk_sequential_thinking.rag.client expects a live Qdrant server.
  • VTK API tooling: configure VTK_API_DOCS_PATH for MCP-based API grounding/validation.

Related Projects


License

MIT
