Learning sandbox: multi-agent Six Thinking Hats evaluation system built on dapr-agents

xverges/sixhats

Six Thinking Hats - Multi-Agent Evaluation System

A multi-agent system for evaluating scenarios using Edward de Bono's Six Thinking Hats framework, built on dapr-agents.

Overview

This system orchestrates multiple AI agents to analyze scenarios from six perspectives:

Hat        Thinking Style
⚪ White    Facts, data, information
🔴 Red      Emotions, intuition, gut feelings
⚫ Black    Caution, risks, problems
🟡 Yellow   Benefits, optimism, value
🟢 Green    Creativity, alternatives, new ideas
🔵 Blue     Process control, synthesis, decisions
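The six perspectives map naturally onto a small enumeration. A hypothetical sketch (the project's actual models live under src/schemas/ and may look different):

```python
from enum import Enum

class Hat(str, Enum):
    """The six thinking-hat perspectives (illustrative names, not the repo's)."""
    WHITE = "facts"       # data, information
    RED = "emotions"      # intuition, gut feelings
    BLACK = "risks"       # caution, problems
    YELLOW = "benefits"   # optimism, value
    GREEN = "creativity"  # alternatives, new ideas
    BLUE = "process"      # control, synthesis, decisions

# Example: the Black Hat focuses on risks
print(Hat.BLACK.value)  # → risks
```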

Key Features

  • Parallel agents per hat: Multiple personas contribute in parallel
  • Structured synthesis: Raw contributions → aggregated insights → decisions
  • Human-in-the-loop: Act as Blue Hat, or run fully automated
  • Observable: Full tracing with Phoenix Arize
  • Evaluable: Automated quality scoring and config comparison

Project Status

🚧 In Development - Phase 2 in progress

Completed:

  • ✅ Workspace schema with append-only contributions (ADR-010)
  • ✅ Black Hat agent implementation with LLM integration
  • ✅ Dapr integration for LLM calls via sidecar
  • ✅ End-to-end demo working (examples/black_hat_demo.py)
  • ✅ Phoenix Arize tracing infrastructure (ADR-006)
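The append-only workspace from ADR-010 can be sketched in a few lines. This is a stdlib stand-in using dataclasses (the repo's real schemas are Pydantic models in src/schemas/workspace.py, and all field names here are assumptions):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Contribution:
    """One immutable agent contribution (illustrative fields, not ADR-010's)."""
    hat: str
    persona: str
    content: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

@dataclass
class Workspace:
    """Append-only: contributions are added, never edited or removed."""
    scenario: str
    contributions: list[Contribution] = field(default_factory=list)
    audit: list[str] = field(default_factory=list)

    def append(self, c: Contribution) -> None:
        self.contributions.append(c)
        self.audit.append(f"{c.created_at} {c.hat}/{c.persona} contributed")

ws = Workspace(scenario="Zero Trust Security Implementation")
ws.append(Contribution(hat="black", persona="security-auditor",
                       content="Legacy systems may not support mTLS."))
```

The frozen dataclass enforces immutability of individual contributions, which is what makes the audit trail trustworthy.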

Next:

  • Evaluation framework (LLM-as-judge scoring)
  • Token tracking in agent spans
  • Fan-out to multiple personas per hat

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Orchestrator                            │
│                  (Dapr Workflow)                            │
└─────────────────────────┬───────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        │                 │                 │
        ▼                 ▼                 ▼
   ┌─────────┐      ┌─────────┐      ┌─────────┐
   │ Agent 1 │      │ Agent 2 │      │ Agent 3 │
   │(persona)│      │(persona)│      │(persona)│
   └────┬────┘      └────┬────┘      └────┬────┘
        │                │                │
        └────────────────┼────────────────┘
                         │
                         ▼
                  ┌─────────────┐
                  │ Aggregator  │
                  └──────┬──────┘
                         │
                         ▼
                  ┌─────────────┐
                  │  Workspace  │
                  │   (State)   │
                  └─────────────┘

See Architecture Decision Records for detailed design decisions.
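The fan-out/fan-in shape above is the scatter-gather pattern. In the real system the orchestrator is a Dapr workflow, but the control flow can be sketched with stdlib asyncio (every name below is illustrative, not the repo's API):

```python
import asyncio

async def run_persona(persona: str, scenario: str) -> str:
    """Stand-in for one agent activity (a real LLM call in the actual system)."""
    await asyncio.sleep(0)  # simulate I/O latency
    return f"{persona}: risk analysis of {scenario!r}"

async def aggregate(raw: list[str]) -> str:
    """Stand-in for the Aggregator: raw contributions → synthesized insight."""
    return " | ".join(raw)

async def orchestrate(scenario: str) -> str:
    personas = ["skeptic", "auditor", "ops-engineer"]
    # Scatter: run all personas for this hat in parallel
    raw = await asyncio.gather(*(run_persona(p, scenario) for p in personas))
    # Gather: synthesize the contributions before writing to the workspace
    return await aggregate(list(raw))

result = asyncio.run(orchestrate("Zero Trust"))
```

A Dapr workflow adds durability on top of this shape: each activity call is checkpointed, so a crashed orchestration resumes instead of restarting.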

Quick Start

Prerequisites

  • Python 3.11+
  • Dapr CLI
  • uv package manager
  • OpenAI API key (or other LLM provider)

Installation

# Clone and enter the repo
git clone https://github.com/xverges/sixhats.git
cd sixhats

# Install dependencies
uv sync

# Initialize Dapr
dapr init

# Create a secrets file with your API key
# (one level above the repo, so it stays out of version control)
echo '{"openai-api-key": "sk-proj-your-key"}' > ../secrets.json
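For Dapr to serve that file to the app, the ./components directory typically contains a local-file secret store component along the lines of the sketch below (the component name secretstore is an assumption, not confirmed by the repo):

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: secretstore
spec:
  type: secretstores.local.file
  version: v1
  metadata:
  - name: secretStoreFilePath
    value: ../secrets.json
```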

Running the Black Hat Demo

# Start Phoenix Arize for observability (optional)
uv run phoenix serve

# Run the Black Hat agent demo via Dapr
dapr run --app-id black-hat-demo --resources-path ./components -- \
    uv run python examples/black_hat_demo.py

The demo:

  1. Creates a scenario (Zero Trust Security Implementation)
  2. Runs the Black Hat agent to identify risks
  3. Displays the workspace with contribution and audit trail
  4. Emits traces viewable in Phoenix at http://localhost:6006

Project Structure

sixhats/
├── docs/
│   └── adr/                 # Architecture Decision Records
├── src/
│   ├── agents/              # Hat agent implementations
│   │   ├── base.py          # Base agent class
│   │   └── black_hat.py     # Black Hat (risks/problems)
│   ├── schemas/             # Pydantic models
│   │   └── workspace.py     # Workspace, Contribution, Audit
│   ├── observability/       # Tracing configuration
│   │   └── tracing.py       # Phoenix/OTEL setup
│   ├── services/            # Business logic services
│   ├── workflows/           # Dapr workflow definitions
│   └── evals/               # Evaluation framework
├── components/              # Dapr component configurations
├── examples/                # Runnable demos
│   └── black_hat_demo.py    # End-to-end Black Hat demo
├── tests/                   # Test suite
└── scripts/                 # Dev utilities

Architecture Decision Records

See docs/adr/ for all architectural decisions. Run uv run scripts/adr-list.py to list accepted ADRs with their rules.

Action Plan

Phase 1: Foundation ✅

  • Set up local Dapr environment
  • Define workspace schema (ADR-010)
  • Establish observability strategy (ADR-006)

Phase 2: Single Hat Prototype 🔄

  • Implement Black Hat agent
  • Basic workspace with contributions
  • End-to-end demo with real LLM
  • Fan-out to 3 personas
  • Aggregator for synthesis

Phase 3: Evaluation ⬜

  • Structural validation (schema conformance)
  • LLM-as-judge scoring for hat outputs
  • Benchmark scenarios with expected themes
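LLM-as-judge scoring asks a second model to rate an agent's output against a rubric. A minimal sketch with a stubbed judge call (the rubric, score scale, and function names are assumptions, not the project's eval framework):

```python
import json

RUBRIC = (
    "Score the Black Hat output 1-5 for: specificity of risks, "
    "relevance to the scenario, and absence of off-hat content. "
    'Reply as JSON: {"score": <int>, "reason": "<short>"}'
)

def call_judge_llm(prompt: str) -> str:
    """Stub for a real LLM call; returns a canned judgment for illustration."""
    return '{"score": 4, "reason": "Concrete risks, minor vagueness."}'

def judge(scenario: str, output: str) -> dict:
    prompt = f"{RUBRIC}\n\nScenario: {scenario}\n\nOutput: {output}"
    verdict = json.loads(call_judge_llm(prompt))
    # Structural validation: reject malformed or out-of-range judgments
    assert 1 <= verdict["score"] <= 5, "judge returned out-of-range score"
    return verdict

verdict = judge("Zero Trust rollout", "Risk: legacy apps lack mTLS support.")
```

Pairing this with benchmark scenarios that have expected themes lets the judge's scores be spot-checked against known-good answers.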

Phase 4: Observability 🔄

  • OpenTelemetry tracing setup
  • Phoenix Arize integration
  • Token tracking in agent spans
  • Cost estimation per run

Phase 5: Full Workflow ⬜

  • All 6 hats implemented
  • Human-in-the-loop pause/resume
  • End-to-end orchestrated run

Phase 6: Polish ⬜

  • Documentation
  • Demo video
  • Blog post

Learning Goals

This project demonstrates:

  • Multi-agent orchestration with dapr-agents
  • Distributed systems patterns (scatter-gather, saga, actor model)
  • LLM application observability (tracing, metrics, cost tracking)
  • Evaluation frameworks for AI systems
  • Production-grade architecture (failure handling, state management)

License

MIT
