AB-MCTS & Multi-Model Reasoning Engine for Open WebUI

Advanced reasoning models for Open WebUI using Adaptive Branching Monte Carlo Tree Search (AB-MCTS) and Multi-Model collaboration.

🎯 Project Overview

This project implements Sakana AI's AB-MCTS (Adaptive Branching Monte Carlo Tree Search) algorithm and a Multi-Model collaboration system, both integrated with Open WebUI as selectable AI models for advanced reasoning and decision-making.

Key Features

  • AB-MCTS Pipeline: Advanced tree search with LLM-as-judge quality evaluation
    • Multi-criterion evaluation (accuracy, completeness, clarity, relevance)
    • Configurable criterion weights
    • Support for 1-2 judge models for consensus
    • Real-time tree visualization
  • Multi-Model Pipeline: Multi-model collaboration for comprehensive answers
  • OpenAI-Compatible API: Native integration with Open WebUI's model system
  • Real-time Monitoring: Prometheus metrics and Grafana dashboards
  • Experiment Logging: SQLite + JSONL run tracking for research and analysis
  • Interactive Dashboard: Configure models, judges, and visualize search trees

πŸ—οΈ Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Open WebUI Interface                     │
├─────────────────────────────────────────────────────────────┤
│  Model Selection:                                           │
│  ┌─────────────────┐  ┌─────────────────────┐               │
│  │   ab-mcts       │  │  multi-model        │               │
│  │                 │  │                     │               │
│  │ • Tree Search   │  │ • Collaboration     │               │
│  │ • Deep Analysis │  │ • Multi-perspective │               │
│  │ • Best Quality  │  │ • Comprehensive     │               │
│  └─────────────────┘  └─────────────────────┘               │
└─────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────┐
│              Model Integration Service (8098)               │
│                  OpenAI-Compatible API                      │
└─────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┴───────────────┐
                ▼                               ▼
┌─────────────────────────┐    ┌─────────────────────────┐
│   AB-MCTS Service       │    │  Multi-Model Service    │
│   (port 8094)           │    │  (port 8090)            │
│                         │    │                         │
│ • TreeQuest Algorithm   │    │ • Direct Collaboration  │
│ • Thompson Sampling     │    │ • Model Voting          │
│ • Anti-Hallucination    │    │ • Synthesis             │
└─────────────────────────┘    └─────────────────────────┘
                │                               │
                └───────────────┬───────────────┘
                                ▼
┌─────────────────────────────────────────────────────────────┐
│                        Ollama                               │
│              Local LLM Inference Engine                     │
└─────────────────────────────────────────────────────────────┘

πŸ“ Project Structure

openwebui-setup/
├── README.md                           # This file
├── docker-compose.yml                  # Docker orchestration
├── Dockerfile                          # Container definition
├── requirements.txt                    # Python dependencies
├── backend/
│   ├── api/
│   │   └── main.py                     # Management API (port 8095)
│   ├── services/
│   │   ├── proper_treequest_ab_mcts_service.py  # AB-MCTS (port 8094)
│   │   ├── proper_multi_model_service.py        # Multi-Model (port 8090)
│   │   ├── experiment_logger.py                 # Run logging
│   │   └── config_manager.py                    # Configuration
│   ├── model_integration.py           # OpenAI-compatible model API (8098)
│   └── openwebui_integration.py       # Tool endpoints (8097)
├── interfaces/
│   ├── dashboard.html                  # Management dashboard
│   └── idiots_guide.html              # Setup guide
└── logs/                               # Experiment logs and runs

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • Ollama running locally (port 11434)
  • Recommended models: llama3.2:latest, qwen2.5:latest, deepseek-r1:1.5b

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/openwebui-setup.git
cd openwebui-setup
  2. Pull Ollama models:
ollama pull llama3.2:latest
ollama pull qwen2.5:latest
ollama pull deepseek-r1:1.5b
  3. Start all services:
docker-compose up -d
  4. Verify services:
docker-compose ps

All services should show "Up" status.
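
Beyond docker-compose ps, you can also probe the documented health endpoints directly; a quick sketch using the service ports listed later in this README:

# Health checks for the core services
curl http://localhost:8098/health   # Model Integration
curl http://localhost:8094/health   # AB-MCTS Service
curl http://localhost:8090/health   # Multi-Model Service

Each request should return successfully once the corresponding container is up.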

Connecting to Open WebUI

  1. Open Open WebUI at http://localhost:3000

  2. Add the model provider:

    • Click your profile β†’ Settings
    • Go to Connections
    • Click + Add Connection
    • Select OpenAI
    • API Base URL: http://model-integration:8098
    • API Key: dummy-key (any value works)
    • Click Verify Connection β†’ Should show "βœ“ Connected"
    • Click Save
  3. Select a model:

    • Start a new chat
    • Click the model dropdown
    • Select either:
      • ab-mcts - Advanced tree search reasoning
      • multi-model - Collaborative AI

Using the Models

AB-MCTS

Best for:

  • Complex problem solving
  • Multi-step reasoning
  • Strategic planning
  • Mathematical proofs
  • Decision trees

Example queries:

  • "Design a distributed caching system for a social media platform"
  • "Prove that the square root of 2 is irrational"
  • "What's the optimal strategy for a two-player game where..."

Note: Responses may take 30-120 seconds due to tree search exploration.

Multi-Model

Best for:

  • Comprehensive analysis
  • Multiple perspectives
  • Research questions
  • Balanced viewpoints
  • Faster responses

Example queries:

  • "Compare microservices vs monolithic architectures"
  • "Analyze the pros and cons of remote work"
  • "Explain quantum computing to different audiences"

🔧 Services

  • Open WebUI (port 3000) - Main chat interface
  • Model Integration (port 8098) - OpenAI-compatible model API
  • AB-MCTS Service (port 8094) - TreeQuest AB-MCTS implementation
  • Multi-Model Service (port 8090) - Multi-model collaboration
  • Backend API (port 8095) - Management dashboard API
  • MCP Server (port 8096) - Model Context Protocol bridge
  • Prometheus (port 9090) - Metrics collection
  • Grafana (port 3001) - Dashboards and visualization
  • HTTP Server (port 8081) - Static interfaces

βš™οΈ Configuration

Using the Dashboard (Recommended)

Access the interactive dashboard at http://localhost:8081/dashboard.html

Features:

  • Model Selection: Choose which Ollama models power each service
  • Judge Configuration: Select 1-2 LLMs to evaluate solution quality
  • Criterion Weights: Adjust importance of accuracy, completeness, clarity, and relevance
  • Search Parameters: Configure iterations and max depth
  • Tree Visualization: View AB-MCTS search trees (Sakana AI style)
  • Run History: Browse past queries and their exploration trees

AB-MCTS Parameters

Configure via Dashboard or API:

curl -X POST http://localhost:8094/params/update \
  -H "Content-Type: application/json" \
  -d '{
    "iterations": 20,
    "max_depth": 5
  }'

Parameters:

  • iterations: Number of search iterations (1-100, default: 20)
    • Higher = better quality, slower response
    • Recommended: 10-20 for most queries
  • max_depth: Maximum tree depth (1-20, default: 5)
    • Higher = deeper reasoning, slower response
    • Recommended: 3-5 for most queries

Judge Models (LLM-as-Judge)

AB-MCTS uses LLM judges to evaluate solution quality on 4 criteria:

Criteria:

  1. Accuracy: Is it factually correct?
  2. Completeness: Does it fully answer the question?
  3. Clarity: Is it well-explained and understandable?
  4. Relevance: Is it on-topic and addresses the query?

Configuration:

# Set judge models (1-2 recommended for consensus)
curl -X POST http://localhost:8094/judges/update \
  -H "Content-Type: application/json" \
  -d '{"judge_models": ["qwen3:0.6b"]}'

# Adjust criterion weights (auto-normalizes to 100%)
curl -X POST http://localhost:8094/weights/update \
  -H "Content-Type: application/json" \
  -d '{
    "weights": {
      "accuracy": 0.4,
      "completeness": 0.3,
      "clarity": 0.2,
      "relevance": 0.1
    }
  }'

Notes:

  • Using 2 judges provides consensus and reduces bias
  • Weights persist across restarts
  • All settings are managed in the dashboard UI

Model Selection

Update which Ollama models each service uses:

# AB-MCTS models
curl -X POST http://localhost:8094/models/update \
  -H "Content-Type: application/json" \
  -d '{"models": ["llama3.2:latest", "qwen2.5:latest"]}'

# Multi-Model models
curl -X POST http://localhost:8090/models/update \
  -H "Content-Type: application/json" \
  -d '{"models": ["llama3.2:latest", "qwen2.5:latest", "deepseek-r1:1.5b"]}'

📊 Monitoring

Prometheus Metrics

Access Prometheus at http://localhost:9090

Key metrics:

  • model_integration_requests_total - Total requests by model
  • model_integration_success_total - Successful responses
  • model_integration_failures_total - Failed responses
  • model_integration_latency_seconds - Response time histogram
  • model_integration_active_queries - Current active queries

Grafana Dashboards

Access Grafana at http://localhost:3001 (credentials: admin/admin)

Pre-configured dashboards:

  • Request rates and success rates
  • Latency percentiles (p50, p95, p99)
  • Active query monitoring
  • Error rates by type
  • Service health status

Experiment Logs & Tree Visualization

All AB-MCTS runs are logged with complete search tree data:

  • logs/runs.db - SQLite index
  • logs/runs/YYYYMMDD/run_<id>.jsonl - Event stream per run
  • logs/selected_models_abmcts.json - Persisted configuration

View in Dashboard:

  1. Go to http://localhost:8081/dashboard.html
  2. Click "Research Explorer" tab
  3. Click any run to view:
    • Full hierarchical search tree visualization (Sakana AI style)
    • Per-node quality scores and judge evaluations
    • Model performance across iterations
    • Complete response text for each node

Tree Visualization Features:

  • D3.js interactive tree graph
  • Color-coded by model and quality
  • Zoom and pan navigation
  • Click nodes to see full details
  • Shows parent-child relationships
  • Identifies best solution path

API Access:

  • List runs: GET http://localhost:8094/runs?limit=50
  • Run details: GET http://localhost:8094/runs/{run_id}
  • Tree data: GET http://localhost:8094/runs/{run_id}/tree

πŸ› Troubleshooting

Models not appearing in Open WebUI

Check model integration service:

curl http://localhost:8098/health
curl http://localhost:8098/v1/models

Verify Open WebUI connection:

  • Settings β†’ Connections β†’ Verify the connection shows "βœ“ Connected"
  • Try refreshing the page
  • Check browser console for errors

Slow responses

Reduce AB-MCTS iterations:

curl -X POST http://localhost:8095/api/config \
  -d '{"ab_mcts_iterations": 10, "ab_mcts_max_depth": 3}'

Use faster Ollama models:

ollama pull llama3.2:1b  # Smaller, faster model

Check Ollama performance:

time curl http://localhost:11434/api/generate \
  -d '{"model":"llama3.2:latest","prompt":"test","stream":false}'

Service connection errors

Check all services are running:

docker-compose ps

View service logs:

docker logs model-integration
docker logs ab-mcts-service
docker logs multi-model-service

Restart services:

docker-compose restart

🚧 Known Issues

  • Timeouts: AB-MCTS can take 30-120s on complex queries (streaming keeps UI responsive)
  • Verbosity: AB-MCTS responses can be lengthy (working on length controls)
  • Quality drift: Occasional hallucinations (add stricter validation)

📚 API Reference

Model Integration Service (port 8098)

OpenAI-Compatible Endpoints:

  • GET /v1/models - List available models
  • POST /v1/chat/completions - Chat completions

Management Endpoints:

  • GET /health - Health check
  • GET /metrics - Prometheus metrics
  • GET /performance - Performance statistics
  • GET /config - Current configuration
  • POST /config - Update configuration

AB-MCTS Service (port 8094)

  • POST /query - Run AB-MCTS query
    • Body: {"query": "...", "iterations": 20, "max_depth": 5}
  • GET /models - List available models
  • POST /models/update - Update model selection
  • GET /health - Health check
  • GET /metrics - Prometheus metrics

Multi-Model Service (port 8090)

  • POST /query - Run multi-model query
    • Body: {"query": "..."}
  • GET /models - List available models
  • POST /models/update - Update model selection
  • GET /health - Health check
  • GET /metrics - Prometheus metrics

🤝 Related Projects

📄 License

MIT License - See LICENSE file for details.

πŸ™ Acknowledgments

📖 Additional Documentation

  • ARCHITECTURE.md - Detailed architecture and design
  • API_REFERENCE.md - Complete API documentation
  • DEPLOYMENT.md - Production deployment guide
  • docs/research/RESEARCH_GUIDE.md - Research and analysis guide
