Veritas: A Scientist for Autonomous Research

Veritas is an autonomous scientist, a system for fully automated scientific discovery that enables large language models (LLMs) to conduct research independently.

Started under the advisement of Felipe Muñoz Medina during his postdoctoral research at the Harvard T.H. Chan School of Public Health, Veritas was later selected among the top 1% of projects at Major League Hacking’s Open Source Hackathon 2025.

Tree Search System

The tree search system provides:

  • Best-First Tree Search (BFTS): Explores multiple solution paths simultaneously
  • Multi-Stage Pipeline: Automatic progression through 4 research stages
  • Parallel Execution: Process multiple experiments in parallel
  • Automatic Paper Generation: LaTeX papers with citations and figures
  • Error Recovery: Automatic debugging and code improvement
  • Visualization: Rich tree visualization of experiment progress
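
To make the first bullet concrete, here is a minimal, generic sketch of best-first search over a tree of candidate solutions. It is not Veritas's implementation: `best_first_search`, `toy_expand`, and `toy_score` are hypothetical names chosen for illustration, and the integer "experiments" stand in for real experiment nodes. The idea is only that the open node with the highest score is always expanded next, while lower-scoring branches stay in the frontier.

```python
import heapq
from itertools import count
from typing import Callable, Iterable, Tuple, TypeVar

Node = TypeVar("Node")


def best_first_search(
    root: Node,
    expand: Callable[[Node], Iterable[Node]],
    score: Callable[[Node], float],
    max_expansions: int = 50,
) -> Tuple[Node, float]:
    """Repeatedly expand the highest-scoring open node; return the best node seen."""
    tie = count()  # tie-breaker so heapq never has to compare nodes directly
    # heapq is a min-heap, so scores are negated to pop the best node first.
    frontier = [(-score(root), next(tie), root)]
    best, best_score = root, score(root)

    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for child in expand(node):
            child_score = score(child)
            if child_score > best_score:
                best, best_score = child, child_score
            heapq.heappush(frontier, (-child_score, next(tie), child))

    return best, best_score


# Toy problem (hypothetical): nodes are integers, the "metric" rewards
# being close to 37, and each node has up to two children.
def toy_expand(n: int) -> list:
    return [c for c in (n + 1, n * 2) if c <= 100]


def toy_score(n: int) -> float:
    return -abs(n - 37)


if __name__ == "__main__":
    print(best_first_search(1, toy_expand, toy_score))
```

In a real run the score would be an experiment metric and `expand` would generate follow-up experiments, but those details depend on the Veritas internals and are not shown here.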

Quick Start with Tree Search

from veritas.cognition.treesearch import launch_veritas

# Run complete experiment with tree search
results = launch_veritas(
    idea_file="my_idea.json",
    workspace_dir="experiments/my_experiment",
    model="mistral-local-rag",
    use_tree_search=True
)

print(f"Best metric: {results['best_node']['metric']}")
print(f"PDF paper: {results['pdf_path']}")

See docs/QUICK_START.md for detailed usage.

Installation

git clone https://github.com/matiasrodlo/veritas.git
cd veritas
python scripts/install.py

Optional: Download Mistral model (13GB+)

python scripts/install.py --download-model

Quick Start

# RAG system
python scripts/run.py

# Veritas Cognition
python scripts/run.py --system cognition

Usage

RAG System

from veritas import RAGSystem

rag = RAGSystem(
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    llm_model="models/mistral-7b",
    index_path="models/faiss",
    device="mps"  # Apple Silicon GPU (Metal Performance Shaders)
)

response = rag.generate_rag_response(
    query="How does a RAG system work?",
    top_k=5,
    max_new_tokens=200
)
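
As a rough illustration of what `top_k` controls in a call like the one above, the sketch below ranks document vectors by cosine similarity to a query vector and keeps the `k` best. This is a toy, pure-Python stand-in and an assumption about the general technique: it does not use FAISS or a real embedding model, and `cosine` and `top_k_documents` are hypothetical helpers, not part of the Veritas API.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs for the k most similar documents."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k_documents([1.0, 0.0], docs, k=2))
```

In a RAG pipeline the retrieved documents are then placed in the prompt that the LLM completes; a vector index such as FAISS replaces the linear scan used here.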

Tree Search (Recommended)

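import json

from veritas.cognition.treesearch import launch_veritas

# Create idea file (JSON)
idea = {
    "Title": "My Research Project",
    "Abstract": "Research description",
    "Short Hypothesis": "What we're testing",
    "Experiments": "Experiment description"
}

# Write the idea to disk so that idea_file below points at a real file
with open("idea.json", "w", encoding="utf-8") as f:
    json.dump(idea, f, indent=2)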

# Run experiment
results = launch_veritas(
    idea_file="idea.json",
    workspace_dir="experiments/my_experiment",
    use_tree_search=True
)

Requirements

  • Python 3.9+
  • 16GB+ RAM minimum (128GB recommended for optimal performance)
  • The configuration is optimized for an Apple M4 Max with 128GB RAM

Production

# Docker
docker-compose up -d

# API Server
python scripts/run_api.py --host 0.0.0.0 --port 8000

Implementation notes

  • Working directory: Code execution runs in a dedicated child process. The interpreter changes the process working directory to the experiment workspace; this is process-wide and not thread-safe. Use absolute paths when coordinating file I/O across components.
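
The note above can be illustrated with a small sketch. It is written under assumptions: `run_in_workspace` and `demo` are hypothetical helpers, and the child's working directory is set through the `cwd` argument of `subprocess.run` rather than by whatever mechanism the Veritas interpreter uses internally. The point is that the child resolves relative paths against the workspace, the parent's working directory is unaffected, and the parent reaches the child's output through an absolute path.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def run_in_workspace(code: str, workspace: Path) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter whose working directory is `workspace`."""
    workspace = workspace.resolve()
    workspace.mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        [sys.executable, "-c", code],
        cwd=workspace,        # only the child process sees this directory change
        capture_output=True,
        text=True,
        check=True,
    )


def demo() -> Tuple[bool, str]:
    parent_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp).resolve() / "workspace"
        result_path = workspace / "result.txt"  # absolute path held by the parent

        # The child writes with a relative path, which resolves against the workspace.
        child_code = "from pathlib import Path; Path('result.txt').write_text('ok')"
        run_in_workspace(child_code, workspace)

        # The parent reads the same file through the absolute path.
        content = result_path.read_text()
    return os.getcwd() == parent_cwd, content


if __name__ == "__main__":
    print(demo())
```

Calling `os.chdir` from a thread inside a shared process would instead move every thread's working directory at once, which is why isolating the change in a child process and passing absolute paths between components is the safer pattern.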

License

MIT License
