    Repositories list

    • inspect-action

      Public
      Running UK AISI's Inspect in the Cloud
      Python
      MIT License
      10223420 Updated Mar 17, 2026
    • inspect_scout

      Public
      Python
      MIT License
      12100 Updated Mar 17, 2026
    • inspect_ai

      Public
      Inspect: A framework for large language model evaluations
      Python
      MIT License
      433402 Updated Mar 17, 2026
    • A collection of METR wrappers around Inspect agents and of METR scanners for Inspect Scout. Intended to allow consistent usage and customization.
      Python
      1544 Updated Mar 16, 2026
    • inspect-metr-task-bridge

      Public
      Python
      11111 Updated Mar 16, 2026
    • macOS GUI to view large inspect samples. Integrated with Hawk
      Swift
      0000 Updated Mar 13, 2026
    • triframe_inspect

      Public
      Python
      0470 Updated Mar 11, 2026
    • A Kubernetes sandbox environment for use with inspect_ai
      Python
      MIT License
      20200 Updated Mar 10, 2026
    • Inspect tasks <> Tinker RL envs
      Python
      MIT License
      1600 Updated Mar 10, 2026
    • Python
      Other
      8495 Updated Mar 6, 2026
    • Public repository containing METR's DVC pipeline for eval data analysis
      Python
      4524283 Updated Mar 6, 2026
    • Python
      1401 Updated Feb 24, 2026
    • Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study…
      Python
      21401 Updated Feb 23, 2026
    • Demo repo for transcripts analysis OCO for 3PRA
      Python
      0000 Updated Feb 20, 2026
    • Post-training with Tinker
      Python
      Apache License 2.0
      351000 Updated Feb 18, 2026
    • vivaria

      Public
      Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
      TypeScript
      MIT License
      381352185 Updated Feb 15, 2026
    • HCL
      Apache License 2.0
      28000 Updated Feb 6, 2026
    • SCSS
      MIT License
      4302 Updated Feb 4, 2026
    • Datadog MCP Server - Comprehensive monitoring and observability tools for Datadog via Model Context Protocol
      Python
      22000 Updated Jan 30, 2026
    • Modelscan but in Inspect
      Python
      0201 Updated Jan 20, 2026
    • prime-rl

      Public
      Decentralized RL Training at Scale
      Python
      Apache License 2.0
      229000 Updated Jan 20, 2026
    • HTML
      Other
      41921 Updated Jan 19, 2026
    • HTML
      Other
      1912112 Updated Jan 19, 2026
    • Python
      0010 Updated Jan 7, 2026
    • Bridge for inspect <> verifiers.
      Python
      MIT License
      0000 Updated Jan 7, 2026
    • Build docker containers using docker build cloud without a docker daemon
      HCL
      MIT License
      0100 Updated Jan 2, 2026
    • Estimate the time horizon of AIs over time on various domains like knowledge and vision
      Python
      2500 Updated Dec 3, 2025
    • Software Engineering Agents for Inspect AI
      Python
      MIT License
      16100 Updated Nov 11, 2025
    • Python
      1000 Updated Nov 5, 2025
    • .github

      Public
      0000 Updated Nov 5, 2025