🔬 Autonomous Science Stack

A curated list of 150+ tools and libraries for building self-driving laboratories and autonomous research platforms.

From GPU compute to lab hardware, from multi-agent orchestration to experiment verification — everything you need to build systems that do science autonomously.

Maintained by Scivity Labs

Quick Links

Category	Description
🧪 Experiment Orchestration & Workflow	Schedulers, DAGs, and workflow engines that drive long-running scientific pipelines.
🤖 Multi-Agent Frameworks	Libraries for building coordinated agent systems that plan, act, and reason.
🔬 Self-Driving / Autonomous Labs	Reference projects and companies operating real autonomous laboratories.
⚙️ Hardware & Lab Automation	Drivers, protocols, and robotics frameworks that bridge code to physical instruments.
🖥️ GPU Compute Platforms	On-demand and reserved GPU infrastructure for training and inference.
📊 ML Experiment Tracking	Metrics, artifact, and hyperparameter tracking for ML workflows.
🔍 Verification & Reproducibility	Data versioning, validation, and environment-pinning tools.
📚 Scientific Knowledge Management	Literature APIs, reference managers, and citation graph tools.
📄 Scientific Data Extraction	Parsers for PDFs, tables, equations, and scientific figures.
🧠 LLM for Science	Models and agent systems purpose-built for scientific discovery.
🧬 Scientific Simulation & Modeling	Physics-based simulators used as in-silico filters in autonomous-science loops.
🎯 Bayesian Optimization & Active Learning	Decision core of closed-loop autonomous experimentation — picks the next experiment under a budget.
🔗 RL for Scientific Discovery	Reinforcement learning libraries applicable to scientific search problems.
🗄️ Vector Databases & Embeddings	Stores and models for semantic retrieval over scientific corpora.
📡 Data Pipeline & Event Systems	Message brokers and streaming systems for lab and compute events.
🛡️ Safety & Guardrails for Autonomous Systems	Policy enforcement, prompt-injection defense, and output validation.
📈 Monitoring & Observability	Metrics, logs, traces, and LLM-specific observability.

🧪 Experiment Orchestration & Workflow

Orchestrators schedule long-running, failure-prone scientific workflows — think multi-day simulations, instrument sweeps, and data ingestion pipelines. Good ones handle retries, artifact passing, and dynamic task graphs without forcing you into a rigid DSL. Pick one that matches your infra: Kubernetes-native, Python-native, or bioinformatics-specialized.

Library	Description	Link
Prefect	Python workflow orchestrator with dynamic task graphs, typed results, and work-pool-based deployment.	github.com/PrefectHQ/prefect
Apache Airflow	DAG-based scheduler with a large operator ecosystem, used widely for batch data pipelines.	github.com/apache/airflow
Dagster	Asset-oriented orchestrator with typed inputs and outputs, software-defined assets, and testable pipelines.	github.com/dagster-io/dagster
Nextflow	Dataflow workflow language widely used in bioinformatics, with first-class HPC and container support.	github.com/nextflow-io/nextflow
Snakemake	Python-like rules engine for reproducible scientific workflows, common in genomics.	github.com/snakemake/snakemake
Metaflow	Netflix's human-centric framework for data science workflows with versioning and AWS/K8s backends.	github.com/Netflix/metaflow
Argo Workflows	Kubernetes-native workflow engine where each step runs in its own container.	github.com/argoproj/argo-workflows
Luigi	Spotify's Python module for building complex batch pipelines with dependency resolution.	github.com/spotify/luigi
Kedro	Python framework for reproducible, modular data science pipelines with a catalog abstraction.	github.com/kedro-org/kedro
Flyte	Kubernetes-native orchestrator for typed, reproducible ML and data pipelines.	github.com/flyteorg/flyte
Temporal	Durable execution platform for long-running workflows with retries and versioning as primitives.	github.com/temporalio/temporal
Kubeflow Pipelines	ML workflow system for portable, scalable pipelines on Kubernetes.	github.com/kubeflow/pipelines
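
The retries, artifact passing, and task graphs mentioned at the top of this section can be illustrated without any of the tools listed. The sketch below uses only the Python standard library and is not the API of Prefect, Airflow, or any other orchestrator here; `with_retries`, `run_graph`, and the three task names are hypothetical.

```python
import time
from graphlib import TopologicalSorter  # Python 3.9+


def with_retries(fn, attempts=3, base_delay=0.0):
    """Call fn(); on failure wait base_delay * 2**i seconds and try again."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** i)


def run_graph(tasks, deps, attempts=3):
    """Run tasks in dependency order, handing upstream results downstream.

    tasks: name -> callable that takes a dict of upstream results
    deps:  name -> set of upstream task names (every task needs an entry)
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        upstream = {d: results[d] for d in deps.get(name, ())}
        results[name] = with_retries(lambda: tasks[name](upstream), attempts)
    return results


calls = {"simulate": 0}


def simulate(_upstream):
    calls["simulate"] += 1
    if calls["simulate"] < 2:
        raise RuntimeError("transient failure")  # first attempt fails on purpose
    return [1.0, 2.0, 3.0]


tasks = {
    "simulate": simulate,
    "analyze": lambda up: sum(up["simulate"]) / len(up["simulate"]),
    "report": lambda up: f"mean={up['analyze']:.1f}",
}
deps = {"simulate": set(), "analyze": {"simulate"}, "report": {"analyze"}}

results = run_graph(tasks, deps)
```

Real orchestrators add what this sketch leaves out: persistence of results across process restarts, scheduling, and distributed workers.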

🤖 Multi-Agent Frameworks

Multi-agent frameworks coordinate LLM-driven workers that plan, call tools, and hand off tasks. In an autonomous-science context they sit between the orchestrator and the lab, turning a research goal into concrete experiments. The tradeoff is control vs. flexibility — graph-based frameworks are easier to debug, role-based ones move faster.

Library	Description	Link
LangGraph	Graph-based agent runtime from LangChain with checkpoints, streaming, and human-in-the-loop primitives.	github.com/langchain-ai/langgraph
CrewAI	Role-based multi-agent framework that assembles specialized agents into crews with shared tasks.	github.com/crewAIInc/crewAI
AutoGen	Microsoft's framework for multi-agent conversations with pluggable model backends and code execution.	github.com/microsoft/autogen
Swarms	Multi-agent orchestration framework with hierarchical, concurrent, and mixture-of-agents structures.	github.com/kyegomez/swarms
CAMEL	Research-oriented multi-agent framework focused on role-playing agents and agent-scaling studies.	github.com/camel-ai/camel
Agno	Stateless Python agent runtime with memory, knowledge, and a focus on low-latency instantiation.	github.com/agno-agi/agno
smolagents	Hugging Face's minimal agent library centered on code-writing agents.	github.com/huggingface/smolagents
PydanticAI	Typed agent framework that uses Pydantic schemas for inputs, outputs, and tool calls.	github.com/pydantic/pydantic-ai
OpenAI Agents SDK	OpenAI's Python SDK for building agents with handoffs, guardrails, and tracing.	github.com/openai/openai-agents-python
Claude Agent SDK	Anthropic's SDK for building agents on top of Claude with hooks, subagents, and custom tools.	github.com/anthropics/claude-agent-sdk-python
BeeAI Framework	IBM-backed framework for building agents in Python and TypeScript with MCP and A2A support.	github.com/i-am-bee/beeai-framework
Atomic Agents	Lightweight framework built around typed input/output schemas and composable atomic agents.	github.com/BrainBlend-AI/atomic-agents
Langroid	Multi-agent Python framework with first-class tools, vector stores, and message-provenance logging.	github.com/langroid/langroid
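
One reason graph-based runtimes tend to be easier to debug is that every hop between nodes is explicit and can be recorded. The following is a minimal sketch of that idea in plain Python, with stub functions standing in for LLM calls; it is not LangGraph's API, and the node names and `run` helper are hypothetical.

```python
def plan(state):
    # A real planner would call an LLM; this stub writes two fixed steps.
    state["steps"] = [f"measure {state['goal']} at {t} C" for t in (20, 40)]
    return "execute"  # name of the next node


def execute(state):
    state["results"] = [f"done: {s}" for s in state["steps"]]
    return "review"


def review(state):
    state["approved"] = len(state["results"]) == len(state["steps"])
    return None  # None ends the run


NODES = {"plan": plan, "execute": execute, "review": review}


def run(state, start="plan", max_hops=10):
    """Walk the graph, recording each visited node in a trace."""
    trace, node = [], start
    while node is not None and len(trace) < max_hops:
        trace.append(node)
        node = NODES[node](state)
    return state, trace


state, trace = run({"goal": "viscosity"})
```

The `trace` list is the debugging payoff: it shows exactly which nodes ran and in what order, and `max_hops` bounds runaway loops.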

🔬 Self-Driving / Autonomous Labs

These are operational self-driving laboratories and companies — not libraries, but reference points for what a working autonomous lab looks like. Some publish papers, some ship APIs, a few run cloud labs that anyone can submit experiments to. Study their architectures when designing your own.

Project	Description	Link
A-Lab (LBNL / Berkeley)	Autonomous lab for inorganic solid-state synthesis combining robotics, ML, and literature data from the Ceder group.	ceder.berkeley.edu
Emerald Cloud Lab	Commercial cloud lab in Austin with 200+ instruments controlled remotely via a Wolfram-based command language.	emeraldcloudlab.com
Strateos	Automation-as-a-service cloud lab for drug discovery with programmatic APIs and hybrid on-prem deployments.	strateos.com
Chemify	Glasgow-based Chemputation facility that compiles digital code into physical organic-chemistry syntheses.	chemify.io
Arctoris	Oxford-based fully automated drug discovery platform running assay cascades on the Ulysses robotic system.	arctoris.com
LabGenius	London-based closed-loop platform for therapeutic antibody discovery built around the EVA robot.	labgeniustx.com
Kebotix	Boston-area self-driving lab for materials discovery combining generative models with robotic synthesis.	kebotix.com
Atinary	Bayesian-optimization and SDLabs software for closed-loop R&D, integrable with external automation.	atinary.com
PNNL Autonomous Science	DOE program applying autonomous-lab methods across chemistry, biology, and energy storage at PNNL.	pnnl.gov/autonomous-science
Lila Sciences	Flagship Pioneering company building closed-loop AI Science Factories across life, chemical, and materials science.	lila.ai
Argonne Polybot	Autonomous robotic platform for electronic-polymer discovery at Argonne's Center for Nanoscale Materials.	anl.gov polybot

⚙️ Hardware & Lab Automation

Everything physical bolts onto this layer — pipetting robots, plate readers, mass specs, flow reactors. The libraries here speak protocols (OPC UA, MQTT, VISA, SiLA2) or wrap vendor SDKs. Expect to write glue code; there is no universal driver.

Library	Description	Link
opcua-asyncio	Asyncio-based OPC UA client and server for Python, common in industrial and lab PLC integration.	github.com/FreeOpcUa/opcua-asyncio
PyVISA	Python bindings to the VISA standard for controlling test and measurement instruments over GPIB, USB, or serial.	github.com/pyvisa/pyvisa
paho-mqtt (Python)	Eclipse Paho MQTT client for Python, used for lightweight pub/sub between lab devices and controllers.	github.com/eclipse/paho.mqtt.python
Eclipse Mosquitto	Open-source MQTT broker often deployed as the message bus inside automated labs.	github.com/eclipse/mosquitto
LabVIEW	National Instruments' graphical programming environment for instrument control and DAQ.	ni.com/labview
Opentrons	Python API and firmware for the OT-2 and Flex liquid-handling robots.	github.com/Opentrons/opentrons
PyLabRobot	Hardware-agnostic Python SDK for liquid handlers, plate readers, heater-shakers, and scales.	github.com/PyLabRobot/pylabrobot
SiLA 2 Python	Python implementation of the SiLA 2 gRPC-based lab instrument interoperability standard.	gitlab.com/SiLA2/sila_python
Labman Automation	Commercial systems integrator that built Berkeley's A-Lab synthesis robot and similar custom platforms.	labmanautomation.com
ROS 2	Robot Operating System with DDS-based middleware for real-time control of robotic cells and mobile platforms.	github.com/ros2
Bluesky	Python experiment-control framework for synchrotrons and scattering beamlines, used across NSLS-II.	github.com/bluesky/bluesky
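
Because there is no universal driver, glue code usually ends up as a thin adapter that turns a device's text replies into typed values. A minimal sketch, assuming a text-command instrument: `*IDN?` is the standard IEEE 488.2 identification query, while `MEAS:TEMP?`, the reply strings, and all class and function names are hypothetical and do not belong to any library listed above.

```python
from typing import Protocol


class Instrument(Protocol):
    """Anything that answers a text command with a text reply."""

    def query(self, command: str) -> str: ...


class SimulatedThermometer:
    """Stand-in for a real device; replies to two text commands."""

    def __init__(self, temperature_c: float) -> None:
        self._t = temperature_c

    def query(self, command: str) -> str:
        if command == "*IDN?":
            return "SIM,THERMO-1,0,0.1"  # made-up identification string
        if command == "MEAS:TEMP?":  # hypothetical command name
            return f"{self._t:.2f}"
        raise ValueError(f"unsupported command: {command}")


def read_temperature(dev: Instrument) -> float:
    """Glue code: convert the device's text reply into a float."""
    return float(dev.query("MEAS:TEMP?"))


reading = read_temperature(SimulatedThermometer(37.0))
```

Coding against the small `Instrument` interface lets the same `read_temperature` run against a simulator in tests and a real transport in the lab.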

🖥️ GPU Compute Platforms

On-demand GPU platforms spare you from standing up Kubernetes just to run a fine-tune or a large batch inference job. They differ on cold-start latency, GPU availability (H100, B200, MI300), and whether they expose raw VMs or serverless functions. For autonomous-science pipelines, the serverless offerings are usually the easiest to call from a workflow engine.

Platform	Description	Link
RunPod	Community and secure GPU cloud with on-demand pods and serverless endpoints that scale to zero.	runpod.io
Modal	Serverless Python platform that provisions GPU containers from function decorators, with advertised sub-second container starts.	modal.com
Lambda	GPU cloud focused on H100 and Blackwell clusters with pre-installed ML frameworks and bare metal.	lambda.ai
Vast.ai	Marketplace for renting community GPUs at spot-style prices.	vast.ai
CoreWeave	Hyperscale NVIDIA-specialized cloud offering H100/B200 clusters and high-throughput networking.	coreweave.com
Paperspace (DigitalOcean)	GPU notebooks and droplets folded into DigitalOcean's AI platform.	paperspace.com
Lightning AI	Studio-based cloud workspaces on H100/H200 with persistent environments and multi-GPU training.	lightning.ai
Anyscale	Managed Ray platform for distributed training, batch inference, and serving on GPU clusters.	anyscale.com
Together AI	GPU cloud and inference API focused on open models, fine-tuning, and low-latency serving.	together.ai
Replicate	Hosted inference for open-source models with a simple HTTP API and per-second billing.	replicate.com

📊 ML Experiment Tracking

Tracking tools log metrics, parameters, artifacts, and code versions so experiments are reproducible and comparable. For autonomous science, the key feature is headless logging from long-running agents, not just from interactive notebooks. Many of these integrate with the orchestrators listed above.

Library	Description	Link
MLflow	Open-source tracking, model registry, and deployment framework with a large integration ecosystem.	github.com/mlflow/mlflow
Weights & Biases	Hosted tracking, artifact, and sweep service with deep SDK integration across frameworks.	github.com/wandb/wandb
Neptune.ai	Tracker built for foundation-model training with per-layer metrics and long-run logging.	github.com/neptune-ai/neptune-client
Comet	Experiment tracking and LLM evaluation platform with open-source Opik for LLM observability.	comet.com
Aim	Self-hosted open-source tracker with a fast UI for comparing thousands of runs.	github.com/aimhubio/aim
ClearML	End-to-end platform spanning experiment tracking, data management, agents, and serving.	github.com/clearml/clearml
Sacred	Lightweight Python library for configuring, organizing, and logging reproducible experiments.	github.com/IDSIA/sacred
DVC	Git-based data and model versioning with experiment tracking and pipeline DAGs.	github.com/iterative/dvc
TensorBoard	TensorFlow's local metric and graph visualization tool, also used with PyTorch and JAX.	github.com/tensorflow/tensorboard
Optuna	Hyperparameter optimization framework with pruning, distributed trials, and tracker integrations.	github.com/optuna/optuna
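
Headless logging, in its simplest form, is an append-only file that an unattended agent writes to and a person reads later. The sketch below shows that shape with standard-library Python only; `RunLogger` and `read_metric` are hypothetical names, not the MLflow or Weights & Biases API.

```python
import json
import os
import tempfile
import time


class RunLogger:
    """Append-only JSONL logger; each event is written and closed at once."""

    def __init__(self, path, params):
        self.path = path
        self._write({"event": "params", "params": params})

    def _write(self, record):
        record["ts"] = time.time()
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def log_metric(self, name, value, step):
        self._write({"event": "metric", "name": name, "value": value, "step": step})


def read_metric(path, name):
    """Return [(step, value), ...] for one metric, in logged order."""
    out = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec.get("event") == "metric" and rec["name"] == name:
                out.append((rec["step"], rec["value"]))
    return out


log_path = os.path.join(tempfile.mkdtemp(), "run.jsonl")
logger = RunLogger(log_path, {"lr": 0.01})
for step, loss in enumerate([0.9, 0.5, 0.3]):  # illustrative values
    logger.log_metric("loss", loss, step)
history = read_metric(log_path, "loss")
```

Opening and closing the file per event is slow but means a crashed agent loses at most the event it was writing.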

🔍 Verification & Reproducibility

Autonomous systems generate more data, configurations, and artifacts than humans can audit. These tools version datasets, validate schemas, check distributions, and pin environments so a run from six months ago still reproduces. Pair a data-versioning tool with a validation framework; each alone is half the story.

Library	Description	Link
Pachyderm	Kubernetes-native pipelines with immutable data versioning and lineage across pipeline stages.	github.com/pachyderm/pachyderm
lakeFS	Git-like branching and time travel over object stores for data-lake version control.	github.com/treeverse/lakeFS
Great Expectations	Declarative data-quality framework with expectations, profiling, and automated docs.	github.com/great-expectations/great_expectations
Evidently AI	Library for data and ML model validation, drift detection, and monitoring reports.	github.com/evidentlyai/evidently
Deepchecks	Open-source testing for data and ML models covering integrity, drift, and performance.	github.com/deepchecks/deepchecks
pytest	Python testing framework with fixtures, parametrization, and a plugin ecosystem.	github.com/pytest-dev/pytest
Hypothesis	Property-based testing for Python that auto-generates edge-case inputs.	github.com/HypothesisWorks/hypothesis
Nix	Purely functional package manager for reproducible, declarative environments.	github.com/NixOS/nix
GNU Guix	Reproducible, transactional package manager with a focus on scientific workflows.	guix.gnu.org
Pixi	Rust-based cross-platform package manager built on the conda ecosystem with workspace lockfiles.	github.com/prefix-dev/pixi
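
Pairing versioning with validation can be sketched in a few lines: fingerprint the data so a change is detectable, and check each row against a schema before using it. This is a standard-library illustration only, not the API of DVC, lakeFS, or Great Expectations; the schema, column names, and range rule are hypothetical.

```python
import hashlib
import json

# Hypothetical schema for a small results table.
SCHEMA = {"sample_id": str, "temperature_c": float, "yield_pct": float}


def fingerprint(rows):
    """SHA-256 over a canonical JSON dump; changes if any value changes."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def validate(rows, schema=SCHEMA):
    """Return a list of readable problems; an empty list means valid."""
    problems = []
    for i, row in enumerate(rows):
        for key, typ in schema.items():
            if key not in row:
                problems.append(f"row {i}: missing {key}")
            elif not isinstance(row[key], typ):
                problems.append(f"row {i}: {key} is not {typ.__name__}")
        y = row.get("yield_pct")
        if isinstance(y, float) and not 0.0 <= y <= 100.0:
            problems.append(f"row {i}: yield_pct out of range")
    return problems


rows = [
    {"sample_id": "A1", "temperature_c": 25.0, "yield_pct": 81.5},
    {"sample_id": "A2", "temperature_c": 30.0, "yield_pct": 140.0},  # invalid on purpose
]
problems = validate(rows)
version = fingerprint(rows)
```

Storing `version` next to each run's outputs ties a result to the exact data it was computed from.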

📚 Scientific Knowledge Management

Before running an experiment, an autonomous agent needs to know what's already been tried. This category covers literature APIs, citation graphs, and reference managers agents can query programmatically. Coverage, freshness, and rate limits vary — most real systems combine two or three sources.

Tool	Description	Link
Semantic Scholar API	Free API from the Allen Institute for AI (Ai2) over hundreds of millions of papers, with paper search, citations, and recommendations.	api.semanticscholar.org
arXiv API	Official query API over arXiv's preprint corpus, returning Atom/XML metadata.	info.arxiv.org/help/api
OpenAlex	Open catalog of the global research system with a free, no-auth REST API over works, authors, and venues.	openalex.org
Zotero	Open-source reference manager with a web API, browser connectors, and a Python client.	github.com/zotero/zotero
Paperpile	Cloud-based reference manager for Google Docs and the web with an API for library access.	paperpile.com
Connected Papers	Graph-based visual explorer that surfaces similar papers via co-citation and bibliographic coupling.	connectedpapers.com
Elicit	AI research assistant with paper search, data extraction, and systematic-review workflows.	elicit.com
Consensus	Evidence-focused search engine over peer-reviewed literature with a Consensus Meter for agreement.	consensus.app
ScholarAI	Research assistant over 200M+ papers and patents with citation generation and Zotero sync.	scholarai.io
Scite	Smart Citations platform classifying citations as supporting, contrasting, or mentioning.	scite.ai
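
As an example of querying one of these sources programmatically, the sketch below builds an arXiv API query URL and parses an Atom response with the standard library. The endpoint and the `search_query`, `start`, and `max_results` parameters are the documented arXiv ones; the helper names are hypothetical, and the embedded feed is a placeholder rather than a real paper, so the example runs without network access.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_ENDPOINT = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(terms, max_results=5):
    """Build a full-text search URL for the arXiv API."""
    params = {"search_query": f"all:{terms}", "start": 0, "max_results": max_results}
    return ARXIV_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_feed(xml_text):
    """Extract id and whitespace-normalized title from each Atom entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM),
            "title": " ".join(title.split()),
        })
    return entries


# Placeholder feed in the Atom shape; not a real arXiv record.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000</id>
    <title>Placeholder
      title</title>
  </entry>
</feed>"""

url = build_query_url("self-driving laboratory")
papers = parse_feed(SAMPLE_FEED)
```

To fetch live results, pass `url` to `urllib.request.urlopen` and feed the decoded body to `parse_feed`, keeping request rates within arXiv's published usage guidelines.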

📄 Scientific Data Extraction

Most scientific knowledge is trapped in PDFs with tables, equations, and figures. These parsers extract structured content — some use layout models, some use LLMs, some are classic rule-based engines tuned for scientific papers. For high-throughput ingestion, benchmark on your actual corpus before committing.

Library	Description	Link
Docling	IBM/LF AI open-source document parser with unified DocTags output for PDFs, slides, and images.	github.com/docling-project/docling
PyMuPDF	Python bindings to MuPDF for PDF parsing, rendering, and text and image extraction.	github.com/pymupdf/PyMuPDF
Crawl4AI	LLM-friendly web crawler that outputs clean markdown and structured data for ingestion.	github.com/unclecode/crawl4ai
GROBID	Java machine-learning library that structures scholarly PDFs into TEI XML with high accuracy on references.	github.com/kermitt2/grobid
Marker	Fast PDF and office-doc to Markdown/JSON converter with optional LLM-assisted refinement.	github.com/datalab-to/marker
Nougat	Meta's vision transformer for converting scientific PDFs to Markdown with equation support.	github.com/facebookresearch/nougat
tabula-py	Maintained Python wrapper around Tabula for extracting tables from text-based PDFs.	github.com/chezou/tabula-py
Camelot	Python library focused on extracting tables from PDFs with stream and lattice parsers.	github.com/camelot-dev/camelot
Unstructured	ETL library for partitioning, cleaning, and chunking 25+ document types for LLM pipelines.	github.com/Unstructured-IO/unstructured
pdfplumber	Pure-Python PDF parser with fine-grained access to characters, tables, and layout metadata.	github.com/jsvine/pdfplumber

🧠 LLM for Science

These are models and agent systems specifically designed for scientific reasoning, hypothesis generation, or domain-specific tasks (chemistry, biology, medicine). Some are open weights, some are research prototypes with released code, some are closed APIs. Expect the frontier to shift every few months.

Model / System	Description	Link
Sakana AI Scientist v2	End-to-end agentic system that generates ideas, runs experiments, and drafts papers via agentic tree search.	github.com/SakanaAI/AI-Scientist-v2
FunSearch	DeepMind method pairing LLM program search with an evaluator, used to find new cap-set and bin-packing solutions.	github.com/google-deepmind/funsearch
ChemCrow	LangChain-based chemistry agent with 18 tool integrations for synthesis planning and molecule property lookup.	github.com/ur-whitelab/chemcrow-public
Coscientist	GPT-4-driven autonomous chemistry agent from the Gomes group, demonstrated on palladium cross-couplings.	github.com/gomesgroup/coscientist
DARWIN	Open 7B foundation model fine-tuned on physics, chemistry, and materials-science literature.	github.com/MasterAI-EAM/Darwin
Galactica	Meta's 120B scientific language model (2022). The public demo was withdrawn days after release over hallucination concerns; kept here as a historical reference, with weights on Hugging Face.	github.com/paperswithcode/galai
SciBERT	BERT variant trained on 1.14M scientific papers from Semantic Scholar, still used as a scientific-NLP baseline.	github.com/allenai/scibert
BioGPT	Microsoft biomedical GPT pretrained on PubMed abstracts, available via Hugging Face Transformers.	github.com/microsoft/BioGPT
Med-PaLM	Google Research family of medical LLMs evaluated on USMLE-style and clinical-reasoning benchmarks.	research.google med-palm
ESM3	EvolutionaryScale's multimodal protein model for joint sequence, structure, and function generation.	github.com/evolutionaryscale/esm

🧬 Scientific Simulation & Modeling

LLM-driven science agents reach wet labs faster when they pre-screen candidates in silico. The libraries here solve physics-based problems — molecular dynamics, electronic structure, finite-element PDEs, reaction kinetics — cheaply enough to rank thousands of hypotheses before committing reagent or instrument time in an autonomous loop.

Library	Description	Link
LAMMPS	Classical molecular dynamics engine scaling from single workstations to GPU supercomputers.	github.com/lammps/lammps
GROMACS	Biomolecular MD package optimized for proteins, lipids, and drug-binding free-energy work.	github.com/gromacs/gromacs
PySCF	Python quantum chemistry package covering Hartree-Fock, DFT, MP2, and coupled cluster.	github.com/pyscf/pyscf
Psi4	Open-source quantum chemistry program with a Python API for electronic-structure methods.	github.com/psi4/psi4
ASE	Python toolkit wrapping 30+ DFT and MD calculators behind a uniform Atoms object.	gitlab.com/ase/ase
RDKit	Cheminformatics toolkit for molecular descriptors, substructure search, and SMARTS filters.	github.com/rdkit/rdkit
OpenMM	GPU-accelerated biomolecular MD engine with a first-class Python scripting API.	github.com/openmm/openmm
FEniCS / DOLFINx	Python FEM framework for solving PDEs via variational forms in UFL.	github.com/FEniCS/dolfinx
Cantera	Chemical kinetics, thermodynamics, and transport library for reactors and combustion.	github.com/Cantera/cantera
deal.II	C++ finite element library with Python bindings for adaptive-mesh PDE solvers.	github.com/dealii/dealii
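
The pre-screening idea from the top of this section reduces to a two-stage funnel: rank everything with a cheap score, then spend the expensive evaluation only on a shortlist. A minimal sketch with toy scoring functions; in practice the cheap stage might be a descriptor filter and the expensive stage a simulation, but the functions and numbers below are hypothetical.

```python
def screen(candidates, cheap_score, expensive_score, keep=2):
    """Rank by a cheap score, then evaluate only the top `keep` expensively."""
    shortlist = sorted(candidates, key=cheap_score, reverse=True)[:keep]
    scored = [(c, expensive_score(c)) for c in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


expensive_calls = []  # records which candidates reached the costly stage


def cheap(x):
    return -abs(x - 2.5)  # toy proxy, higher is better


def expensive(x):
    expensive_calls.append(x)
    return -(x - 2.2) ** 2  # toy stand-in for a costly simulation


ranked = screen([1.0, 2.0, 3.0, 4.0], cheap, expensive, keep=2)
```

Only two of the four candidates reach `expensive`, which is the whole point; the risk is that a poor cheap score discards a candidate the expensive one would have ranked first.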

🎯 Bayesian Optimization & Active Learning

BO/AL is the decision core of closed-loop autonomous experimentation — given what has been observed so far, pick the next experiment to run under a limited budget. The libraries split along two lines: Gaussian-process-based frameworks (BoTorch, GPyTorch, Emukit, Trieste) that shine on smooth, low-dimensional design spaces, and algorithm-agnostic optimizers (Nevergrad, SMAC3, Hyperopt, Vizier) that scale to high-dimensional or conditional search spaces.

Library	Description	Link
BoTorch	PyTorch library for Bayesian optimization with Monte Carlo acquisition functions.	github.com/pytorch/botorch
Ax	Meta platform built on BoTorch for adaptive experimentation and closed-loop tuning.	github.com/facebook/Ax
GPyTorch	Scalable Gaussian processes in PyTorch; the GP backbone underneath BoTorch.	github.com/cornellius-gp/gpytorch
Emukit	Decision-making toolbox covering BO, experimental design, and sensitivity analysis.	github.com/EmuKit/emukit
Hyperopt	Distributed hyperparameter optimization using Tree-structured Parzen Estimators.	github.com/hyperopt/hyperopt
SMAC3	Sequential model-based algorithm configuration for AutoML and black-box tuning.	github.com/automl/SMAC3
Nevergrad	Gradient-free optimization platform with evolutionary and population-based algorithms.	github.com/facebookresearch/nevergrad
Trieste	TensorFlow/GPflow-based BO library with trust-region and multi-fidelity strategies.	github.com/secondmind-labs/trieste
Vizier	Open-source release of Google's internal black-box optimization service.	github.com/google/vizier
scikit-activeml	Active learning library with pool-based query strategies on scikit-learn models.	github.com/scikit-activeml/scikit-activeml
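
The "pick the next experiment under a budget" loop can be shown with a much simpler stand-in than a Gaussian process. The sketch below uses Thompson sampling over a handful of discrete candidates with Beta posteriors on a success/failure outcome. It is not BoTorch or Ax code, and the candidate names, hidden success rates, and function names are all hypothetical.

```python
import random


def thompson_loop(run_experiment, candidates, budget, seed=0):
    """Choose `budget` experiments; return candidate -> [successes, failures]."""
    rng = random.Random(seed)
    counts = {c: [0, 0] for c in candidates}
    for _ in range(budget):
        # Draw one plausible success rate per candidate from Beta(1+s, 1+f),
        # then run the candidate whose draw is highest.
        pick = max(
            candidates,
            key=lambda c: rng.betavariate(1 + counts[c][0], 1 + counts[c][1]),
        )
        success = run_experiment(pick, rng)
        counts[pick][0 if success else 1] += 1
    return counts


# Hidden ground truth for the simulation; a real lab would not know this.
TRUE_RATES = {"A": 0.2, "B": 0.5, "C": 0.8}


def simulated_experiment(candidate, rng):
    return rng.random() < TRUE_RATES[candidate]


counts = thompson_loop(simulated_experiment, list(TRUE_RATES), budget=60)
total_runs = sum(s + f for s, f in counts.values())
```

The structure is the same as in GP-based Bayesian optimization: maintain a posterior, use it to choose the next experiment, observe, update. What changes is the surrogate model and the acquisition rule.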

🔗 RL for Scientific Discovery

Reinforcement learning fits scientific problems where the agent must choose sequential experiments under uncertainty — active learning, Bayesian optimization loops, molecule design, reaction planning. The libraries below are general-purpose RL frameworks used as substrates in science RL work.

Library	Description	Link
Gymnasium	Maintained fork of OpenAI Gym with the standard RL environment API and a large environment registry.	github.com/Farama-Foundation/Gymnasium
Stable-Baselines3	PyTorch implementations of standard RL algorithms with a consistent, production-friendly API.	github.com/DLR-RM/stable-baselines3
CleanRL	Single-file PyTorch RL implementations used widely for reproducible benchmarks and teaching.	github.com/vwxyzjn/cleanrl
TRL	Hugging Face library for RLHF, DPO, PPO, and GRPO fine-tuning of language models.	github.com/huggingface/trl
Ray RLlib	Distributed RL library inside Ray with a large algorithm zoo and offline-RL support.	github.com/ray-project/ray
Tianshou	PyTorch RL library with modular agents, on-policy and off-policy algorithms, and offline RL.	github.com/thu-ml/tianshou
Sample Factory	High-throughput async PPO library optimized for single-machine, multi-GPU training.	github.com/alex-petrenko/sample-factory
PettingZoo	Multi-agent environment API and environment zoo from the Farama Foundation.	github.com/Farama-Foundation/PettingZoo
PFRL	PyTorch-based deep RL library from Preferred Networks with a suite of modern algorithms.	github.com/pfnet/pfrl

🗄️ Vector Databases & Embeddings

Retrieval-augmented agents need a vector store for papers, protocols, lab notes, and prior results. Pick based on scale and operational model — some run embedded, some are managed services, some are Postgres extensions. For embeddings, science-tuned models often beat general-purpose ones on domain corpora.

Tool	Description	Link
Qdrant	Rust-based vector database with filtering, quantization, and a managed cloud option.	github.com/qdrant/qdrant
Weaviate	Vector database with built-in hybrid search, modules for embeddings, and GraphQL API.	github.com/weaviate/weaviate
Milvus	High-scale open-source vector database with GPU-accelerated index types.	github.com/milvus-io/milvus
Chroma	Embedded and server vector database focused on developer ergonomics for RAG apps.	github.com/chroma-core/chroma
Pinecone	Managed serverless vector database with hybrid search and metadata filtering.	pinecone.io
FAISS	Meta's library for efficient similarity search and clustering of dense vectors.	github.com/facebookresearch/faiss
LanceDB	Serverless vector database built on the Lance columnar format for multimodal search.	github.com/lancedb/lancedb
pgvector	PostgreSQL extension that adds vector types, HNSW and IVFFlat indexes, and exact or approximate nearest-neighbor search.	github.com/pgvector/pgvector
sentence-transformers	Python library for dense sentence and passage embeddings on top of Hugging Face models.	github.com/UKPLab/sentence-transformers
BAAI bge-m3	Multi-lingual multi-function embedding model that handles dense, sparse, and ColBERT-style retrieval.	huggingface.co/BAAI/bge-m3
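
Underneath every store in this table is the same operation: score a query vector against stored vectors and return the closest. A brute-force sketch in plain Python, assuming cosine similarity; the three-dimensional vectors are toy values standing in for real embeddings, and the document IDs are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """index: doc_id -> vector. Returns [(doc_id, score), ...], best first."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors, not real embeddings.
index = {
    "protocol-pcr": [0.9, 0.1, 0.0],
    "paper-perovskite": [0.0, 0.2, 0.9],
    "note-pcr-failure": [0.8, 0.3, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], index)
```

This exact scan is linear in the number of documents. The databases above exist because approximate indexes such as HNSW trade a little recall for much lower latency at scale.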

📡 Data Pipeline & Event Systems

Lab instruments, compute jobs, and agents emit events that need routing, buffering, and durable replay. These brokers and streaming systems form the nervous system of a distributed autonomous lab. Choose on durability guarantees and operational fit, not raw throughput.

Tool	Description	Link
NATS	High-performance cloud-native messaging system with JetStream for durable streams and KV.	github.com/nats-io/nats-server
Apache Kafka	Distributed log-based event streaming platform with an enormous connector ecosystem.	github.com/apache/kafka
RabbitMQ	AMQP-centric broker for traditional messaging patterns like work queues and routing.	github.com/rabbitmq/rabbitmq-server
Redis Streams	Append-only log data structure built into Redis with consumer groups.	redis.io streams
Apache Pulsar	Pub/sub and queue system with tiered storage and geo-replication built in.	github.com/apache/pulsar
ZeroMQ	Embeddable messaging library for low-latency socket-level patterns.	github.com/zeromq/libzmq
Celery	Python distributed task queue with a broker-based backend, common in lab automation backends.	github.com/celery/celery
Apache Flink	Stateful stream-processing engine with exactly-once semantics and event-time windows.	github.com/apache/flink
Apache Beam	Unified programming model for batch and streaming pipelines across multiple runners.	github.com/apache/beam
Redpanda	Kafka-API-compatible streaming platform written in C++ with no ZooKeeper dependency.	github.com/redpanda-data/redpanda
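
Replay in log-based brokers comes from two pieces: an append-only sequence of events and a per-consumer offset that only advances on acknowledgement. The sketch below models that in memory, so it is not durable; a real broker persists the log to disk. `EventLog`, its methods, and the event names are hypothetical and are not the Kafka or NATS client API.

```python
class EventLog:
    """In-memory append-only log with per-consumer offsets."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> next offset to read

    def publish(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def poll(self, consumer, max_events=10):
        """Return (offset, event) pairs not yet acknowledged by `consumer`."""
        start = self._offsets.get(consumer, 0)
        return list(enumerate(self._events[start:start + max_events], start))

    def ack(self, consumer, offset):
        """Mark everything up to and including `offset` as processed."""
        self._offsets[consumer] = max(self._offsets.get(consumer, 0), offset + 1)


log = EventLog()
for name in ["plate_loaded", "read_started", "read_done"]:
    log.publish(name)

first = log.poll("analysis", max_events=2)
log.ack("analysis", 0)           # only the first event was fully processed
replayed = log.poll("analysis")  # the unacknowledged event is delivered again
```

Because `poll` never advances the offset, a consumer that crashes before `ack` sees the same events on restart, which gives at-least-once delivery and means handlers should be idempotent.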

🛡️ Safety & Guardrails for Autonomous Systems

When agents can call tools that move real money, compounds, or lab hardware, you need input filters, output validators, and policy engines. This category ranges from prompt-injection detectors to structured-output validators to full red-teaming toolkits. Layer several — no single tool covers all failure modes.

Tool	Description	Link
Guardrails AI	Python framework for structured output validation and input/output guards via reusable validators.	github.com/guardrails-ai/guardrails
NeMo Guardrails	NVIDIA toolkit for adding programmable rails to LLM apps via the Colang dialog language.	github.com/NVIDIA/NeMo-Guardrails
LLM Guard	Protect AI's prompt and output scanner covering toxicity, PII, secrets, and prompt injection.	github.com/protectai/llm-guard
NVIDIA Garak	LLM vulnerability scanner that probes for jailbreaks, hallucination, and prompt injection.	github.com/NVIDIA/garak
DeepTeam	LLM red-teaming framework with 40+ vulnerability scanners and OWASP-aligned test suites.	github.com/confident-ai/deepteam
Lakera Guard	Commercial real-time API for detecting prompt injection, jailbreaks, and data exfiltration.	lakera.ai
Invariant	Rule-based contextual guardrails for LLM and MCP tool-calling deployed as a proxy.	github.com/invariantlabs-ai/invariant
LlamaFirewall	Meta's agent guardrail framework combining PromptGuard 2, alignment checks, and CodeShield.	github.com/meta-llama/PurpleLlama
PyRIT	Microsoft's Python risk-identification toolkit for automated adversarial testing of generative AI.	github.com/microsoft/PyRIT
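
One layer that none of the LLM-focused tools replaces is a plain, deterministic policy check that sits between the agent and the hardware. A minimal sketch, assuming tool calls arrive as a name plus a dict of arguments; the tool names, volume limit, and deny list are hypothetical examples, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


ALLOWED_TOOLS = {"dispense", "read_plate"}  # hypothetical tool names
MAX_DISPENSE_UL = 200.0                     # hypothetical per-call limit
BLOCKED_REAGENTS = {"hydrofluoric acid"}    # hypothetical deny list


def check(call):
    """Return a list of policy violations; empty means the call may run."""
    violations = []
    if call.tool not in ALLOWED_TOOLS:
        violations.append(f"tool not allowed: {call.tool}")
    if call.tool == "dispense":
        volume = call.args.get("volume_ul")
        if not isinstance(volume, (int, float)) or not 0 < volume <= MAX_DISPENSE_UL:
            violations.append("volume_ul missing or outside limit")
        if str(call.args.get("reagent", "")).lower() in BLOCKED_REAGENTS:
            violations.append("reagent is on the deny list")
    return violations


ok = check(ToolCall("dispense", {"volume_ul": 50, "reagent": "water"}))
bad = check(ToolCall("dispense", {"volume_ul": 5000, "reagent": "Hydrofluoric Acid"}))
unknown = check(ToolCall("open_door", {}))
```

Returning every violation, rather than stopping at the first, gives the agent or a human reviewer the full picture in one pass. The executor should refuse any call for which `check` returns a non-empty list.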

📈 Monitoring & Observability

Beyond traditional infra monitoring, autonomous-science systems need traces across agent steps, tool calls, and LLM responses. The first five entries below cover generic infra observability; the last five are LLM-specific. You want both: one tells you the cluster is healthy, the other tells you the agent is sane.

Tool	Description	Link
Prometheus	Pull-based time-series monitoring system with a powerful query language and alerting.	github.com/prometheus/prometheus
Grafana	Open-source dashboarding tool that plots metrics, logs, and traces across datasources.	github.com/grafana/grafana
Loki	Horizontally scalable log aggregation system with a label-based query model similar to Prometheus.	github.com/grafana/loki
OpenTelemetry	CNCF standard for traces, metrics, and logs with SDKs and collectors across languages.	github.com/open-telemetry
Jaeger	CNCF distributed tracing platform for investigating latency in microservice architectures.	github.com/jaegertracing/jaeger
Helicone	Open-source LLM observability and AI gateway with tracing, costs, and prompt management.	github.com/Helicone/helicone
LangSmith	LangChain's hosted platform for agent tracing, evaluation, and prompt management.	langchain.com/langsmith
Arize Phoenix	OpenTelemetry-based AI observability with tracing, eval, datasets, and experiments.	github.com/Arize-ai/phoenix
Opik	Comet's open-source LLM observability tool with tracing, automated evaluations, and dashboards.	github.com/comet-ml/opik
Langfuse	Self-hostable LLM engineering platform with tracing, prompt management, and datasets.	github.com/langfuse/langfuse
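
A trace across agent steps is a tree of timed spans, each pointing at its parent. The sketch below records that structure with a context manager and standard-library Python. It is single-threaded and is not the OpenTelemetry SDK; `span`, `SPANS`, and the attribute values are hypothetical.

```python
import contextlib
import itertools
import time

SPANS = []   # finished spans, in completion order
_stack = []  # ids of currently open spans, innermost last
_ids = itertools.count(1)


@contextlib.contextmanager
def span(name, **attrs):
    """Time a block and record it with a link to the enclosing span."""
    span_id = next(_ids)
    parent = _stack[-1] if _stack else None
    _stack.append(span_id)
    start = time.perf_counter()
    try:
        yield span_id
    finally:
        _stack.pop()
        SPANS.append({
            "id": span_id,
            "parent": parent,
            "name": name,
            "seconds": time.perf_counter() - start,
            **attrs,
        })


with span("agent_step", goal="pick next experiment"):
    with span("llm_call", model="hypothetical-model"):
        pass  # the model request would go here
    with span("tool_call", tool="read_plate"):
        pass  # the instrument call would go here
```

Inner spans finish first, so `SPANS` lists `llm_call` and `tool_call` before `agent_step`; the `parent` field is what lets a UI rebuild the tree.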

🙌 Contributing

Contributions are welcome. See CONTRIBUTING.md for the rules on what belongs, what doesn't, and how to submit a PR.

📜 License

Released under the MIT License.

Built and maintained by Scivity Labs — building the operating system for autonomous science.

If you find this useful, please ⭐ star the repository.
