cantemizyurek/open-evals

Open Evals

A toolkit for generating synthetic test data and evaluating LLM and RAG applications.

Overview

Open Evals is a modular evaluation framework designed to help developers test and improve their AI applications. It provides tools for:

  • Synthetic Data Generation: Create realistic test datasets using knowledge graphs, personas, and scenarios
  • Evaluation Metrics: Pre-built and custom metrics for assessing LLM performance
  • RAG Utilities: Text splitters for retrieval-augmented generation
  • Evaluation Framework: Core abstractions for running comprehensive evaluations

Packages

This monorepo contains the following packages:

@open-evals/core

Core evaluation framework with abstractions for datasets, metrics, and evaluation pipelines.

pnpm add @open-evals/core
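To give a feel for what a dataset/metric/pipeline abstraction looks like, here is a minimal self-contained sketch. The interfaces and names below are hypothetical stand-ins for illustration, not the actual @open-evals/core API:

```typescript
// Illustrative sketch only: these types are NOT the real @open-evals/core API.
interface Sample {
  input: string;
  expected: string;
  actual: string;
}

interface Metric {
  name: string;
  // Returns a score in [0, 1] for a single sample.
  score(sample: Sample): number;
}

// A trivial exact-match metric for demonstration.
const exactMatch: Metric = {
  name: "exact-match",
  score: (s) => (s.actual.trim() === s.expected.trim() ? 1 : 0),
};

// Run one metric over a dataset and average the per-sample scores.
function evaluate(dataset: Sample[], metric: Metric): number {
  if (dataset.length === 0) return 0;
  const total = dataset.reduce((sum, s) => sum + metric.score(s), 0);
  return total / dataset.length;
}

const dataset: Sample[] = [
  { input: "2+2?", expected: "4", actual: "4" },
  { input: "Capital of France?", expected: "Paris", actual: "Lyon" },
];

console.log(evaluate(dataset, exactMatch)); // 0.5
```

The real package presumably layers richer dataset loading and async, LLM-backed metrics on top of the same basic shape.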

@open-evals/generator

Synthetic test data generation using knowledge graphs, personas, and query synthesis.

pnpm add @open-evals/generator
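Conceptually, persona- and scenario-driven query synthesis combines "who is asking" with "what they are asking about". The template-based sketch below is purely illustrative; the actual package drives this with knowledge graphs and an LLM rather than string templates:

```typescript
// Hypothetical sketch of persona/scenario-driven query synthesis.
// The real @open-evals/generator uses knowledge graphs and an LLM;
// this template version only illustrates the inputs involved.
interface Persona {
  name: string;
  role: string;
}

interface Scenario {
  topic: string;
  intent: string;
}

function synthesizeQuery(persona: Persona, scenario: Scenario): string {
  return `As a ${persona.role}, ${scenario.intent} about ${scenario.topic}?`;
}

const query = synthesizeQuery(
  { name: "Dana", role: "new contributor" },
  { topic: "the build pipeline", intent: "what should I know" }
);
console.log(query);
```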

@open-evals/rag

RAG utilities, including recursive character and markdown text splitters.

pnpm add @open-evals/rag
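A recursive character splitter tries separators from coarsest (paragraph breaks) to finest (single characters), recursing into any piece that is still too large. This simplified sketch (which, unlike production splitters, does not merge small pieces back together or add overlap) shows the core idea; the function name and defaults are illustrative, not the package's actual API:

```typescript
// Minimal recursive character text splitter, in the spirit of the one
// @open-evals/rag ships. Illustrative only: no chunk merging or overlap.
function splitText(
  text: string,
  chunkSize: number,
  separators: string[] = ["\n\n", "\n", " ", ""]
): string[] {
  if (text.length === 0) return [];
  if (text.length <= chunkSize) return [text];

  const [sep, ...rest] = separators;
  if (sep === "") {
    // No separators left: hard-split at chunkSize boundaries.
    const chunks: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      chunks.push(text.slice(i, i + chunkSize));
    }
    return chunks;
  }

  // Split on the coarsest separator, recursing into oversized pieces.
  return text
    .split(sep)
    .filter((piece) => piece.length > 0)
    .flatMap((piece) =>
      piece.length <= chunkSize ? [piece] : splitText(piece, chunkSize, rest)
    );
}

const doc = "Heading\n\nFirst paragraph with several words.\n\nSecond one.";
console.log(splitText(doc, 20));
```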

@open-evals/metrics

Pre-built evaluation metrics, including faithfulness, factual correctness, and more.

pnpm add @open-evals/metrics
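As a rough intuition for what a faithfulness metric measures, the sketch below scores an answer by the fraction of its tokens that appear in the retrieved context. This token-overlap heuristic is a crude illustrative stand-in; the package's actual faithfulness metric is not defined this way:

```typescript
// Crude stand-in for faithfulness: the fraction of answer tokens that also
// appear in the retrieved context. Illustrative only, not the real metric.
function tokenOverlapFaithfulness(answer: string, context: string): number {
  const tokenize = (s: string) => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const contextTokens = new Set(tokenize(context));
  const answerTokens = tokenize(answer);
  if (answerTokens.length === 0) return 1;
  const supported = answerTokens.filter((t) => contextTokens.has(t)).length;
  return supported / answerTokens.length;
}

const context = "The Eiffel Tower is in Paris and opened in 1889.";
console.log(tokenOverlapFaithfulness("The tower opened in 1889", context)); // 1
console.log(tokenOverlapFaithfulness("The tower is in London", context));
```

Real faithfulness metrics typically decompose the answer into claims and verify each one against the context with an LLM; the scalar-in-[0, 1] output shape is the same.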

Development

This project uses pnpm workspaces for managing multiple packages.

# Install dependencies
pnpm install

# Build all packages
pnpm build

# Run tests
pnpm test

Examples

The agents/ directory contains example implementations:

  • doc-assistant: A RAG-based documentation assistant demonstrating the full stack

License

MIT
