Commit edbe951 ("docs: add README.md", 1 parent efb3f25)
1 file changed: README.md, +138 lines

# EvalOps CLI

The EvalOps CLI is a command-line tool for evaluating code against Large Language Models (LLMs) on the EvalOps platform. It lets you define, validate, and run evaluations directly from your terminal.

## Features

- **Initialize Projects**: Quickly set up a new EvalOps project with `evalops init`.
- **Validate Configurations**: Ensure your `evalops.yaml` file is correctly formatted and your test cases are discoverable with `evalops validate`.
- **Upload Test Suites**: Upload your evaluation configurations to the EvalOps platform with `evalops upload`.
- **Local Evaluations (Coming Soon)**: Run evaluations locally against different providers with `evalops run`.
- **Automatic Test Discovery**: Discover test cases in your codebase defined with `@evalops_test` decorators or `evalops_test()` function calls.

## Installation

```bash
npm install -g evalops-cli
```

## Getting Started

1. **Initialize a new project:**

   ```bash
   evalops init
   ```

   This will create an `evalops.yaml` file in your current directory. You can use the interactive prompt to configure your project or start from a template:

   ```bash
   evalops init --template basic
   ```

2. **Define your evaluation in `evalops.yaml`:**

   The `evalops.yaml` file is the heart of your evaluation. Here you can define:

   - A description and version for your evaluation.
   - The prompts to be used.
   - The LLM providers to test against.
   - Default and specific test cases with assertions.

3. **Add test cases to your code:**

   The CLI can automatically discover test cases in your code. You can define a test case with the `@evalops_test` decorator or the `evalops_test()` function.

   **Using the decorator:**

   ```typescript
   import { evalops_test } from 'evalops-cli';

   @evalops_test({
     description: 'Test case for my function',
     tags: ['critical', 'refactor'],
   })
   function myFunction() {
     // Your code to be evaluated
   }
   ```

   **Using a function call:**

   ```typescript
   import { evalops_test } from 'evalops-cli';

   evalops_test({
     description: 'Another test case',
   }, () => {
     // Your code to be evaluated
   });
   ```

4. **Validate your configuration:**

   Before uploading, it is good practice to validate your configuration and confirm that your test cases are discovered:

   ```bash
   evalops validate
   ```

5. **Upload your test suite:**

   Once you're ready, upload your test suite to the EvalOps platform:

   ```bash
   evalops upload
   ```

   You will need to provide your EvalOps API key, either by setting the `EVALOPS_API_KEY` environment variable or by passing the `--api-key` flag.
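
Pulling the steps above together, a minimal `evalops.yaml` might look like the sketch below. The exact schema is not shown in this README, so the specific values (the provider id and the assertion type) are illustrative assumptions, not documented syntax:

```yaml
description: "Summarization quality checks"
version: "1.0"

prompts:
  - "Summarize the following text: {{input}}"

providers:
  - openai:gpt-4        # illustrative provider id

defaultTest:
  assert:
    - type: contains    # illustrative assertion type
      value: "summary"

tests:
  - vars:
      input: "EvalOps is a platform for evaluating LLM outputs."
```

Running `evalops validate` against a file like this should report whether the sections parse and which test cases were discovered.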

## CLI Commands

### `init`

Initialize a new EvalOps project.

**Options:**

- `-f, --force`: Overwrite an existing `evalops.yaml` file.
- `--template <template>`: Use a specific template (`basic`, `advanced`).

### `validate`

Validate the `evalops.yaml` file and discovered test cases.

**Options:**

- `-v, --verbose`: Show detailed validation output.
- `-f, --file <file>`: Path to the `evalops.yaml` file (default: `./evalops.yaml`).

### `upload`

Upload a test suite to the EvalOps platform.

**Options:**

- `-f, --file <file>`: Path to the `evalops.yaml` file (default: `./evalops.yaml`).
- `--api-key <key>`: EvalOps API key.
- `--api-url <url>`: EvalOps API URL (default: `https://api.evalops.dev`).
- `--name <name>`: Name for the test suite.
- `--dry-run`: Preview what would be uploaded without actually uploading.
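
The upload flags above can be combined. For example, to preview an upload of a named suite without sending anything (the suite name here is illustrative):

```bash
evalops upload --file ./evalops.yaml --name "my-first-suite" --dry-run
```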

### `run`

Run an evaluation locally (not yet implemented).

**Options:**

- `-f, --file <file>`: Path to the `evalops.yaml` file (default: `./evalops.yaml`).
- `--provider <provider>`: Provider to use for the run.
- `--output <output>`: Output file path.

## Configuration

The `evalops.yaml` file supports the following main sections:

- `description`: A brief description of the evaluation.
- `version`: The version of the evaluation configuration.
- `prompts`: The prompts to be sent to the LLM; can be a single prompt or a list of messages with roles.
- `providers`: A list of LLM providers to use for the evaluation.
- `defaultTest`: Default assertions and variables for all test cases.
- `tests`: A list of specific test cases.
- `config`: Execution configuration such as iterations, parallelism, and timeout.
- `outputPath`: The path to store the results of a local run.
- `outputFormat`: The format of the output file (`json`, `yaml`, `csv`).
- `sharing`: Configuration for sharing the evaluation results.
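
As an illustration only (the CLI's actual internal types are not shown in this README), the sections above can be modeled as a TypeScript interface with a small structural check. All field shapes below are assumptions inferred from the list above:

```typescript
// Hypothetical model of the evalops.yaml sections listed above.
// Field names mirror the documented keys; the value shapes are assumptions.
interface EvalOpsConfig {
  description?: string;
  version?: string;
  prompts: string | Array<{ role: string; content: string }>;
  providers: string[];
  defaultTest?: { assert?: unknown[]; vars?: Record<string, unknown> };
  tests?: unknown[];
  config?: { iterations?: number; parallelism?: number; timeout?: number };
  outputPath?: string;
  outputFormat?: "json" | "yaml" | "csv";
  sharing?: unknown;
}

// Minimal sanity check: the two sections an evaluation cannot run without
// (prompts and providers) must be present and non-empty.
function checkConfig(cfg: EvalOpsConfig): string[] {
  const errors: string[] = [];
  if (!cfg.prompts || (Array.isArray(cfg.prompts) && cfg.prompts.length === 0)) {
    errors.push("prompts must contain at least one prompt");
  }
  if (!cfg.providers || cfg.providers.length === 0) {
    errors.push("providers must list at least one provider");
  }
  return errors;
}

// Example: a config with one prompt and one provider passes the check.
const errors = checkConfig({
  description: "Smoke test",
  version: "1.0",
  prompts: "Summarize: {{input}}",
  providers: ["openai:gpt-4"], // illustrative provider id
});
console.log(errors.length === 0 ? "config looks valid" : errors.join("; "));
```

A check like this mirrors what `evalops validate` presumably does in more depth; it is useful as a mental model of which sections are structural requirements versus optional tuning.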
