Commit 714d6bc

Update README and add contribution guidelines, code of conduct and citation (#15)
2 parents 2a566de + ccbfe31 commit 714d6bc

4 files changed: +116 -8 lines changed

CITATION.cff

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+cff-version: 1.2.0
+title: "EVA: A New Framework for Evaluating Voice Agents"
+message: "If you use this software, please cite it as below."
+type: software
+authors:
+  - family-names: Bogavelli
+    given-names: Tara
+  - family-names: Gauthier Melançon
+    given-names: Gabrielle
+  - family-names: Stankiewicz
+    given-names: Katrina
+  - family-names: Bamgbose
+    given-names: Oluwanifemi
+  - family-names: Nguyen
+    given-names: Hoang
+  - family-names: Mehndiratta
+    given-names: Raghav
+  - family-names: Subramani
+    given-names: Hari
+version: "0.1.1"
+date-released: "2026-03-24"
+repository-code: "https://github.com/ServiceNow/eva"
+url: "https://servicenow.github.io/eva/"
+license: MIT
+keywords:
+  - voice agents
+  - evaluation
+  - speech
+  - conversational AI
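
For readers who want to see how the fields in this `CITATION.cff` map onto a citation string, here is a minimal Python sketch. The `CFF` dictionary is copied by hand from the file above rather than parsed from it, and `format_citation` is a hypothetical helper with an assumed output layout. It is not part of EVA or of any CFF tooling.

```python
# Fields copied by hand from the CITATION.cff added in this commit.
CFF = {
    "title": "EVA: A New Framework for Evaluating Voice Agents",
    "authors": [
        {"family-names": "Bogavelli", "given-names": "Tara"},
        {"family-names": "Gauthier Melançon", "given-names": "Gabrielle"},
        {"family-names": "Stankiewicz", "given-names": "Katrina"},
        {"family-names": "Bamgbose", "given-names": "Oluwanifemi"},
        {"family-names": "Nguyen", "given-names": "Hoang"},
        {"family-names": "Mehndiratta", "given-names": "Raghav"},
        {"family-names": "Subramani", "given-names": "Hari"},
    ],
    "version": "0.1.1",
    "date-released": "2026-03-24",
    "repository-code": "https://github.com/ServiceNow/eva",
}


def format_citation(cff: dict) -> str:
    """Hypothetical helper: build a one-line software citation from CFF fields."""
    # "Family, G." for each author, joined with commas.
    authors = ", ".join(
        f"{a['family-names']}, {a['given-names'][0]}." for a in cff["authors"]
    )
    # date-released is an ISO date string, so the year is the first four characters.
    year = cff["date-released"][:4]
    return (
        f"{authors} ({year}). {cff['title']} "
        f"(Version {cff['version']}) [Software]. {cff['repository-code']}"
    )


print(format_citation(CFF))
```

The layout (authors, year, title, version, URL) is only one plausible style; citation managers that read CFF files may render the same fields differently.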

CODE_OF_CONDUCT.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+# Code of Conduct
+
+This code of conduct provides guidelines for participation in this open-source project.
+
+## Discussion Forum Guidelines
+
+Communities thrive when members support each other and provide useful feedback.
+
+- Be polite and courteous. Respect and treat others as you would expect to be treated yourself.
+- Respect your audience. Posts should not upset, annoy, threaten, harass, abuse or embarrass other members.
+- User contributions must not include material that is defamatory, obscene, indecent, abusive, offensive, harassing, violent, hateful, inflammatory or otherwise objectionable.
+- Lively and collegial discussions are always encouraged in a healthy community. It is okay to argue facts but not okay to argue personalities or personal beliefs.
+- Do not use text formats such as all caps or bold that may be read as annoying, rude or send a strong message.
+- Do not publish anyone's private personal information without their explicit consent.
+- Avoid using abbreviations or terminology that others may not understand. An abbreviation may mean something to you but in another context or country, it may have another meaning.
+- Be accountable for your actions by correcting your mistakes and indicating where you have changed a previous post of yours.
+- Mark content as correct and helpful, and provide feedback. If you read a discussion post that you find helpful, we encourage you to leave a positive vote and comment in the replies. If you find a post that is unhelpful, please provide more information in the issue comments.
+
+## Issue Board Guidelines
+
+Many open-source projects provide an Issues board, with similar functionality to a Discussions forum. The same rules from the discussion forum guidelines apply to the Issues board.
+
+We suggest the following technical support pathways for open-source projects:
+
+1. Clearly identify and document the issue or question you have.
+2. View the documentation.
+3. Search the Discussions.
+4. Search the project knowledge base or Wiki for known errors, useful solutions, and troubleshooting tips.
+5. Check the project guidelines in the [`CONTRIBUTING.md`](CONTRIBUTING.md) file if you would like details on how you can submit a change. Community contributions are valued and appreciated!
+6. Log an Issue if it hasn't already been logged. If the issue has already been logged by another user, vote it up, and add a comment with additional or missing information. Do your best to choose the correct category when logging a new issue. This will make it easier to differentiate bugs from new feature requests or ideas. If after logging an issue you find the solution, please close your issue and provide a comment with the solution. This will help the project owners and other users.
+7. Contact the project team contributors of the project to see if they can help as a last resort only.
+
+## Repositories
+
+- Read and follow the license instructions.
+- Remember to include citations if you use someone else's work in your own project. Use the [`CITATION.cff`](CITATION.cff) to find the correct project citation reference.
+- 'Star' project repos to save for future reference.
+- 'Watch' project repos to get notifications of changes – this can get noisy for some projects, so only watch the ones you really need to track closely.
+
+## Disclaimer
+
+We may, but are under no obligation to, monitor or censor comments made by users or content provided by contributors and we are not responsible for the accuracy, completeness, appropriateness or legality of anything posted, depicted or otherwise provided by third-party users and we disclaim any and all liability relating thereto.

CONTRIBUTING.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Contributing to EVA
+
+Thank you for your interest in contributing to EVA!
+
+This document should be able to guide contributors in their different types of contributions.
+
+Just want to ask a question? Open a topic on our [Discussion page](https://github.com/ServiceNow/eva/discussions).
+
+## Get Your Environment Setup
+
+Go to our [Quick Start](README.md) section in the README to get set up.
+
+## How to Submit a Bug Report
+
+[Open an issue on GitHub](https://github.com/ServiceNow/eva/issues/new/choose) and select "Bug report". If you are not sure whether it is a bug or not, submit an issue and we will be able to help you.
+
+Issues with reproducible examples are easier to work with. Do not hesitate to provide your configuration with generated data if need be.
+
+If you are familiar with the codebase, providing a unit test is helpful, but not mandatory.
+
+## How to Submit Changes
+
+First, open an issue describing your desired changes, if it does not exist already.
+1. [Fork the repo to your own account](https://github.com/ServiceNow/eva/fork).
+2. Clone your fork of the repo locally.
+3. Make your changes (the fun part).
+4. Commit and push your changes to your fork.
+5. [Open a pull request](https://github.com/ServiceNow/eva/compare) with your branch.
+6. Once a team member approves your changes, we will merge the pull request promptly.
+
+### Guidelines for a Good Pull Request
+
+When coding, pay special attention to the following:
+
+- Your code should be well commented for non-trivial sections, so it can be easily understood and maintained by others, but not over-commented. Good variable names and functions are your best friend.
+- Do not expose any personal or sensitive data.
+- Add unit tests when a notable functionality has been added or changed.

README.md

Lines changed: 8 additions & 8 deletions
@@ -175,15 +175,15 @@ pytest tests/integration/test_metrics.py -v
 
 Existing benchmarks evaluate voice agent components in isolation — speech understanding, TTS quality, or conversational dynamics — but none assess the full pipeline end to end. In real deployed systems, errors compound across modules and failure modes interact in ways that component-level evaluation cannot capture. EVA addresses this by treating voice agent quality as an integrated whole, evaluating accuracy and experience jointly across complete multi-turn spoken conversations.
 
-| **Framework** | **Interaction Mode** | **Multi-turn** | **Tool Calling** | **Goal Completion** | **Experience Metrics** | **Pass@k, Pass^k** | **Supported Systems** |
+| **Framework** | **Interaction Mode** | **Multi-turn** | **Tool Calling** | **Goal Completion** | **Experience Metrics** | **Pass@k<br>Pass^k** | **Supported Systems** |
 |---|---|---|---|---|---|--------------------|---|
-| **EVA** | Live bot-to-bot |||(Task Completion, Speech Fidelity, Faithfulness) |(Conciseness, Turn-taking, Latency, Progression) || Audio-native, Cascade |
+| **EVA** | Live bot-to-bot |||<br>Task Completion, Speech Fidelity, Faithfulness |<br>Conciseness, Turn-taking, Latency, Progression || Audio-native, Cascade |
 | **VoiceAgent&shy;Bench** | Static, TTS-synthesized ||| ⚠️ ||| Audio-native, Cascade |
-| **CAVA** | Partial simulation ||| ⚠️ | ⚠️ (Latency, Tone-awareness) || Audio-native, Cascade |
-| **FDB-v2** | Live, automated examiner ||||(Turn-taking fluency, Correction handling, Safety) || Audio-native |
-| **FDB-v1** | Static, pre-recorded ||||(Turn-taking, Backchanneling, Interruption) || Audio-native |
-| **FD-Bench** | Live, simulated ||||(Interruption, Delay, Robustness) || Audio-native |
-| **Talking Turns** | Static, curated ||||(Turn change, Backchannel, Interruption) || Audio-native, Cascade |
+| **CAVA** | Partial simulation ||| ⚠️ | ⚠️ <br>Latency, Tone-awareness || Audio-native, Cascade |
+| **FDB-v2** | Live, automated examiner ||||<br>Turn-taking fluency, Correction handling, Safety || Audio-native |
+| **FDB-v1** | Static, pre-recorded ||||<br>Turn-taking, Backchanneling, Interruption || Audio-native |
+| **FD-Bench** | Live, simulated ||||<br>Interruption, Delay, Robustness || Audio-native |
+| **Talking Turns** | Static, curated ||||<br>Turn change, Backchannel, Interruption || Audio-native, Cascade |
 
 ## 🏗️ Architecture
 
@@ -253,6 +253,7 @@ eva/
 ├── compose.yaml # Docker Compose configuration
 ├── src/eva/
 │ ├── cli.py # CLI interface
+│ ├── run_benchmark.py # Benchmark runner
 │ ├── models/ # Pydantic data models
 │ ├── orchestrator/ # Framework execution
 │ │ ├── runner.py # Main orchestrator
@@ -277,7 +278,6 @@ eva/
 │ │ └── validation/ # Quality control metrics
 │ └── utils/ # Utilities (LLM client, log processing)
 ├── scripts/ # Utility scripts
-│ ├── run_benchmark.py # Benchmark runner
 │ ├── run_text_only.py # Text-only evaluation runner
 │ ├── docker_entrypoint.py # Docker entry point
 │ ├── check_version_bump.py # Version checking
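
The README comparison table in this diff has a "Pass@k, Pass^k" column, but nothing in the diff shows how EVA computes those metrics. As a rough illustration only, the Python sketch below uses the combinatorial estimators commonly associated with these names: pass@k as the chance that at least one of k sampled trials succeeds, and pass^k as the chance that all k succeed. The function names and the parameters `n` (trials run), `c` (successful trials), and `k` are assumptions for this example, not EVA's API.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k trials, drawn without replacement
    from n trials of which c succeeded, is a success."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the draws that contain only failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Chance that all k trials, drawn without replacement from n trials
    of which c succeeded, are successes."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(c, k) counts the draws that contain only successes.
    return comb(c, k) / comb(n, k)


# Example: 4 trials of one scenario, 2 of them successful, k = 2.
print(pass_at_k(4, 2, 2), pass_hat_k(4, 2, 2))
```

Read this way, pass@k rewards an agent that can succeed at least once, while pass^k rewards one that succeeds consistently, which may be why the table lists them together.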
