
Commit 56873d5

Address code review comments #2
Signed-off-by: Abhishree <abhishreetm@gmail.com>
1 parent 950ffe8 commit 56873d5

File tree: 2 files changed (+96, -69 lines)


CONTRIBUTING.md
Lines changed: 61 additions & 0 deletions

@@ -53,3 +53,64 @@
 maintained indefinitely and may be redistributed consistent with
 this project or the open source license(s) involved.
 ```
+
+## Development Setup
+
+1. Fork the repository
+2. Create a feature branch
+3. Install development dependencies:
+```bash
+pip install -e ".[dev,test]"
+```
+4. Run pre-commit hooks:
+```bash
+pre-commit install
+```
+5. Make your changes and add tests
+6. Submit a pull request
+
+## Testing
+
+### Running Tests
+
+```bash
+# Run all tests
+pytest tests
+
+# Run unit tests only
+pytest tests/unit_tests/
+
+# Run functional tests only
+pytest tests/functional_tests/
+
+# Run with coverage
+pytest --cov=nemo_eval tests
+```
+
+### Test Scripts
+
+```bash
+# Unit tests on CPU
+bash tests/unit_tests/L0_Unit_Tests_CPU.sh
+
+# Unit tests on GPU
+bash tests/unit_tests/L0_Unit_Tests_GPU.sh
+
+# Functional tests on GPU
+bash tests/functional_tests/L2_Functional_Tests_GPU.sh
+```
+
+### Testing Guidelines
+
+- Write unit tests for new functionality
+- Ensure all tests pass before submitting
+- Add integration tests for complex features
+- Follow existing test patterns
+
+## Code Style
+
+We use:
+- **Black** for code formatting
+- **Ruff** for linting
+- **MyPy** for type checking
+- **Pre-commit** for automated checks
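The testing guidelines added above can be illustrated with a minimal pytest-style unit test. This is a hypothetical sketch: `normalize_scores` and the suggested file path are stand-ins we invented for illustration, not part of the nemo-eval codebase.

```python
# Hypothetical tests/unit_tests/test_example.py -- a minimal unit test in the
# pytest style that `pytest tests/unit_tests/` would discover and run.
import pytest


def normalize_scores(scores):
    """Hypothetical helper: scale raw benchmark scores into [0, 1]."""
    if not scores:
        raise ValueError("scores must be non-empty")
    top = max(scores)
    return [s / top for s in scores] if top else [0.0 for _ in scores]


def test_normalize_scores_scales_to_unit_interval():
    # The largest score becomes 1.0; others are scaled proportionally.
    assert normalize_scores([2.0, 4.0]) == [0.5, 1.0]


def test_normalize_scores_rejects_empty_input():
    # Error paths deserve unit tests too.
    with pytest.raises(ValueError):
        normalize_scores([])
```

Keeping each test focused on one behavior, including the error path, matches the "follow existing test patterns" guideline.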

README.md
Lines changed: 35 additions & 69 deletions
@@ -4,17 +4,16 @@
 [![Python](https://img.shields.io/badge/python-3.10+-blue.svg)](pyproject.toml)
 [![NVIDIA](https://img.shields.io/badge/NVIDIA-NeMo-red.svg)](https://github.com/NVIDIA/NeMo)
 
-**NeMo Eval** is a comprehensive evaluation framework for Large Language Models (LLMs) built on top of the NeMo Framework. It provides seamless deployment and evaluation capabilities for NeMo checkpoints using NVIDIA's evaluation infrastructure.
+**NeMo Eval** is a comprehensive evaluation framework for Large Language Models (LLMs) built on top of the NeMo Framework. It provides seamless deployment and evaluation capabilities for NeMo checkpoints using NVIDIA Eval Factory, which provides state-of-the-art evaluation harnesses as modular evaluation packages installed in the NeMo Framework container as building blocks for evaluation.
 
 ## 🚀 Features
 
 - **Multi-Backend Deployment**: Support for both PyTriton and Ray Serve deployment backends
-- **Comprehensive Evaluation**: Integration with NVIDIA Evals Factory for standardized benchmark evaluation
-- **Adapter System**: Flexible adapter architecture for customizing request/response processing
+- **Comprehensive Evaluation**: Integration with NVIDIA Eval Factory for standardized benchmark evaluation
+- **Adapter System**: Flexible adapter architecture using a chain of interceptors for customizing request/response processing
 - **Production Ready**: Optimized for high-performance inference with CUDA graphs and flash decoding
 - **Multi-GPU Support**: Distributed inference across multiple GPUs and nodes
 - **OpenAI-Compatible API**: RESTful endpoints compatible with OpenAI API standards
-- **Extensible Architecture**: Plugin-based interceptor system for custom functionality
 
 ## 📋 Table of Contents

@@ -33,7 +32,7 @@
 ### Prerequisites
 
 - Python 3.10 or higher
-- CUDA-compatible GPU(s)
+- CUDA-compatible GPU(s) (tested on RTX A6000, A100, H100)
 - NeMo Framework container (recommended)
 
 ### Using pip
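As an aside to the prerequisites above, the interpreter requirement can be verified programmatically. A minimal sketch; the helper name is ours for illustration and is not part of nemo-eval:

```python
import sys


def meets_python_requirement(major: int, minor: int) -> bool:
    """Return True when the given version satisfies the 'Python 3.10 or higher' prerequisite."""
    return (major, minor) >= (3, 10)


# Check the running interpreter before attempting an install:
ok = meets_python_requirement(*sys.version_info[:2])
print(f"Python {sys.version_info.major}.{sys.version_info.minor} sufficient: {ok}")
```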
@@ -42,6 +41,16 @@
 pip install nemo-eval
 ```
 
+### Using uv
+
+```bash
+# Install uv if you haven't already
+pip install uv
+
+# Install nemo-eval
+uv pip install nemo-eval
+```
+
 ### From Source
 
 ```bash
@@ -50,6 +59,14 @@ cd Eval
 pip install -e .
 ```
 
+### From Source with uv
+
+```bash
+git clone https://github.com/NVIDIA-NeMo/Eval.git
+cd Eval
+uv sync
+```
+
 ### Using Docker
 
 The recommended approach is to use the NeMo Framework container:
@@ -97,22 +114,28 @@ results = evaluate(target_cfg=target, eval_cfg=config)
 print(results)
 ```
 
+## 📊 Support Matrix
+
+| Checkpoint Type | Inference Backend | Deployment Server | Evaluation Harnesses Supported |
+|-----------------|-------------------|-------------------|--------------------------------|
+| NeMo 2.0 | Megatron Core inference engine | PyTriton (single- and multi-node model parallelism), Ray (single-node model parallelism with multi-instance evals) | lm-evaluation-harness, simple-evals, BigCode, BFCL, safety-harness, garak |
+
 ## 🏗️ Architecture
 
 ### Core Components
 
 #### 1. Deployment Layer
-- **PyTriton Backend**: High-performance inference using NVIDIA Triton Inference Server and OpenAI API compatibility with FastAPI Interface
-- **Ray Backend**: Multi instance or data parallel evaluation using Ray Serve with OpenAI API compatibility
+- **PyTriton Backend**: High-performance inference using NVIDIA Triton Inference Server, with OpenAI API compatibility via a FastAPI interface and model parallelism across single and multiple nodes. Does not support multi-instance evaluation.
+- **Ray Backend**: Single-node, model-parallel, multi-instance evaluation using Ray Serve with OpenAI API compatibility. Multi-node support is coming soon.
 
 
 #### 2. Evaluation Layer
-- **NVIDIA Evals Factory**: Standardized benchmark evaluation with NVIDIA Evals Factory that provides state-of-the-art evaluation harnesses like lm-evaluation-harness, simple-evals, BigCode, BFCL, safety-harness, garak as modular evaluation packages compatible for installation within in the NeMo Framework container. lm-evaluation-harness is installed inside the NeMo Framework container while the others can be installed on-demand. More details in the [docs](https://github.com/NVIDIA-NeMo/Eval/tree/main/docs).
+- **NVIDIA Eval Factory**: Standardized benchmark evaluation with eval packages from NVIDIA Eval Factory that are installed in the NeMo Framework container. lm-evaluation-harness is installed by default, while the remaining harnesses in the [support matrix](#-support-matrix) can be installed on demand. More details in the [docs](https://github.com/NVIDIA-NeMo/Eval/tree/main/docs).
 - **Adapter System**: Flexible request/response processing pipeline with **Interceptors** that provide modular processing
 
 ```
 ┌───────────────────────┐
-│ NVIDIA Evals Factory  │
+│  NVIDIA Eval Factory  │
 └───▲──────┬────────────┘
     │      │
     │      │
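Because both backends expose OpenAI-compatible endpoints, requests and responses follow the standard chat-completions JSON shapes. The sketch below only builds and parses those shapes; the model name and helper functions are illustrative assumptions, not part of nemo-eval:

```python
import json


def build_chat_request(model: str, prompt: str, system: str = "") -> dict:
    """Build an OpenAI-style /v1/chat/completions request body."""
    messages = []
    if system:
        # An optional system message goes first, before the user turn.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {"model": model, "messages": messages, "temperature": 0.0}


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion response."""
    return response_body["choices"][0]["message"]["content"]


# Serialize a request for a hypothetical deployed model name:
payload = json.dumps(build_chat_request("megatron_model", "What is 2 + 2?"))
```

Any OpenAI-compatible client library could produce the same payload; the point is that nothing backend-specific leaks into the evaluation harness.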
@@ -192,6 +215,8 @@ results = evaluate(target_cfg=target, eval_cfg=config)
 
 ### Using Adapters
 
+The example below shows how to configure an Adapter that lets you provide a custom system prompt. Requests and responses are processed through interceptors, which are selected automatically based on the `AdapterConfig` parameters you provide.
+
 ```python
 from nemo_eval.utils.api import AdapterConfig
 
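The interceptor chain behind the adapter can be sketched generically. This is an illustrative model of the pattern only, not nemo_eval's actual classes; in the real library, `AdapterConfig` selects and wires the interceptors for you:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Request:
    """A chat request flowing through the adapter (only what the demo needs)."""
    messages: list = field(default_factory=list)


# An interceptor takes a Request and returns a (possibly modified) Request.
Interceptor = Callable[[Request], Request]


def system_prompt_interceptor(prompt: str) -> Interceptor:
    """Make an interceptor that prepends a custom system message."""
    def intercept(req: Request) -> Request:
        req.messages.insert(0, {"role": "system", "content": prompt})
        return req
    return intercept


def run_chain(req: Request, chain: list) -> Request:
    """Pass the request through each interceptor in order."""
    for interceptor in chain:
        req = interceptor(req)
    return req


chain = [system_prompt_interceptor("Think step by step.")]
req = run_chain(Request(messages=[{"role": "user", "content": "2+2?"}]), chain)
```

Each interceptor does one modular transformation, so custom request or response processing is just another function appended to the chain.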
@@ -244,36 +269,7 @@ deploy(
 )
 ```
 
-## 🧪 Testing
-
-### Running Tests
-
-```bash
-# Run all tests
-pytest tests
 
-# Run unit tests only
-pytest tests/unit_tests/
-
-# Run functional tests only
-pytest tests/functional_tests/
-
-# Run with coverage
-pytest --cov=nemo_eval tests
-```
-
-### Test Scripts
-
-```bash
-# Unit tests on CPU
-bash tests/unit_tests/L0_Unit_Tests_CPU.sh
-
-# Unit tests on GPU
-bash tests/unit_tests/L0_Unit_Tests_GPU.sh
-
-# Functional tests on GPU
-bash tests/functional_tests/L2_Functional_Tests_GPU.sh
-```
 
 ## 📁 Project Structure
 
@@ -302,37 +298,7 @@ Eval/
 
 ## 🤝 Contributing
 
-We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
-
-### Development Setup
-
-1. Fork the repository
-2. Create a feature branch
-3. Install development dependencies:
-```bash
-pip install -e ".[dev,test]"
-```
-4. Run pre-commit hooks:
-```bash
-pre-commit install
-```
-5. Make your changes and add tests
-6. Submit a pull request
-
-### Code Style
-
-We use:
-- **Black** for code formatting
-- **Ruff** for linting
-- **MyPy** for type checking
-- **Pre-commit** for automated checks
-
-### Testing Guidelines
-
-- Write unit tests for new functionality
-- Ensure all tests pass before submitting
-- Add integration tests for complex features
-- Follow existing test patterns
+We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details on development setup, testing, and code style guidelines.
 
 ## 📄 License

0 commit comments
