Production-ready wrapper and reimplementation of Temple University's NEDC EEG Evaluation v6.0.0
⚠️ Independent Project: This is an open-source contribution, not officially maintained by Temple University, NEDC, or affiliates. We wrap the original NEDC v6.0.0 software unchanged and provide a modern reimplementation alongside it. All algorithmic credit goes to the original authors (Shah et al., 2021).
NEDC-BENCH transforms Temple University's NEDC EEG evaluation suite into a production-ready platform. We maintain a dual-pipeline architecture that guarantees scoring parity while offering modern infrastructure for scalable deployment.
The Problem: NEDC's evaluation software is excellent for research but hard to operate: manual dependency management, text-file I/O conventions, and limited operational tooling make deployment and day-to-day use difficult.
Our Solution: Best of both worlds — preserve exact scientific behavior while making it effortless to run in production:
- 100% algorithmic parity with NEDC v6.0.0 (continuously validated)
- REST API & WebSockets for programmatic access
- Docker/Kubernetes ready with Redis caching and Prometheus metrics
- 92% test coverage with 187 tests
┌───────────────────────────────────────────────────────────┐
│ NEDC-BENCH Platform │
├───────────────────────────────────────────────────────────┤
│ ┌───────────────────────┐ ┌───────────────────────┐ │
│ │ Pipeline Alpha │ │ Pipeline Beta │ │
│ │ (Legacy Wrapper) │ │ (Modern Rewrite) │ │
│ ├───────────────────────┤ ├───────────────────────┤ │
│ │ • Original NEDC code │ │ • Clean architecture │ │
│ │ • Research-grade │ │ • Type-safe Python │ │
│ │ • Text-based I/O │ │ • Async/parallel │ │
│ │ • 100% fidelity │ │ • Cloud-native │ │
│ └───────────────────────┘ └───────────────────────┘ │
│ ↓ ↓ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Unified API & Result Validator │ │
│ └──────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────┘
- TAES — Time-Aligned Event Scoring
- DP — Dynamic Programming Alignment
- Overlap — Any-overlap detection
- Epoch — 250ms epoch-based sampling
- IRA — Inter-Rater Agreement (Cohen's κ)
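As a rough illustration of the epoch-based approach (a sketch under simplifying assumptions, not the NEDC implementation), the snippet below discretizes reference and hypothesis events into 250 ms epochs and counts per-epoch hits and misses:

```python
import math

# Sketch of 250 ms epoch-based scoring (illustrative only, not the NEDC code).
def events_to_epochs(events, duration_s, epoch_s=0.25):
    """Convert (start_s, stop_s) event intervals into per-epoch 0/1 labels."""
    n = int(round(duration_s / epoch_s))
    labels = [0] * n
    for start, stop in events:
        first = int(start / epoch_s)              # first epoch the event touches
        last = min(n, math.ceil(stop / epoch_s))  # one past the last epoch
        for i in range(first, last):
            labels[i] = 1
    return labels

def epoch_scores(ref_events, hyp_events, duration_s):
    ref = events_to_epochs(ref_events, duration_s)
    hyp = events_to_epochs(hyp_events, duration_s)
    tp = sum(1 for r, h in zip(ref, hyp) if r and h)
    fp = sum(1 for r, h in zip(ref, hyp) if not r and h)
    fn = sum(1 for r, h in zip(ref, hyp) if r and not h)
    sensitivity = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    return tp, fp, fn, sensitivity
```

The real TAES and Overlap scorers work on the events themselves rather than on discretized epochs; see the algorithm docs below for the exact definitions.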
- Alpha (Legacy Wrapper): When you need bit-exact reproducibility with NEDC v6.0.0
- Beta (Modern Rewrite): For production deployments requiring speed and modern APIs
- Dual: To validate parity between pipelines (used in CI/CD)
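The dual mode's parity check can be sketched as follows; the function and metric names here are illustrative, not the actual nedc_bench API:

```python
import math

# Illustrative sketch of dual-pipeline parity validation: run Alpha and Beta
# on the same inputs, then require metric-by-metric agreement within a
# floating-point tolerance. Names are hypothetical, not the nedc_bench API.
def check_parity(alpha: dict, beta: dict, rel_tol: float = 1e-9) -> list:
    """Return the metric names where Alpha and Beta results disagree."""
    mismatches = []
    for metric, alpha_value in alpha.items():
        beta_value = beta.get(metric)
        if beta_value is None or not math.isclose(alpha_value, beta_value, rel_tol=rel_tol):
            mismatches.append(metric)
    return mismatches

alpha = {"tp": 133.84, "fp": 552.77, "sensitivity": 12.45}
beta = {"tp": 133.84, "fp": 552.77, "sensitivity": 12.45}
assert check_parity(alpha, beta) == []  # full parity
```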
git clone https://github.com/Clarity-Digital-Twin/nedc-bench.git
cd nedc-bench
# Build and start all services (API, Redis, Prometheus, Grafana)
docker compose up -d --build
# Verify health
curl http://localhost:8000/api/v1/health
# Expected: {"status":"healthy"}
# View API documentation
open http://localhost:8000/docs # Swagger UI
open http://localhost:3000      # Grafana dashboards (admin/admin)

Quick test with sample data:
curl -X POST "http://localhost:8000/api/v1/evaluate" \
-F "reference=@data/csv_bi_parity/csv_bi_export_clean/ref/aaaaaajy_s001_t000.csv_bi" \
-F "hypothesis=@data/csv_bi_parity/csv_bi_export_clean/hyp/aaaaaajy_s001_t000.csv_bi" \
-F "algorithms=all" \
-F "pipeline=beta"

💡 Windows/WSL users: Use `docker compose` (v2), not `docker-compose` (v1). See the deployment guide for troubleshooting.
# Using uv (10-100x faster than pip)
curl -LsSf https://astral.sh/uv/install.sh | sh
make dev # Installs deps + pre-commit hooks
# Or traditional pip
python -m venv .venv && source .venv/bin/activate
pip install -e ".[api]"
# Verify installation
make test # Run full test suite
make lint  # Check code quality

import requests

# Upload and evaluate EEG annotations
with open("ref.csv_bi", "rb") as ref, open("hyp.csv_bi", "rb") as hyp:
    response = requests.post(
        "http://localhost:8000/api/v1/evaluate",
        files={"reference": ref, "hypothesis": hyp},
        data={"algorithms": ["taes", "epoch", "ira"], "pipeline": "dual"},
    )
job_id = response.json()["job_id"]

# Get results with parity validation
result = requests.get(f"http://localhost:8000/api/v1/evaluate/{job_id}").json()
print(f"TAES Sensitivity: {result['beta']['taes']['sensitivity']:.2f}%")
print(f"Parity Check: {'✅ PASS' if result['parity']['match'] else '❌ FAIL'}")

import asyncio
import websockets

async def monitor_job(job_id):
    async with websockets.connect(f"ws://localhost:8000/ws/{job_id}") as ws:
        async for message in ws:
            print(f"Progress: {message}")

# Run with: asyncio.run(monitor_job(job_id))

# Original NEDC wrapper (preserves exact v6.0.0 behavior)
./run_nedc.sh nedc_eeg_eval/v6.0.0/data/lists/ref.list \
nedc_eeg_eval/v6.0.0/data/lists/hyp.list
# Python scripts for batch processing
python scripts/run_alpha_complete.py # Full Alpha pipeline
python scripts/run_beta_batch.py # All Beta algorithms
python scripts/compare_parity.py     # Compare Alpha vs Beta

# Verify 100% algorithmic match with NEDC v6.0.0
python scripts/compare_parity.py --verbose
# Expected output (exact values):
# ✅ TAES: TP=133.84, FP=552.77, Sensitivity=12.45%, FA/24h=30.46
# ✅ Epoch: TP=33704, FP=18816, Sensitivity=11.86%, FA/24h=259.23
# ✅ Overlap: TP=253, FP=536, Sensitivity=23.53%, FA/24h=29.54
# ✅ DP: TP=328, FP=966, Sensitivity=30.51%, FA/24h=53.23
# ✅ IRA: Kappa=0.1887 (multi-class Cohen's κ)

| Component | Metric | Value | Notes |
|---|---|---|---|
| API Latency | P50 | ~250ms | With Redis cache |
| API Latency | P99 | ~2.5s | Cold start |
| Throughput | RPS | ~100 | 4 workers, single node |
| Cache Hit Rate | % | >90% | After warm-up |
| Test Coverage | % | 92% | 187 tests |
| Parity | Match | 100% | All algorithms |
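The FA/24h figures in the parity output above are false-alarm counts normalized to a 24-hour recording, a standard convention in seizure-detection scoring. A minimal sketch of that normalization:

```python
# FA/24h normalizes a raw false-alarm count to a 24-hour recording,
# so runs of different lengths are comparable.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def fa_per_24h(false_positives: float, total_duration_s: float) -> float:
    """False alarms per 24 hours, given total scored duration in seconds."""
    return false_positives * SECONDS_PER_DAY / total_duration_s

# e.g. 10 false alarms over 12 hours of EEG -> 20 false alarms per 24 h
assert fa_per_24h(10, 12 * 3600) == 20.0
```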
nedc-bench/
├── src/
│ ├── nedc_bench/ # Modern Beta pipeline (clean-room implementation)
│ │ ├── algorithms/ # Reimplemented scoring algorithms
│ │ ├── api/ # FastAPI application & endpoints
│ │ ├── models/ # Pydantic models for type safety
│ │ ├── orchestration/ # Dual-pipeline coordinator
│ │ └── validation/ # Parity checking framework
│ └── alpha/ # Alpha pipeline wrapper
│ └── wrapper/ # Minimal wrapper around NEDC v6.0.0
├── nedc_eeg_eval/ # Original NEDC v6.0.0 (vendored, unchanged)
│ └── v6.0.0/ # DO NOT MODIFY — reference implementation
├── scripts/ # Utility scripts for testing & validation
│ ├── compare_parity.py # Verify algorithmic equivalence
│ └── ultimate_parity_test.py # Full validation suite
├── tests/ # Comprehensive test suite
├── k8s/ # Kubernetes manifests
└── docker-compose.yml # Full stack with Redis & Prometheus
- CSV_BI: Temple's annotation format (included examples)
- XML: Alternative annotation format
- List files: Batch processing of multiple files
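For readers unfamiliar with CSV_BI, a minimal reader sketch follows. It assumes the common layout of `#`-prefixed header comments followed by a `channel,start_time,stop_time,label,confidence` table; verify against the vendored examples before relying on it:

```python
import csv
import io

# Minimal CSV_BI reader sketch (assumed layout, not an official parser):
# '#'-prefixed metadata lines, then a standard CSV table of events.
def read_csv_bi(text: str) -> list:
    """Return one dict per annotated event, keyed by the CSV header row."""
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return list(csv.DictReader(io.StringIO("\n".join(data_lines))))

sample = """# version = csv_v1.0.0
# duration = 100.0000 secs
channel,start_time,stop_time,label,confidence
TERM,0.0000,36.8868,seiz,1.0000
"""
events = read_csv_bi(sample)
assert events[0]["label"] == "seiz"
```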
- Redis caching provides >10x speedup for repeated evaluations
- Prometheus metrics for production monitoring
- WebSocket support for real-time progress updates
- Async processing for parallel execution
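How caching can deliver that speedup is sketched below (illustrative only, not the actual nedc-bench internals): identical inputs hash to the same key, so a Redis GET can short-circuit re-scoring.

```python
import hashlib

# Illustrative content-addressed cache key: the same (reference, hypothesis,
# algorithm) triple always maps to the same key, so repeated evaluations can
# be served from Redis instead of being re-scored.
def cache_key(ref_bytes: bytes, hyp_bytes: bytes, algorithm: str) -> str:
    digest = hashlib.sha256()
    for part in (ref_bytes, b"\x00", hyp_bytes, b"\x00", algorithm.encode()):
        digest.update(part)
    return f"nedc:result:{digest.hexdigest()}"

k1 = cache_key(b"ref data", b"hyp data", "taes")
k2 = cache_key(b"ref data", b"hyp data", "taes")
assert k1 == k2                                       # repeated evaluation hits the cache
assert k1 != cache_key(b"ref data", b"hyp data", "epoch")  # per-algorithm keys
```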
- 📖 Installation Guide — Detailed setup instructions
- 🚀 Quick Start Tutorial — Get running in 5 minutes
- 🔌 API Reference — Endpoints, examples, OpenAPI access
- 🐳 Deployment Guide — Production deployment
- 🔄 Migration Guide — Moving from vanilla NEDC
- 🔬 TAES Algorithm — Time-Aligned Event Scoring with multi-overlap sequencing
- 📊 Epoch Algorithm — 250ms epoch-based sampling details
- 🎯 Overlap Algorithm — Any-overlap detection
- 🔗 DP Alignment — Dynamic programming with NULL sentinel design
- 📈 IRA Algorithm — Inter-Rater Agreement (Cohen's κ)
- 🏗️ Architecture Guide — Dual-pipeline design & router pattern
- 🐛 Bug Fixes 2025 — Complete technical reference for 11 critical fixes
- ⚙️ Beta Configuration — Three-tier architecture documentation
- 🧪 Testing Guide — Comprehensive test strategy & stability solutions
- ✅ Parity Validation — 100% parity on 1832 file pairs
- 🔧 Contributing Guide — Development workflow
The Temple University Hospital (TUH) EEG Corpus is the world's largest open EEG dataset. To access:
- Request access at https://isip.piconepress.com/projects/tuh_eeg/
- Email completed form to help@nedcdata.org
- Use provided credentials for rsync access
For testing NEDC-BENCH, we include sample data in nedc_eeg_eval/v6.0.0/data/.
If you use NEDC-BENCH in your research, please cite both:
@incollection{shah2021objective,
title={Objective Evaluation Metrics for Automatic Classification of EEG Events},
author={Shah, V. and Golmohammadi, M. and Obeid, I. and Picone, J.},
booktitle={Signal Processing in Medicine and Biology},
year={2021},
publisher={Springer},
doi={10.1007/978-3-030-36844-9_1}
}

@software{nedc_bench2025,
title={NEDC-BENCH: A Modern Dual-Pipeline Platform for EEG Evaluation},
author={{Clarity Digital Twin}},
year={2025},
url={https://github.com/Clarity-Digital-Twin/nedc-bench},
note={Production wrapper and reimplementation with 100% parity validation
and modern infrastructure}
}

We welcome contributions! See CONTRIBUTING.md for guidelines.
- New code (`src/nedc_bench/`, `src/alpha/`, `tests/`): Apache 2.0
- Original NEDC (`nedc_eeg_eval/`): No explicit license; © Temple University
- 🐛 Issues
- 📚 Documentation
- 🔬 Original NEDC
NEDC-BENCH bridges neuroscience research and production systems • Built on Temple University's foundational algorithms