Commit 3fcabcd

feat: benchmarking framework, Docker infrastructure, and documentation overhaul

Benchmarking application layer (benchmarks/):
- Add competitive benchmark suite with 12 framework adapters (Grain, tf.data, PyTorch DataLoader, DALI, Ray Data, SPDL, MosaicML, WebDataset, HF Datasets, LitData, Deep Lake, jax-dataloader)
- Add scenario-driven runner with TOML config profiles (cpu, gpu_a100, tpu_v5e)
- Add datarax-bench CLI with Click (run/export/compare subcommands)
- Add analysis modules: gap detection, stability validation, comparison reports
- Add visualization: throughput bars, radar, latency CDF, memory waterfall, scaling curves, chain depth, feature heatmap
- Add W&B export with benchkit adapter, raw results artifact persistence
- Add pre-import fixups (_preload.py) for Deep Lake/TF/JAX/Ray ordering

Benchmarking engine (src/datarax/benchmarking/):
- Add resource_monitor, results, statistics, timing modules
- Refactor profiler, comparative, regression, monitor modules
- Remove deprecated pipeline_throughput module

Cloud infrastructure:
- Add SkyPilot configs (cpu/gpu/tpu) with datarax-bench CLI integration, W&B export, PYTHONPATH fix for console scripts, Ray Data exclusion
- Add .dockerignore to reduce build context from ~15GB to <500MB
- Update Dockerfile: switch to runtime CUDA image, two-layer dep caching, CMD instead of ENTRYPOINT, uv binary COPY
- Add benchmark Dockerfiles (cpu/gpu/tpu) in benchmarks/docker/
- Add CI workflows: benchmark-gate (PR) and benchmark-nightly

Tools:
- Add benchkit package (tools/benchkit/) for benchmark data management: store, exporters (W&B, JSON, HTML), metric definitions, analysis

Source and operator improvements:
- Add eager source ops, index_shuffle (Feistel cipher O(1) shuffling)
- Add image validation utilities
- Refactor source modules (memory, HF, TFDS, mixed, array_record)
- Update operator strategies (sequential, parallel, branching, ensemble, merging)

Documentation:
- Restructure docs/ with updated API reference, benchmarking guides, contributing guides, source documentation
- Add Docker documentation (docs/contributing/docker.md)
- Add benchmark results, resource monitor, statistics, timing docs
- Update all examples and notebooks for current API
- Update mkdocs.yml navigation

Script consolidation:
- Remove 8 redundant scripts (run_benchmarks, run_gpu_tests, run_lint, etc.)
- Add generate_baselines.py, run_full_benchmark.sh, verify_docs.py
- Move vertex_config.yaml.template to scripts/

Tests:
- Add benchmark test suite (P0-P5 priority levels, CLI, export, performance)
- Add benchmarking engine tests (profiler, resource monitor, results, etc.)
- Add source tests (eager ops, mixed source)
- Update existing test fixtures and utilities

1 parent f82894d commit 3fcabcd
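
The commit message mentions `index_shuffle` as "Feistel cipher O(1) shuffling", but the implementation is not visible in this part of the diff. The following is only a minimal sketch of the general technique, not the code from this commit: a balanced Feistel network over a power-of-two domain, combined with cycle-walking to restrict it to `[0, n)`. The names `feistel_index` and `_round_fn`, the BLAKE2b round function, and the round count are all assumptions made for illustration.

```python
import hashlib


def _round_fn(value: int, seed: int, round_no: int, mask: int) -> int:
    """Hypothetical keyed round function: hash (seed, round, value) to half-width bits."""
    data = f"{seed}:{round_no}:{value}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") & mask


def feistel_index(index: int, n: int, seed: int, rounds: int = 4) -> int:
    """Map index -> shuffled index in [0, n) without materializing a permutation."""
    if not 0 <= index < n:
        raise IndexError(f"index {index} out of range for size {n}")
    # Split the smallest even-width bit domain covering n into two halves.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1
    x = index
    while True:
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            # One Feistel round; invertible for any round function.
            left, right = right, left ^ _round_fn(right, seed, r, mask)
        x = (left << half_bits) | right
        # Cycle-walk: re-encrypt until the value lands back inside [0, n).
        if x < n:
            return x


if __name__ == "__main__":
    order = [feistel_index(i, 10, seed=42) for i in range(10)]
    print(order)  # some permutation of 0..9, fixed for a given seed
```

Because each Feistel round is a bijection on the padded domain, cycle-walking yields a bijection on `[0, n)`. Any element's shuffled position can therefore be computed on demand with constant memory, which is presumably what "O(1)" refers to in the commit message.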

493 files changed, +52738 -7308 lines changed

.dockerignore

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
+# =============================================================================
+# Docker Build Context Ignore
+# =============================================================================
+# Based on .gitignore with Docker-specific additions.
+# Without this file, docker build sends .venv (5-15GB) + .git (~500MB) into
+# the build context. With it, context should be <500MB.
+
+# --- Docker-specific (not in .gitignore) ---
+.git/
+.venv*
+tools/benchkit/.venv/
+
+# --- Byte-compiled / optimized ---
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+
+# --- Distribution / packaging ---
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# --- Installer logs ---
+pip-log.txt
+pip-delete-this-directory.txt
+
+# --- Unit test / coverage reports ---
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# --- Caches ---
+.mypy_cache/
+.ruff_cache/
+.dmypy.json
+dmypy.json
+.pybuilder/
+.pytype/
+cython_debug/
+
+# --- Environments ---
+.env
+.env.*
+.env.cloud
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# --- Documentation build artifacts ---
+docs/_build/
+/site
+
+# --- Non-runtime project files ---
+design_docs/
+memory-bank/
+sandbox/
+
+# --- AI assistant / agent traces ---
+CLAUDE.md
+.claude/
+.claude-collective/
+.cursor/
+.cursorignore
+.agent/
+.taskmaster/
+.deprecated/
+
+# --- Temp / logs ---
+temp/
+tmp/
+*.tmp
+*.tmp.*
+*.log
+logs/
+
+# --- Benchmark data and results ---
+benchmark-data/
+benchmark_results/
+.benchmarks/
+.benchmarks-results/
+
+# --- W&B ---
+wandb/
+.wandb*
+
+# --- Secrets ---
+secrets.sh
+**/secrets.sh
+vertex_config.yaml
+
+# --- Orbax checkpoint artifacts ---
+example_checkpoints/
+*.orbax-checkpoint-tmp/
+*.orbax-checkpoint-tmp-*/
+<MagicMock*>.orbax-checkpoint-tmp/
+*MagicMock*.orbax-checkpoint-tmp/
+MagicMock/**
+_CHECKPOINT_METADATA
+_strings.json
+manifest.ocdbt
+ocdbt.process_*/
+array_metadatas/
+
+# --- Test artifacts ---
+test_debug*.py
+test_*.tmp
+**/test_checkpoint*/
+**/test_cache*/
+**/tests/tmp*/
+**/tests/temp*/
+tests/data/
+
+# --- JAX/XLA caches ---
+.cache/jax/
+.cache/xla/
+
+# --- IDE ---
+.ipynb_checkpoints
+.ropeproject
+.spyderproject
+.spyproject
+profile_default/
+
+# --- Cloud config ---
+.pdm.toml
+.pdm-python
+.pdm-build/
+
+# --- Misc ---
+*.manifest
+*.spec
+*.mo
+*.pot
+db.sqlite3
+db.sqlite3-journal
+.pypirc

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+name: Performance Gate
+
+on:
+  pull_request:
+    paths:
+      - 'src/datarax/**'
+      - 'benchmarks/**'
+      - 'pyproject.toml'
+
+jobs:
+  benchmark-tier1:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    env:
+      XLA_FLAGS: "--xla_force_host_platform_device_count=4"
+      JAX_PLATFORMS: "cpu"
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: astral-sh/setup-uv@v4
+        with:
+          version: "latest"
+
+      - name: Install dependencies
+        run: uv sync --all-extras
+
+      - name: Install benchkit
+        run: uv pip install -e tools/benchkit
+
+      - name: Run Tier 1 Benchmark Gate
+        run: uv run python -m benchmarks.runners.ci_runner --repetitions 3
+
+      - name: Regression check (benchkit)
+        run: uv run benchkit check --data benchmark-data/ --threshold 0.05
+        continue-on-error: true  # Non-blocking until baseline is established
+
+      - uses: actions/upload-artifact@v4
+        if: always()
+        with:
+          name: benchmark-results
+          path: benchmark-data/
+          retention-days: 30
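
The gate workflow above passes `--threshold 0.05` to `benchkit check`, but the diff shown here does not define what the threshold measures. As a rough illustration only, the sketch below assumes it is a maximum tolerated relative drop in a higher-is-better metric such as throughput. The `Regression` dataclass and `find_regressions` function are hypothetical names and do not describe benchkit's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regression:
    name: str
    baseline: float
    current: float
    relative_drop: float


def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.05,
) -> list[Regression]:
    """Flag metrics whose value fell by more than `threshold` relative to baseline.

    Assumes higher is better (e.g. samples/sec). Metrics missing from the
    current run, or with a non-positive baseline, are skipped.
    """
    regressions = []
    for name, base in baseline.items():
        if name not in current or base <= 0:
            continue
        drop = (base - current[name]) / base
        if drop > threshold:
            regressions.append(Regression(name, base, current[name], drop))
    return regressions


if __name__ == "__main__":
    base = {"small_batch": 100.0, "large_batch": 200.0}
    now = {"small_batch": 97.0, "large_batch": 180.0}
    for reg in find_regressions(base, now):
        print(f"{reg.name}: {reg.relative_drop:.1%} below baseline")
```

Under this assumption, a 3% drop passes and a 10% drop is flagged. While `continue-on-error: true` is set on the step, a flagged regression would only be reported and would not fail the pull request.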
Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+name: Nightly Benchmarks
+
+on:
+  schedule:
+    - cron: "0 2 * * *"  # 2 AM UTC daily
+  workflow_dispatch:
+    inputs:
+      platform:
+        description: "Target platform"
+        default: "cpu"
+        type: choice
+        options:
+          - cpu
+          - gpu
+          - tpu
+
+jobs:
+  cpu-benchmarks:
+    runs-on: ubuntu-latest
+    timeout-minutes: 60
+    env:
+      XLA_FLAGS: "--xla_force_host_platform_device_count=4"
+      JAX_PLATFORMS: "cpu"
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: astral-sh/setup-uv@v4
+
+      - name: Install dependencies
+        run: uv sync --all-extras
+
+      - name: Install benchkit
+        run: uv pip install -e tools/benchkit[wandb]
+
+      - name: Run benchmarks and export to W&B
+        env:
+          WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
+        run: >
+          uv run datarax-bench run
+          --platform cpu
+          --repetitions 3
+          --wandb
+          --charts
+
+      - name: Upload results
+        uses: actions/upload-artifact@v4
+        with:
+          name: nightly-cpu-results
+          path: benchmark-data/
+          retention-days: 90
+
+  # GPU benchmarks — requires self-hosted runner or SkyPilot
+  # Uncomment when cloud credits are available (see Section 6.4.5)
+  # gpu-benchmarks:
+  #   runs-on: [self-hosted, gpu, a100]
+  #   timeout-minutes: 120
+  #   steps:
+  #     - uses: actions/checkout@v4
+  #     - uses: astral-sh/setup-uv@v4
+  #     - run: uv sync --all-extras
+  #     - run: uv pip install -e tools/benchkit[wandb]
+  #     - name: Run GPU benchmarks and export
+  #       env:
+  #         WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
+  #       run: >
+  #         uv run datarax-bench run
+  #         --platform gpu
+  #         --profile gpu_a100
+  #         --repetitions 3
+  #         --wandb
+  #         --charts
+  #     - uses: actions/upload-artifact@v4
+  #       with:
+  #         name: nightly-gpu-results
+  #         path: benchmark-data/
+  #         retention-days: 90
+
+  # TPU benchmarks — requires TRC access or SkyPilot
+  # tpu-benchmarks:
+  #   runs-on: [self-hosted, tpu]
+  #   timeout-minutes: 120
+  #   steps:
+  #     - uses: actions/checkout@v4
+  #     - uses: astral-sh/setup-uv@v4
+  #     - run: uv sync --all-extras
+  #     - run: uv pip install -e tools/benchkit[wandb]
+  #     - name: Run TPU benchmarks and export
+  #       env:
+  #         WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
+  #       run: >
+  #         uv run datarax-bench run
+  #         --platform tpu
+  #         --profile tpu_v5e
+  #         --repetitions 3
+  #         --wandb
+  #         --charts
+  #     - uses: actions/upload-artifact@v4
+  #       with:
+  #         name: nightly-tpu-results
+  #         path: benchmark-data/
+  #         retention-days: 90

.github/workflows/build-verification.yml

Lines changed: 1 addition & 2 deletions
@@ -44,8 +44,7 @@ jobs:
       - name: Install build dependencies (macOS)
         if: runner.os == 'macOS'
         run: |
-          # Install without CUDA dependencies for macOS
-          uv pip install -e ".[all-cpu]"
+          uv pip install -e ".[dev,data]"
           uv pip install types-requests types-setuptools
 
       - name: Setup PYTHONPATH for tests
