
@ryan-williams (Collaborator)

Summary

  • Add .github/workflows/gpu-e2e.yml for GPU-based e2e training tests
  • Add --gpu flag to tests/e2e_train.py for GPU acceleration

Uses Open-Athena/lambda-gha to spin up ephemeral Lambda Labs GPU instances.

Required Setup

Secrets:

  • LAMBDA_API_KEY
  • GH_SA_TOKEN
  • LAMBDA_SSH_PRIVATE_KEY

Variables:

  • LAMBDA_SSH_KEY_NAMES

Test plan

  • Trigger via workflow_dispatch after merge
  • Verify GPU instance spins up
  • Verify training runs with --gpu flag
  • Verify instance self-terminates

🤖 Generated with Claude Code

- `tests/e2e_train.py`: CLI for seeded training on sample data (5 epochs,
  8 channels, 2 residual blocks); verifies that val_loss matches the expected
  value (sketched after this list)
- `tests/expected_loss.txt`: expected val_loss (0.613389) for regression testing
- `tests/test_e2e_train.py`: pytest wrapper (sketched below)
- `examples/e2e_training_demo.ipynb`: notebook version with training curve plot
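For orientation, here is a minimal Python sketch of what that CLI's core could look like. The helpers `build_model` and `run_training` are hypothetical stand-ins, not names from this repo; only the RNG seeding and the comparison against `tests/expected_loss.txt` reflect what the bullets above describe.

```python
# Hypothetical sketch of tests/e2e_train.py's core; build_model and
# run_training are stand-ins, not real names from this repo.
import argparse
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    # Seed every RNG the training loop touches so val_loss is reproducible.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--epochs", type=int, default=5)
    args = parser.parse_args()

    set_seed(args.seed)
    model = build_model(channels=8, residual_blocks=2)  # hypothetical helper
    val_loss = run_training(model, epochs=args.epochs)  # hypothetical helper

    # Regression check against the committed expected value.
    with open("tests/expected_loss.txt") as f:
        expected = float(f.read())
    assert abs(val_loss - expected) < 1e-6, f"val_loss {val_loss} != {expected}"


if __name__ == "__main__":
    main()
```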

Training is fully deterministic; val_loss improves 47% over the 5 epochs.
Add notebook deps: `nbconvert`, `ipykernel`, `papermill`.
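The pytest wrapper listed above could plausibly be a thin subprocess call around the CLI, letting the CLI's own val_loss assertion decide pass/fail. This is a guess at its shape, not the actual file:

```python
# Hypothetical shape of tests/test_e2e_train.py: run the CLI end to end
# and surface its stderr if the val_loss check inside it fails.
import subprocess
import sys


def test_e2e_train():
    result = subprocess.run(
        [sys.executable, "tests/e2e_train.py"],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr
```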

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- `.github/workflows/gpu-e2e.yml`: workflow_dispatch workflow using
  Open-Athena/lambda-gha for ephemeral Lambda Labs GPU instances
- `tests/e2e_train.py`: add `--gpu` flag to enable GPU acceleration (sketched below)

The workflow spins up an A10 GPU instance by default, runs the training
test, then the instance self-terminates. It requires the LAMBDA_API_KEY,
GH_SA_TOKEN, and LAMBDA_SSH_PRIVATE_KEY secrets and the
LAMBDA_SSH_KEY_NAMES variable.
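For the `--gpu` flag, a plausible shape (assuming the training code is PyTorch; the actual wiring in `tests/e2e_train.py` may differ) is to fail loudly when CUDA is absent rather than silently falling back to CPU:

```python
# Hypothetical --gpu handling: error out if CUDA is missing instead of
# quietly training on CPU and producing a different val_loss.
import argparse

import torch

parser = argparse.ArgumentParser()
parser.add_argument("--gpu", action="store_true", help="train on a CUDA device")
args = parser.parse_args()

if args.gpu:
    if not torch.cuda.is_available():
        raise SystemExit("--gpu was passed but no CUDA device is available")
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

# The model and each batch would then be moved with .to(device).
```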

Co-Authored-By: Claude Opus 4.5 <[email protected]>