Pull request overview
This PR adds DeltaNet, a linear attention model using the delta rule for hidden state updates, to the Discretax library. DeltaNet offers an efficient alternative to softmax attention with linear complexity in sequence length. The implementation follows the paper "Parallelizing Linear Transformers with the Delta Rule over Sequence Length" (Yang et al., 2024) and includes the chunked delta rule operator for hardware-efficient computation.
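For reference, the delta rule update mentioned above can be sketched in its plain sequential form. This is a minimal single-head illustration in pure Python, not the chunked operator added in this PR; the function name `delta_rule_recurrent` and its argument layout are hypothetical and are not taken from the Discretax API. It assumes the usual formulation in which the state is updated as S_t = S_{t-1} + beta_t (v_t - S_{t-1} k_t) k_t^T and read out as o_t = S_t q_t.

```python
def delta_rule_recurrent(q, k, v, beta):
    """Sequential delta rule for one head (illustrative sketch only).

    q, k : lists of T vectors of size d_k (queries and keys)
    v    : list of T vectors of size d_v (values)
    beta : list of T scalars (per-step write strength)
    Returns a list of T output vectors of size d_v.
    """
    d_k, d_v = len(k[0]), len(v[0])
    # State S maps keys to values; shape (d_v, d_k), initialised to zero.
    S = [[0.0] * d_k for _ in range(d_v)]
    outputs = []
    for q_t, k_t, v_t, b_t in zip(q, k, v, beta):
        # Value currently associated with key k_t: v_old = S @ k_t
        v_old = [sum(S[i][j] * k_t[j] for j in range(d_k)) for i in range(d_v)]
        # Delta rule: move the stored value towards v_t by a factor beta_t,
        # i.e. S += beta_t * (v_t - v_old) k_t^T
        for i in range(d_v):
            err = b_t * (v_t[i] - v_old[i])
            for j in range(d_k):
                S[i][j] += err * k_t[j]
        # Read out with the query: o_t = S @ q_t
        outputs.append(
            [sum(S[i][j] * q_t[j] for j in range(d_k)) for i in range(d_v)]
        )
    return outputs
```

With a unit-norm key and `beta = 1`, writing a second value under the same key replaces the first one rather than accumulating on top of it, which is the behaviour that distinguishes the delta rule from additive linear attention. This loop does one state update per token; the chunked operator in this PR is meant to compute the same recurrence more efficiently on hardware.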
Changes:
- Added DeltaNet model, sequence mixer, and chunked delta rule operations
- Added einops dependency for tensor rearrangement operations
- Updated documentation to include DeltaNet in the API reference and model table
- Added comprehensive unit tests for DeltaNet components
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| uv.lock | Updated lockfile with einops 0.8.2 dependency and revision change |
| tests/test_deltanet.py | Comprehensive unit tests for delta rule ops, DeltaNet sequence mixer, and model |
| src/discretax/sequence_mixers/deltanet.py | DeltaNet sequence mixer implementation with multi-head linear attention |
| src/discretax/sequence_mixers/__init__.py | Exported DeltaNetSequenceMixer to public API |
| src/discretax/ops/delta_rule.py | Core chunked delta rule operator implementation |
| src/discretax/ops/__init__.py | Exported delta rule functions to public API |
| src/discretax/models/deltanet.py | DeltaNet model with stacked blocks |
| src/discretax/models/__init__.py | Exported DeltaNet to public API |
| pyproject.toml | Added einops>=0.8.2 dependency; the new entry is inconsistently indented |
| mkdocs.yml | Added DeltaNet documentation pages to navigation |
| docs/index.md | Updated logo size and reordered contributors |
| docs/api/sequence_mixer/deltanet.md | API documentation for DeltaNet sequence mixer |
| docs/api/models/deltanet.md | API documentation for DeltaNet model |
| README.md | Added DeltaNet to supported models table and changed logo URL |
| Makefile | Added deploy-docs and serve-docs targets |
Description
Type of Change
Changes Made
Added DeltaNet, updated docs