![Mamba](assets/selection.png "Selective State Space")
> **Mamba: Linear-Time Sequence Modeling with Selective State Spaces**\
> Albert Gu*, Tri Dao*\
> Paper: https://arxiv.org/abs/2312.00752

![Mamba-2](assets/ssd_algorithm.png "State Space Dual Model")
> **Transformers are SSMs: Generalized Models and Efficient Algorithms**\
> **Through Structured State Space Duality**\
> Tri Dao*, Albert Gu*\
> Paper: https://arxiv.org/abs/2405.21060

### Mamba Block

The main Mamba block is implemented at [modules/mamba_simple.py](mamba_ssm/modules/mamba_simple.py).

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim, # Model dimension d_model
    d_state=16,  # SSM state expansion factor
    d_conv=4,    # Local convolution width
    expand=2,    # Block expansion factor
).to("cuda")
y = model(x)
assert y.shape == x.shape
```
### Mamba-2

The Mamba-2 block is implemented at [modules/mamba2.py](mamba_ssm/modules/mamba2.py).

A simpler version is at [modules/mamba2_simple.py](mamba_ssm/modules/mamba2_simple.py).
The usage is similar to the Mamba(-1) block:

```python
import torch
from mamba_ssm import Mamba2

batch, length, dim = 2, 64, 256
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba2(
    # This module uses roughly 3 * expand * d_model^2 parameters
    d_model=dim, # Model dimension d_model
    d_state=64,  # SSM state expansion factor, typically 64 or 128
    d_conv=4,    # Local convolution width
    expand=2,    # Block expansion factor
).to("cuda")
y = model(x)
assert y.shape == x.shape
```
#### SSD

A minimal version of the inner SSD module (Listing 1 from the Mamba-2 paper), with conversion between the "discrete" and "continuous" SSM versions, is at [modules/ssd_minimal.py](mamba_ssm/modules/ssd_minimal.py).
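
For orientation, the SSD layer restricts the selective SSM's state transition to a scalar multiple of the identity, so each head evolves as `h_t = a_t * h_{t-1} + B_t * x_t` with readout `y_t = C_t^T h_t`. Below is a minimal sketch of that discrete recurrence, written as a naive sequential loop rather than the blocked SSD algorithm from the paper; the function and variable names are illustrative, not part of the package API:

```python
import torch

def ssd_naive(x, a, B, C):
    """Naive sequential reference for the discrete SSD recurrence
    (scalar-identity state transition), for a single head:

        h_t = a_t * h_{t-1} + B_t * x_t
        y_t = C_t^T h_t

    Shapes:
        x: (batch, length)           input sequence
        a: (batch, length)           scalar decay per time step
        B: (batch, length, d_state)  input projection per time step
        C: (batch, length, d_state)  output projection per time step
    """
    batch, length, d_state = B.shape
    h = x.new_zeros(batch, d_state)  # initial state h_0 = 0
    ys = []
    for t in range(length):
        h = a[:, t, None] * h + B[:, t] * x[:, t, None]  # state update
        ys.append((C[:, t] * h).sum(dim=-1))             # readout y_t
    return torch.stack(ys, dim=1)  # (batch, length)

# Tiny smoke test with random data.
batch, length, d_state = 2, 8, 4
y = ssd_naive(
    torch.randn(batch, length),
    torch.rand(batch, length),  # decays in (0, 1)
    torch.randn(batch, length, d_state),
    torch.randn(batch, length, d_state),
)
assert y.shape == (batch, length)
```
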
### Mamba Language Model

Finally, we provide an example of a complete language model: a deep sequence model backbone (with repeating Mamba blocks) + language model head.
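
A minimal sketch of constructing and calling this language model, assuming the `MambaConfig` and `MambaLMHeadModel` classes under `mamba_ssm.models` and a CUDA device; the sizes here are illustrative, not a recommended configuration:

```python
import torch
from mamba_ssm.models.config_mamba import MambaConfig
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Small illustrative configuration; real checkpoints are much larger.
config = MambaConfig(d_model=256, n_layer=4, vocab_size=50277)
model = MambaLMHeadModel(config).to("cuda")

input_ids = torch.randint(0, config.vocab_size, (1, 128), device="cuda")
logits = model(input_ids).logits
# vocab_size is padded internally (to a multiple of 8 by default), so the
# last dimension of logits can be slightly larger than config.vocab_size.
print(logits.shape)  # (batch, length, padded_vocab_size)

# Pretrained checkpoints can be loaded similarly, e.g.:
# model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-130m", device="cuda")
```
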
## Citation

If you use this codebase, or otherwise find our work valuable, please cite Mamba:

```
@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}

@inproceedings{mamba2,
  title={Transformers are {SSM}s: Generalized Models and Efficient Algorithms Through Structured State Space Duality},
  author={Dao, Tri and Gu, Albert},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2024}
}
```