Commit 83ac8ff

Update README.md
1 parent 831e541 commit 83ac8ff

File tree: 1 file changed (+5 −2 lines)


README.md

Lines changed: 5 additions & 2 deletions
@@ -19,8 +19,11 @@ Official source code of HELM, a family of fully **H**yp**E**rbolic Large **L**an
 <p align="center">
 <img src="figure/MiCE.jpg" width="400" height="300"/>
 </p>
+
 The Mixture of Curvature Experts (MiCE) module is a hyperbolic MoE module that lets each expert operate on a distinct curvature space, so the experts can collectively learn more fine-grained geometric structure in the token distributions. The routing is also specifically designed to reflect the geometric structure of each space. Please see our [paper](https://arxiv.org/abs/2505.24722) for technical details.
 
+
+
 ### Hyperbolic Multi-Head Latent Attention (HMLA)
 <p align="center">
 <img src="figure/HMLA.jpg" width="400" height="300"/>
@@ -49,7 +52,7 @@ To train the models with the default config, please first prepare the dataset fo
 python3 helm/utils/prep_data.py
 ```
 
-Then the models can be trained using the scripts found in the [example folder](./example). For example, to train the 120M-parameter HELM-MiCE model as in the paper, please run
+Then the models can be trained using the scripts found in the [example folder](./example). For example, to train the 120M-parameter HELM-MiCE model as in the paper, run:
 
 ```bash
 bash example/train_mice_120M.sh
@@ -174,4 +177,4 @@ You can find the full details regarding the model and modules in the paper [here
   journal={arXiv preprint arXiv:2505.24722},
   year={2025},
 }
-```
+```
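The MiCE idea described in the diff above — a mixture-of-experts layer where each expert acts on a ball of its own curvature — can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the class name `MiCESketch`, the plain softmax router (the paper's router is geometry-aware), and the exp/log-map expert wrapping are all illustrative simplifications.

```python
import numpy as np

def expmap0(v, c):
    # Exponential map at the origin of a Poincare ball with curvature -c:
    # lifts a Euclidean tangent vector onto the ball.
    sq = np.sqrt(c)
    norm = np.linalg.norm(v, axis=-1, keepdims=True) + 1e-9
    return np.tanh(sq * norm) * v / (sq * norm)

def logmap0(x, c):
    # Inverse map: brings a point on the ball back to the tangent space.
    sq = np.sqrt(c)
    norm = np.linalg.norm(x, axis=-1, keepdims=True) + 1e-9
    return np.arctanh(np.clip(sq * norm, 0.0, 1.0 - 1e-7)) * x / (sq * norm)

class MiCESketch:
    """Toy mixture of curvature experts: each expert works on its own ball."""
    def __init__(self, dim, curvatures, seed=0):
        rng = np.random.default_rng(seed)
        self.curvatures = curvatures  # one (distinct) curvature per expert
        self.router = rng.standard_normal((dim, len(curvatures)))
        self.experts = [0.1 * rng.standard_normal((dim, dim)) for _ in curvatures]

    def __call__(self, h):
        # Softmax routing over experts; stands in for the geometry-aware router.
        logits = h @ self.router
        gates = np.exp(logits - logits.max(-1, keepdims=True))
        gates /= gates.sum(-1, keepdims=True)
        out = np.zeros_like(h)
        for k, (c, w) in enumerate(zip(self.curvatures, self.experts)):
            ball_out = expmap0(h @ w, c)  # expert output on its own ball
            # Aggregate expert outputs in the shared tangent space at the origin.
            out += gates[:, k:k + 1] * logmap0(ball_out, c)
        return out
```

A call like `MiCESketch(dim=8, curvatures=[0.5, 1.0, 2.0])(h)` mixes three experts, each transforming the tokens through a ball of different curvature, which is the "fine-grained geometric structure" intuition from the paragraph above.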
