  <li>
    <a href="#Parallel-Training-Demo">Parallel Training Demo</a>
    <ul>
-     <li><a href="#LLaMA2">LLaMA 1/2</a></li>
+     <li><a href="#LLaMA3">LLaMA 1/2/3</a></li>
      <li><a href="#MoE">MoE</a></li>
      <li><a href="#GPT-3">GPT-3</a></li>
      <li><a href="#GPT-2">GPT-2</a></li>
@@ -270,13 +270,21 @@ Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)
<p align="right">(<a href="#top">back to top</a>)</p>

## Parallel Training Demo
+### LLaMA3
+<p align="center">
+<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/examples/images/LLaMA3-70B-H100.png" width=600/>
+</p>
+
+- 70 billion parameter LLaMA3 model training accelerated by 18%
+[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/llama)
+
### LLaMA2
<p align="center">
<img src="https://raw.githubusercontent.com/hpcaitech/public_assets/main/colossalai/img/llama2_pretraining.png" width=600/>
</p>

- 70 billion parameter LLaMA2 model training accelerated by 195%
-[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/llama2)
+[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/llama)
[[blog]](https://www.hpc-ai.tech/blog/70b-llama2-training)

### LLaMA1
@@ -285,7 +293,7 @@ Acceleration of [AlphaFold Protein Structure](https://alphafold.ebi.ac.uk/)
</p>

- 65-billion-parameter large model pretraining accelerated by 38%
-[[code]](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama)
+[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/llama)
[[blog]](https://www.hpc-ai.tech/blog/large-model-pretraining)

### MoE