ML-Reading-Group

Not your typical reading group: we are not bound by a time limit, only the vibes.

Iterative Discussions

  • CUDA:
    • Programming Massively Parallel Processors (a minimal kernel in this style is sketched after this list)
    • CUDA Core Compute Libraries (Thrust, CUB, libcudacxx)
    • Multi-GPU programming, NCCL
  • CUTLASS & CuTe
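
As a taste of the programming style these texts cover, here is a minimal grid-stride vector-add kernel. It is only an illustrative sketch, not an excerpt from any of the books: the name `vecAdd`, the launch configuration, and the use of unified memory are our own choices to keep the host code short.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Grid-stride loop: each thread strides over the array, so the kernel
// handles any n regardless of the launch configuration.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
       i += gridDim.x * blockDim.x)
    c[i] = a[i] + b[i];
}

int main() {
  const int n = 1 << 20;
  float *a, *b, *c;
  // Unified memory keeps the sketch short; the texts also cover explicit
  // cudaMalloc/cudaMemcpy staging between host and device.
  cudaMallocManaged(&a, n * sizeof(float));
  cudaMallocManaged(&b, n * sizeof(float));
  cudaMallocManaged(&c, n * sizeof(float));
  for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

  vecAdd<<<256, 256>>>(a, b, c, n);
  cudaDeviceSynchronize();

  printf("c[0] = %f (expect 3.0)\n", c[0]);
  cudaFree(a); cudaFree(b); cudaFree(c);
  return 0;
}
```

The same computation is a one-liner with Thrust (`thrust::transform` with `thrust::plus<float>()`), which is one reason the CUDA Core Compute Libraries are on the list.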

Conceptual Discussions (Unordered)

  • FlashAttention (1 & 2)
  • Distributed Data Parallelism (see the all-reduce sketch after this list)
  • Tensor Parallelism
  • Pipeline Parallelism
  • Context Parallelism
  • Fully Sharded Data Parallelism
  • DeepSpeed ZeRO (stages 1, 2, and 3)
  • Sequence Parallelism: Long Sequence Training from System Perspective
  • Blockwise Parallel Transformer for Large Context Models
  • Ring Attention with Blockwise Transformers for Near-Infinite Context
  • Efficient Memory Management for Large Language Model Serving with PagedAttention
  • GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
  • PipeDream: Fast and Efficient Pipeline Parallel DNN Training
  • Zero Bubble Pipeline Parallelism
  • Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
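
Most of these schemes ultimately reduce to collective communication over gradients or activations. Below is a single-process sketch of the core of data parallelism: an NCCL all-reduce that sums per-GPU gradient buffers so every replica can take an identical optimizer step. The buffer size, device count, and names like `grads` are illustrative assumptions (it assumes two visible GPUs), not code from any of the papers above.

```cuda
#include <cuda_runtime.h>
#include <nccl.h>
#include <stdio.h>
#include <stdlib.h>

#define CHECK_CUDA(cmd) do { cudaError_t e = (cmd); if (e != cudaSuccess) { \
  printf("CUDA error: %s\n", cudaGetErrorString(e)); exit(1); } } while (0)
#define CHECK_NCCL(cmd) do { ncclResult_t r = (cmd); if (r != ncclSuccess) { \
  printf("NCCL error: %s\n", ncclGetErrorString(r)); exit(1); } } while (0)

int main() {
  const int nDev = 2;            // assumes 2 visible GPUs
  const size_t count = 1 << 20;  // elements per per-replica gradient buffer
  int devs[nDev] = {0, 1};

  ncclComm_t comms[nDev];
  float* grads[nDev];
  cudaStream_t streams[nDev];

  // One buffer and stream per device, standing in for each replica's
  // locally computed gradients.
  for (int i = 0; i < nDev; ++i) {
    CHECK_CUDA(cudaSetDevice(devs[i]));
    CHECK_CUDA(cudaMalloc(&grads[i], count * sizeof(float)));
    CHECK_CUDA(cudaMemset(grads[i], 0, count * sizeof(float)));
    CHECK_CUDA(cudaStreamCreate(&streams[i]));
  }

  CHECK_NCCL(ncclCommInitAll(comms, nDev, devs));

  // Sum gradients in place across all replicas: afterwards every GPU holds
  // the same reduced buffer, ready for an identical optimizer step.
  CHECK_NCCL(ncclGroupStart());
  for (int i = 0; i < nDev; ++i)
    CHECK_NCCL(ncclAllReduce(grads[i], grads[i], count, ncclFloat, ncclSum,
                             comms[i], streams[i]));
  CHECK_NCCL(ncclGroupEnd());

  for (int i = 0; i < nDev; ++i) {
    CHECK_CUDA(cudaSetDevice(devs[i]));
    CHECK_CUDA(cudaStreamSynchronize(streams[i]));
  }

  for (int i = 0; i < nDev; ++i) {
    CHECK_NCCL(ncclCommDestroy(comms[i]));
    CHECK_CUDA(cudaFree(grads[i]));
  }
  return 0;
}
```

ZeRO, FSDP, and tensor parallelism differ mainly in which tensors are sharded and which collective (all-reduce, reduce-scatter, all-gather) runs where, which is a useful lens for reading the papers above.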
