Pinned
-
DeepSeek-V3-Architecture-Implementation
Public. Converting SmolLM2-135M to DeepSeek-V3 with MLHA and MoE.
Python
-
UNet-Segmentation
Public. A modular U-Net implementation in PyTorch built from scratch. Features a dynamic Model Factory to benchmark architectural variations (Pooling vs. Strided, Transpose vs. Upsample) and loss functions…
Python
-
smollm2-135-implementation
Public. Complete from-scratch implementation of SmolLM2-135M, reverse-engineered from the pretrained model.
Jupyter Notebook
-
FlashMoE-Serve
Public. High-performance MoE inference engine. Features fused OpenAI Triton kernels, continuous batching, and NF4 quantization. +72% throughput on RTX 3060.
Python