    Repositories list

    • Libra

      Public
      [ICLR 2026] Libra: Effective yet Efficient Load Balancing for Large-Scale MoE Inference
      0000 Updated Mar 3, 2026
    • Apache License 2.0
      0000 Updated Feb 28, 2026
    • Python
      Apache License 2.0
      0000 Updated Feb 28, 2026
    • GS-Scale

      Public
      [ASPLOS '26] Fast, memory efficient, and scalable 3D Gaussian Splatting training framework
      Cuda
      MIT License
      11600 Updated Feb 9, 2026
    • DecDEC

      Public
      [OSDI 2025] DecDEC: A Systems Approach to Advancing Low‑Bit LLM Quantization
      Python
      32200 Updated Jan 29, 2026
    • flashTP

      Public
      Torch-native C++/CUDA library to accelerate tensor-product layers in MLIPs
      Cuda
      MIT License
      45510 Updated Nov 26, 2025
    • NestedFP

      Public
      [NeurIPS 2025] NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs
      HTML
      0700 Updated Nov 21, 2025
    • DP-LLM

      Public
      [NeurIPS 2025] DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
      Python
      MIT License
      7700 Updated Oct 24, 2025
    • Pre-finetuned results for DP-LLM.
      MIT License
      0000 Updated Oct 23, 2025
    • FastPoint

      Public
      [ICCV 2025] FastPoint: Accelerating 3D Point Cloud Model Inference via Sample Point Distance Prediction
      Python
      11900 Updated Sep 18, 2025
    • [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
      Python
      MIT License
      712220 Updated Jul 4, 2025
    • ADA-NNS

      Public
      Python
      0500 Updated Apr 4, 2025
    • C++
      0200 Updated Feb 25, 2025
    • gem5

      Public
      forked from https://github.com/gem5/gem5
      C++
      BSD 3-Clause "New" or "Revised" License
      1.7k000 Updated Jan 14, 2025
    • Ginex

      Public
      Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching
      Python
      84121 Updated Jul 10, 2024
    • [ECCV 2024] Frugal 3D Point Cloud Model Training via Progressive Near Point Filtering and Fused Aggregation
      Python
      0400 Updated Jul 4, 2024
    • C
      08120 Updated Jun 16, 2024
    • C
      08230 Updated May 27, 2024
    • C
      07380 Updated Apr 28, 2024
    • C++
      BSD 3-Clause "New" or "Revised" License
      7000 Updated Apr 22, 2024
    • C
      36300 Updated Apr 1, 2024
    • C
      3000 Updated Mar 9, 2024
    • C
      2320 Updated Mar 7, 2024
    • KVRouter

      Public
      C++
      0000 Updated Mar 1, 2024
    • Makefile
      BSD 2-Clause "Simplified" License
      10000 Updated Feb 24, 2024
    • C++
      Other
      0000 Updated Feb 7, 2024
    • C++
      Other
      0000 Updated Feb 7, 2024
    • C++
      3000 Updated Oct 19, 2023
    • FusedMM

      Public
      Implementation of FusedMM method for IPDPS 2021 paper titled "FusedMM: A Unified SDDMM-SpMM Kernel for Graph Embedding and Graph Neural Networks"
      C
      5000 Updated Oct 5, 2023
    • Python
      0200 Updated Jun 27, 2023