    Repositories list

    • CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
      Python
      01200 Updated Feb 26, 2026
    • [FPGA'26 Best Paper Nominee] CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving
      C++
      11800 Updated Feb 23, 2026
    • [ACM MM 2025 Oral] TinyServe: Query-Aware Page Allocation Optimization
      Shell
      21000 Updated Jan 18, 2026
    • SPI_VecDB

      Public
      Distributed Parallel Multi-Resolution Vector Search
      Go
      Apache License 2.0
      0900 Updated Jan 16, 2026
    • HSGM

      Public
      [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics
      Python
      MIT License
      0800 Updated Nov 23, 2025
    • CogLoad

      Public
      Cognitive Load Traces
      Python
      0100 Updated Nov 3, 2025
    • NeuroSpec

      Public
      Grammar- and Resource-Aligned Certifiable Speculative Decoding
      Python
      0000 Updated Oct 31, 2025
    • PiKV

      Public
      PiKV: KV Cache Management System for MoE [Efficient ML System]
      Python
      Other
      7400 Updated Oct 26, 2025
    • GraphSnapShot: Caching Local Structure for Fast Graph Learning [Efficient ML System]
      Python
      6200 Updated Sep 22, 2025
    • FastCache

      Public
      FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]
      Python
      Apache License 2.0
      312700 Updated Sep 22, 2025
    • SemToken

      Public
      [IWCS 2025 Oral] SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling
      Python
      0500 Updated Sep 21, 2025
    • QTM

      Public
      https://www.arxiv.org/abs/2508.13204
      Python
      3000 Updated Sep 21, 2025