FastLM

We develop efficient LMs for large-scale, distributed, parallel, and sparse scenarios.

Popular repositories

  1. CXL-SpecKV Public

    [FPGA'26 Highlight] CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving

    C++ 22 2

  2. CSV-Decode Public

    CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference

    Python 12

  3. tinyserve-vllm Public

    [ACM MM 2025 Oral] TinyServe: Query-Aware Page Allocation Optimization

    Shell 10 2

  4. SPI_VecDB Public

    Distributed Parallel Multi-Resolution Vector Search

    Go 10

  5. HSGM Public

    [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

    Python 8

  6. MKA Public

    [ACM CF'26 Oral] MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning

    Python 6 1

Repositories

Showing 8 of 8 repositories
  • MKA Public

    [ACM CF'26 Oral] MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning

    Python 6 1 0 0 Updated Mar 31, 2026
  • CSV-Decode Public

    CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference

    Python 12 0 0 0 Updated Feb 26, 2026
  • CXL-SpecKV Public

    [FPGA'26 Highlight] CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving

    C++ 22 2 0 0 Updated Feb 23, 2026
  • tinyserve-vllm Public

    [ACM MM 2025 Oral] TinyServe: Query-Aware Page Allocation Optimization

    Shell 10 2 0 0 Updated Jan 18, 2026
  • SPI_VecDB Public

    Distributed Parallel Multi-Resolution Vector Search

    Go 10 Apache-2.0 0 0 0 Updated Jan 16, 2026
  • HSGM Public

    [ICPADS 2025 Oral, *SEM 2025 Oral] HSGM: Hierarchical Segment-Graph Memory for Scalable Long-Text Semantics

    Python 8 MIT 0 0 0 Updated Nov 23, 2025
  • SemToken Public

    [IWCS 2025 Oral] SemToken: Semantic-Aware Tokenization for Efficient Long-Context Language Modeling

    Python 5 0 0 0 Updated Sep 21, 2025
  • QTM
    Python 0 3 0 0 Updated Sep 21, 2025
