    Repositories list

    • vllm-omni

      Public
      A high-throughput and memory efficient inference and serving engine for Omni-modality models
      Python
      Apache License 2.0
      462000 Updated Feb 26, 2026
    • vllm

      Public
      vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      14k9414 Updated Feb 26, 2026
    • aiter

      Public
      AI Tensor Engine for ROCm
      Python
      MIT License
      218000 Updated Feb 26, 2026
    • litellm

      Public
      Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Re…
      Python
      Other
      6k001 Updated Feb 26, 2026
    • skypilot

      Public
      SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
      Python
      Apache License 2.0
      967000 Updated Feb 25, 2026
    • Python
      Apache License 2.0
      0005 Updated Feb 24, 2026
    • JamAIBase

      Public
      The collaborative spreadsheet for AI. Chain cells into powerful pipelines, experiment with prompts and models, and evaluate LLM responses in real-time. Work tog…
      Python
      Apache License 2.0
      391.1k11 Updated Feb 23, 2026
    • vllmtests

      Public
      This is a repository containing the tools for testing vLLM correctness and perf regression
      Python
      Apache License 2.0
      2200 Updated Jan 15, 2026
    • A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      14k100 Updated Jan 9, 2026
    • Typescript Documentation of JamAISDK
      HTML
      0000 Updated Jan 8, 2026
    • lmms-eval

      Public
      One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
      Python
      Other
      523000 Updated Jan 6, 2026
    • High-performance safetensors model loader
      Python
      Apache License 2.0
      19002 Updated Dec 29, 2025
    • Python
      Apache License 2.0
      0005 Updated Dec 8, 2025
    • Collect the scripts and results of all reasoning experiments.
      Python
      Apache License 2.0
      1000 Updated Dec 7, 2025
    • recipes

      Public
      Common recipes to run vLLM
      Jupyter Notebook
      Apache License 2.0
      155000 Updated Nov 25, 2025
    • vllm-rocm

      Public
      Python
      Apache License 2.0
      0200 Updated Nov 21, 2025
    • HTML
      70000 Updated Oct 3, 2025
    • Python
      14000 Updated Sep 25, 2025
    • LMCache

      Public
      ROCm support of Ultra-Fast and Cheaper Long-Context LLM Inference
      Python
      Apache License 2.0
      910000 Updated Jul 15, 2025
    • roxl

      Public
      NVIDIA Inference Xfer Library (NIXL)
      C++
      Apache License 2.0
      248000 Updated Jun 6, 2025
    • This is a repository to monitor the fast changing ROCm/aiter repository to alert user that AITER function of interests e.g. in vLLM, in SGLang has been updated …
      Python
      Apache License 2.0
      00390 Updated Apr 27, 2025
    • vLLM Workshop Content
      Apache License 2.0
      0200 Updated Apr 3, 2025
    • Jupyter Notebook
      5000 Updated Mar 20, 2025
    • Python
      Apache License 2.0
      1000 Updated Feb 24, 2025
    • The driver for LMCache core to run in vLLM
      Python
      Apache License 2.0
      32000 Updated Jan 24, 2025
    • Python
      8000 Updated Jan 23, 2025
    • Python
      Apache License 2.0
      369000 Updated Jan 22, 2025
    • kvpress

      Public
      LLM KV cache compression made easy
      Python
      Apache License 2.0
      113100 Updated Jan 21, 2025
    • Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
      C++
      Other
      272000 Updated Dec 20, 2024
    • Mooncake

      Public
      Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
      C++
      Apache License 2.0
      567000 Updated Dec 16, 2024