Pinned

- MinivLLM (Python; forked from Wenyueh/MinivLLM): Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementations.
- vllm (Python; forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs.
- vllm-omni (Python; forked from vllm-project/vllm-omni): A framework for efficient model inference with omni-modality models.
- vllm-playground (JavaScript; forked from micytao/vllm-playground): A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes, with special optimizations for macOS Apple Silicon and ent…