Commit 0e75258

Author: SoulSniper1212
Fix #57: Make triton and flash-attn dependencies Linux-specific and update README.
1 parent 2f21442 commit 0e75258

2 files changed: +19 -11 lines changed

README.md

Lines changed: 15 additions & 7 deletions
@@ -16,32 +16,40 @@ A lightweight vLLM implementation built from scratch.
 * 📖 **Readable codebase** - Clean implementation in ~ 1,200 lines of Python code
 * ⚡ **Optimization Suite** - Prefix caching, Tensor Parallelism, Torch compilation, CUDA graph, etc.
 
+## Requirements
+
+- **OS:** Linux
+- **GPU:** NVIDIA GPU with CUDA support
+- **Python:** 3.10 - 3.12
+
+`nano-vllm` relies on `triton` and `flash-attn` for high-performance custom CUDA kernels. These packages are currently only available on Linux platforms with NVIDIA GPUs.
+
 ## Installation
 
-```bash
+
 pip install git+https://github.com/GeeeekExplorer/nano-vllm.git
-```
+
 
 ## Model Download
 
 To download the model weights manually, use the following command:
-```bash
+
 huggingface-cli download --resume-download Qwen/Qwen3-0.6B \
 --local-dir ~/huggingface/Qwen3-0.6B/ \
 --local-dir-use-symlinks False
-```
+
 
 ## Quick Start
 
 See `example.py` for usage. The API mirrors vLLM's interface with minor differences in the `LLM.generate` method:
-```python
+
 from nanovllm import LLM, SamplingParams
 llm = LLM("/YOUR/MODEL/PATH", enforce_eager=True, tensor_parallel_size=1)
 sampling_params = SamplingParams(temperature=0.6, max_tokens=256)
 prompts = ["Hello, Nano-vLLM."]
 outputs = llm.generate(prompts, sampling_params)
 outputs[0]["text"]
-```
+
 
 ## Benchmark
 
@@ -63,4 +71,4 @@ See `bench.py` for benchmark.
 
 ## Star History
 
-[![Star History Chart](https://api.star-history.com/svg?repos=GeeeekExplorer/nano-vllm&type=Date)](https://www.star-history.com/#GeeeekExplorer/nano-vllm&Date)
+[
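
Because this commit makes `triton` and `flash-attn` install only on Linux, `pip install` can succeed on other platforms while the kernel packages are absent, so the failure would move from install time to import time. A minimal sketch of a startup check that reports this clearly, assuming the import names `triton` and `flash_attn`; the helpers `missing_kernel_packages` and `describe_environment` are hypothetical and not part of nano-vllm:

```python
import importlib.util
import sys


def missing_kernel_packages(modules=("triton", "flash_attn")):
    """Return the names in `modules` that cannot be located for import.

    Hypothetical helper for illustration; not part of nano-vllm.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


def describe_environment(platform=sys.platform, modules=("triton", "flash_attn")):
    """Build a human-readable status line for the given platform and modules."""
    missing = missing_kernel_packages(modules)
    if not missing:
        return "ok: all kernel packages found"
    return f"unsupported: missing {', '.join(missing)} on platform '{platform}'"


if __name__ == "__main__":
    # Prints a status line for the interpreter running this script.
    print(describe_environment())
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays cheap and does not initialize CUDA.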

pyproject.toml

Lines changed: 4 additions & 4 deletions
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "nano-vllm"
-version = "0.2.0"
+version = "0.2.1"
 authors = [{ name = "Xingkai Yu" }]
 license = "MIT"
 license-files = ["LICENSE"]
@@ -13,15 +13,15 @@ description = "a lightweight vLLM implementation built from scratch"
 requires-python = ">=3.10,<3.13"
 dependencies = [
 "torch>=2.4.0",
-"triton>=3.0.0",
 "transformers>=4.51.0",
-"flash-attn",
 "xxhash",
+"triton>=3.0.0; sys_platform == 'linux'",
+"flash-attn; sys_platform == 'linux'",
 ]
 
 [project.urls]
 Homepage="https://github.com/GeeeekExplorer/nano-vllm"
 
 [tool.setuptools.packages.find]
 where = ["."]
-include = ["nanovllm*"]
+include = ["nanovllm*"]
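
The `sys_platform == 'linux'` suffix on the two dependencies is a PEP 508 environment marker: an installer compares it against Python's `sys.platform` and skips the requirement when it does not match. The sketch below mimics that decision for the dependency list in this diff using only the standard library. `resolve_dependencies` is a hypothetical helper that understands just this one marker form; it is not a general marker parser and not how pip is implemented.

```python
import sys

# Dependency strings as they appear in pyproject.toml after this commit.
DEPENDENCIES = [
    "torch>=2.4.0",
    "transformers>=4.51.0",
    "xxhash",
    "triton>=3.0.0; sys_platform == 'linux'",
    "flash-attn; sys_platform == 'linux'",
]


def resolve_dependencies(deps, platform=sys.platform):
    """Return the requirements that apply on `platform`.

    Hypothetical helper: understands only `sys_platform == '<value>'` markers.
    """
    selected = []
    for dep in deps:
        requirement, _, marker = dep.partition(";")
        marker = marker.strip()
        if marker:
            # Split "sys_platform == 'linux'" into its variable and value.
            variable, _, value = marker.partition("==")
            if variable.strip() != "sys_platform":
                raise ValueError(f"unsupported marker: {marker}")
            if value.strip().strip("'\"") != platform:
                continue  # Marker does not match, so skip this requirement.
        selected.append(requirement.strip())
    return selected
```

Calling `resolve_dependencies(DEPENDENCIES, "darwin")` or passing `"win32"` drops the two kernel packages, which is why installation no longer fails on macOS or Windows. The README change in the same commit still lists Linux with an NVIDIA GPU as a requirement for running the package.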
