Commit 70cfcbd

alessiodevoto authored and maxjeblick committed
Migration to uv (#108)
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>
1 parent ccc9d96 commit 70cfcbd

13 files changed: 126 additions and 79 deletions

.flake8

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 [flake8]
-exclude = .venv,venv,.git,__pycache__,build,dist, .mypy_cache
+exclude = .venv,venv,.git,__pycache__,build,dist,.mypy_cache,.pytest_cache
 max-line-length = 120
 per-file-ignores =
     __init__.py:F401

.github/workflows/python-publish.yml

Lines changed: 7 additions & 6 deletions
@@ -18,12 +18,13 @@ jobs:
       uses: actions/setup-python@v3
       with:
         python-version: 3.10.11
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install build
-    - name: Build package
-      run: python -m build
+
+    - name: Install uv
+      uses: astral-sh/setup-uv@v6
+
+    - name: Build
+      run: uv build --no-sources
+
     - name: Publish package
       uses: pypa/gh-action-pypi-publish@release/v1
       with:

.github/workflows/style.yml

Lines changed: 5 additions & 5 deletions
@@ -15,13 +15,13 @@ jobs:
       with:
         python-version: 3.10.11

-    - name: Install Poetry
-      run: |
-        curl -sSL https://install.python-poetry.org | python3 -
-        echo "$HOME/.local/bin" >> $GITHUB_PATH # Add Poetry to the PATH
+    - name: Install uv
+      uses: astral-sh/setup-uv@v6
+      with:
+        enable-cache: true

     - name: Install dependencies
-      run: poetry install --with dev
+      run: uv sync --all-groups

     - name: Run style checks
       run: make style

.github/workflows/test.yml

Lines changed: 6 additions & 6 deletions
@@ -2,7 +2,7 @@ name: Test

 on:
   push:
-    branches: [ main ]
+    branches: [ main ]
   pull_request:

 jobs:
@@ -15,12 +15,12 @@ jobs:
       with:
         python-version: 3.10.11

-    - name: Install Poetry
-      run: |
-        curl -sSL https://install.python-poetry.org | python3 -
-        echo "$HOME/.local/bin" >> $GITHUB_PATH # Add Poetry to the PATH
+    - name: Install uv
+      uses: astral-sh/setup-uv@v6
+      with:
+        enable-cache: true

     - name: Install dependencies
-      run: poetry install --with dev
+      run: uv sync --all-groups

     - run: make test

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ dev_notebooks/
 results/
 reports/
 .DS_Store
-poetry.lock
+uv.lock
 *.parquet
 # Byte-compiled / optimized / DLL files
 __pycache__/

Makefile

Lines changed: 6 additions & 6 deletions
@@ -1,17 +1,17 @@
 SHELL := /bin/bash
-POETRY ?= $(shell which poetry)
+UV ?= $(shell which uv)
 BUILD_VERSION:=$(APP_VERSION)
 TESTS_FILTER:=

 PYTEST_LOG=--log-cli-level=debug --log-format="%(asctime)s %(levelname)s [%(name)s:%(filename)s:%(lineno)d] %(message)s" --log-date-format="%Y-%m-%d %H:%M:%S"

 .PHONY: isort
 isort:
-	$(POETRY) run isort .
+	$(UV) run isort .

 .PHONY: black
 black:
-	$(POETRY) run black .
+	$(UV) run black .

 PHONY: format
 format: isort black
@@ -24,10 +24,10 @@ style: reports
 	@echo -n > reports/copyright_errors.log
 	@echo

-	-$(POETRY) run flake8 | tee -a reports/flake8_errors.log
+	-$(UV) run flake8 | tee -a reports/flake8_errors.log
 	@if [ -s reports/flake8_errors.log ]; then exit 1; fi

-	-$(POETRY) run mypy . --check-untyped-defs | tee -a reports/mypy.log
+	-$(UV) run mypy . --check-untyped-defs | tee -a reports/mypy.log
 	@if ! grep -Eq "Success: no issues found in [0-9]+ source files" reports/mypy.log ; then exit 1; fi

 	@echo "Checking for SPDX-FileCopyrightText headers in Python files..."
@@ -42,7 +42,7 @@ reports:
 .PHONY: test
 test: reports
 	PYTHONPATH=. \
-	$(POETRY) run pytest \
+	$(UV) run pytest \
 		--cov-report xml:reports/coverage.xml \
 		--cov=kvpress/ \
 		--junitxml=./reports/junit.xml \

README.md

Lines changed: 25 additions & 9 deletions
@@ -16,22 +16,17 @@ Deploying long-context LLMs is costly due to the linear growth of the key-value
 pip install kvpress
 ```

-If possible, install flash attention:
-```bash
-pip install flash-attn --no-build-isolation
-```
-
-For a local installation with all dev dependencies, use poetry:
+For a local installation with all dev dependencies, use uv:

 ```bash
 git clone https://github.com/NVIDIA/kvpress.git
 cd kvpress
-poetry install --with dev
+uv sync --all-groups
 ```

 ## Usage

-kvpress provides a set of "presses" that compress the KV cache during the prefilling-phase. Each press is associated with a `compression_ratio` attribute that measures the compression of the cache. The easiest way to use a press is through our custom `KVPressTextGenerationPipeline`. It is automatically registered as a transformers pipeline with the name "kv-press-text-generation" when kvpress is imported and handles chat templates and tokenization for you:
+KVPress provides a set of "presses" that compress the KV cache during the prefilling-phase. Each press is associated with a `compression_ratio` attribute that measures the compression of the cache. The easiest way to use a press is through our custom `KVPressTextGenerationPipeline`. It is automatically registered as a transformers pipeline with the name "kv-press-text-generation" when kvpress is imported and handles chat templates and tokenization for you:

 ```python
 from transformers import pipeline
@@ -208,4 +203,25 @@ with press(model):

 However, the `generate` method does not allow to exclude the question from the compression, which would artificially favors methods such as SnapKV. Ideally, we want a compression method that works whatever comes after the context (_e.g._ for use cases such as chat or document question answering). Finally the `generate` method does not allow to provide generation for multiple questions at once.

-</details>
+</details>
+
+
+## Advanced installation settings
+To install optional packages, you can use [uv](https://docs.astral.sh/uv/).
+To install with flash attention, just run:
+
+```bash
+git clone https://github.com/NVIDIA/kvpress.git
+cd kvpress
+uv sync --extra flash-attn
+```
+
+To install with dependencies for evaluation, run
+
+```bash
+git clone https://github.com/NVIDIA/kvpress.git
+cd kvpress
+uv sync --extra eval
+```
+
+Notice that optional dependecies can be combined.

evaluation/README.md

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
 We support evaluation for all the presses implemented in the library, on a variety of popular benchmarks.

 ### Quick Start 🚀
+> Evaluation requires some additional packages. You can install them with `uv sync --group eval`

 Running evaluation is straightforward! Make sure you are in the `evaluation` directory, then:


kvpress/pipeline.py

Lines changed: 4 additions & 1 deletion
@@ -135,7 +135,10 @@ def preprocess(
         else:
             separator = "\n" + "#" * len(context)
             context = self.tokenizer.apply_chat_template(
-                [{"role": "user", "content": context + separator}], add_generation_prompt=True, tokenize=False
+                [{"role": "user", "content": context + separator}],
+                add_generation_prompt=True,
+                tokenize=False,
+                enable_thinking=False,
             )
             context, question_suffix = context.split(separator)

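
The separator logic in this hunk can be illustrated without a real tokenizer. The following is a minimal sketch, not the pipeline's code: `toy_chat_template` is a hypothetical stand-in for `tokenizer.apply_chat_template(..., tokenize=False)`, its tag format is invented, and it does not model the `enable_thinking` argument added in this commit. It shows the role of the separator: a newline followed by `len(context)` hash characters is one character longer than the context, so it cannot occur inside the context itself, and splitting the templated string on it recovers the text the template places before and after the user content.

```python
def toy_chat_template(messages, add_generation_prompt=True):
    """Hypothetical stand-in for a tokenizer chat template (tag format is invented)."""
    text = "".join(f"<|{m['role']}|>\n{m['content']}<|end|>\n" for m in messages)
    if add_generation_prompt:
        text += "<|assistant|>\n"
    return text


def split_templated_context(context):
    """Return (templated context, text the template appends after the user content)."""
    # One character longer than the context, so it cannot occur inside the context itself.
    separator = "\n" + "#" * len(context)
    templated = toy_chat_template([{"role": "user", "content": context + separator}])
    prefix, question_suffix = templated.split(separator)
    return prefix, question_suffix


prefix, question_suffix = split_templated_context("A long document.")
print(repr(prefix))           # '<|user|>\nA long document.'
print(repr(question_suffix))  # '<|end|>\n<|assistant|>\n'
```

The two-way unpacking assumes the separator appears exactly once in the templated string. That holds for this toy template and input, but it is an assumption about the template, not a guarantee.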

kvpress/presses/block_press.py

Lines changed: 4 additions & 3 deletions
@@ -16,10 +16,11 @@ class BlockPress(BasePress):
     BlockPress: Block-wise iterative KV cache compression.

     Applies compression in fixed-size blocks. Iteratively scores and prunes tokens block by block, maintaining
-    a buffer of previously kept tokens for context. Mathematically equivalent
-    to global compression when scoring uses only local information.
+    a buffer of previously kept tokens for context. Mathematically equivalent to global compression when
+    scoring uses only local information. It was introduced in the KeyDiff paper as part of the KeyDiff press,
+    but it can also work as a standalone press.

-    Based on BlockPress (https://arxiv.org/abs/2504.15364).
+    Based on the KeyDiff paper (https://arxiv.org/abs/2504.15364).

     Parameters
     ----------
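
The docstring's equivalence claim can be checked on a toy example. The sketch below is an illustration under stated assumptions, not the `BlockPress` implementation: the function names are hypothetical, each score is a plain float that depends only on its own token (the "local information" case), and ties are broken by position. Under those assumptions, pruning block by block while carrying a buffer of kept tokens selects the same tokens as a single global top-k.

```python
def global_topk(scores, n_kept):
    """Indices of the n_kept highest scores, in positional order."""
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:n_kept])


def blockwise_topk(scores, n_kept, block_size):
    """Process tokens in fixed-size blocks, keeping a buffer of at most n_kept indices."""
    kept = []
    for start in range(0, len(scores), block_size):
        block = list(range(start, min(start + block_size, len(scores))))
        # Score the buffer of previously kept tokens together with the new block.
        candidates = kept + block
        ranked = sorted(candidates, key=lambda i: (-scores[i], i))
        kept = sorted(ranked[:n_kept])
    return kept


scores = [0.1, 0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6]
print(global_topk(scores, 3))        # [1, 3, 5]
print(blockwise_topk(scores, 3, 2))  # [1, 3, 5]
```

The equivalence relies on scores not changing with the set of candidates. If a score depended on which other tokens are present, a token dropped in an early block could have ranked differently later, and the two procedures could diverge.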
