feat: update to use new kv-cache UDS tokenizer (#609)
* feat: update to use new kv-cache UDS tokenizer
- change preprocessing to types from kv-cache
- add a new unit test case covering the same tests as the old one
- keep the old test case but mark it as unused (it can be removed
later)
- add new make target to build UDS image
- update image to use the one from llm-d in deploy
- remove parts in Dockerfile to only build go code
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: more changes for UDS in makefile and docs
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: add comments for download-tokenizer and remove it as a build
dependency
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* GHAction: remove lint-and-test, which still uses Python
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: fix rebase
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* fix: lint with 2.8.0
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
* update: code review
- remove env variables:
  - LDFLAGS
  - PYTHON_CONFIG
  - CGO_CFLAGS
  - TOKENIZER_ARCH
  - PYTHON_VERSION
  - epp_* and sidecar_* for CGO
- update documentation
- remove make targets related to tokenizer and python
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
---------
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
> **Python is NOT required** as of v0.5.1. Tokenization is handled by a separate UDS (Unix Domain Socket) tokenizer sidecar container. Previous versions (< v0.5.1) used embedded Python tokenizers with daulet/tokenizers bindings, but these are now deprecated.

-The project uses Python 3.12 by default, but this can be configured:
+## Tokenization Architecture

-**For local development:**
-`PYTHON_VERSION` in the Makefile set which Python version is used.
+The project uses **UDS (Unix Domain Socket)** tokenization. Tokenization is handled by a separate UDS tokenizer sidecar container, not by the EPP container itself. Previous embedded tokenizer approaches (daulet/tokenizers, direct Python/vLLM linking) are deprecated and no longer used.

-**For Docker builds:**
-The Python version is parameterized in the Dockerfile via the `PYTHON_VERSION` build argument, which defaults to 3.12. To build with a different Python version: