Commit eabb332

dmitripikus and vMaroon authored
Update Prefix-Cache-Scorer cache_tracking Mode with The v0.2 KVCache.Indexer (#245)
* Support for new version of kv-cache-manager is added to scheduler
* Redundant function parameter and struct member are removed
* Minor changes in comments
* Redundant setting of ZMQEndpoint is removed
* - general refactoring
  - enhanced prefix-cache-scorer cache-tracking mode's configurability
  - updated docs
  - enhanced docs (no HTML use)

  Signed-off-by: Maroon Ayoub <[email protected]>
* refactor example configs

  Signed-off-by: Maroon Ayoub <[email protected]>

---------

Signed-off-by: Maroon Ayoub <[email protected]>
Co-authored-by: Maroon Ayoub <[email protected]>
Co-authored-by: Dmitri Pikus <[email protected]>
1 parent 963c084 commit eabb332

10 files changed: +445 -285 lines changed

.github/workflows/ci-pr-checks.yaml

Lines changed: 8 additions & 0 deletions
@@ -25,6 +25,14 @@ jobs:
           go-version: "${{ env.GO_VERSION }}"
           cache-dependency-path: ./go.sum

+      - name: Install libzmq dependencies (kvcache/kvevents)
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libzmq3-dev pkg-config
+
+      - name: Set PKG_CONFIG_PATH
+        run: echo "PKG_CONFIG_PATH=/usr/lib/pkgconfig" >> $GITHUB_ENV
+
       - name: go mod tidy
         run: go mod tidy

Dockerfile

Lines changed: 15 additions & 1 deletion
@@ -4,7 +4,10 @@ ARG TARGETOS
 ARG TARGETARCH

 # Install build tools
-RUN dnf install -y gcc-c++ libstdc++ libstdc++-devel clang && dnf clean all
+# The builder is based on UBI8, so we need epel-release-8.
+RUN dnf install -y 'https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm' && \
+    dnf install -y gcc-c++ libstdc++ libstdc++-devel clang zeromq-devel pkgconfig && \
+    dnf clean all

 WORKDIR /workspace

@@ -36,11 +39,22 @@ RUN go build -a -o bin/epp -ldflags="-extldflags '-L$(pwd)/lib'" cmd/epp/main.go
 FROM registry.access.redhat.com/ubi9/ubi-minimal:latest
 WORKDIR /
 COPY --from=builder /workspace/bin/epp /app/epp
+
+# Install zeromq runtime library needed by the manager.
+# The final image is UBI9, so we need epel-release-9.
+USER root
+RUN microdnf install -y dnf && \
+    dnf install -y 'https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm' && \
+    dnf install -y zeromq
+
 USER 65532:65532

 # expose gRPC, health and metrics ports
 EXPOSE 9002
 EXPOSE 9003
 EXPOSE 9090

+# expose port for KV-Events ZMQ SUB socket
+EXPOSE 5557
+
 ENTRYPOINT ["/app/epp"]

deploy/config/epp-prefix-cache-tracking-config.yaml

Lines changed: 24 additions & 16 deletions
@@ -3,20 +3,28 @@
 apiVersion: inference.networking.x-k8s.io/v1alpha1
 kind: EndpointPickerConfig
 plugins:
-  - type: single-profile-handler
-  - type: decode-filter
-  - type: prefix-cache-scorer
-    parameters:
-      mode: cache_tracking
-      kvCacheRedisAddr: ${REDIS_HOST}:${REDIS_PORT}
-  - type: load-aware-scorer
-  - type: max-score-picker
+  - type: single-profile-handler
+  - type: decode-filter
+  - type: prefix-cache-scorer
+    parameters:
+      mode: cache_tracking
+      indexerConfig:
+        tokenProcessorConfig:
+          blockSize: 64 # must match vLLM block size
+          hashSeed: "42" # must match vLLM PYTHONHASHSEED env var
+        kvBlockIndexConfig:
+          enableMetrics: true # enable kv-block index metrics (prometheus)
+  - type: kv-cache-scorer # kv-cache-utilization
+  - type: queue-scorer
+  - type: max-score-picker
 schedulingProfiles:
-  - name: default
-    plugins:
-      - pluginRef: decode-filter
-      - pluginRef: prefix-cache-scorer
-        weight: 2.0
-      - pluginRef: load-aware-scorer
-        weight: 1.0
-      - pluginRef: max-score-picker
+  - name: default
+    plugins:
+      - pluginRef: decode-filter
+      - pluginRef: prefix-cache-scorer
+        weight: 3.0
+      - pluginRef: kv-cache-scorer
+        weight: 1.0
+      - pluginRef: queue-scorer
+        weight: 1.0
+      - pluginRef: max-score-picker
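
Reviewer note: the comments on blockSize and hashSeed in the config above say both values must match the vLLM side. The sketch below illustrates why, under stated assumptions. It is not the kv-cache-manager or vLLM implementation. The functions blockKeys and sharedPrefix are hypothetical names, FNV-1a is a stand-in hash, and the chaining scheme (seed, then parent key, then block tokens) is an assumption made for illustration. The point is only that block keys derived from the same tokens agree when both sides use the same block size and seed, and are expected to diverge from the first block otherwise.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// blockKeys splits tokens into full blocks of blockSize and returns one
// chained key per block. Hypothetical helper: FNV-1a is a stand-in hash and
// the seed/parent/tokens layout is assumed, not taken from vLLM.
func blockKeys(tokens []uint32, blockSize int, seed string) []uint64 {
	if blockSize <= 0 {
		return nil
	}
	var keys []uint64
	var parent uint64
	buf := make([]byte, 8)
	// Only full blocks get a key; a trailing partial block is skipped.
	for start := 0; start+blockSize <= len(tokens); start += blockSize {
		h := fnv.New64a()
		h.Write([]byte(seed))
		binary.LittleEndian.PutUint64(buf, parent)
		h.Write(buf)
		for _, t := range tokens[start : start+blockSize] {
			binary.LittleEndian.PutUint32(buf[:4], t)
			h.Write(buf[:4])
		}
		parent = h.Sum64() // each key depends on every preceding block
		keys = append(keys, parent)
	}
	return keys
}

// sharedPrefix counts how many leading keys are equal in both slices.
func sharedPrefix(a, b []uint64) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

func main() {
	tokens := make([]uint32, 200)
	for i := range tokens {
		tokens[i] = uint32(i)
	}

	// Keys as the serving engine would (hypothetically) compute them.
	engine := blockKeys(tokens, 64, "42")
	fmt.Println("full blocks:", len(engine))

	// Same settings on the scorer side: every block key lines up.
	fmt.Println("matching config:", sharedPrefix(engine, blockKeys(tokens, 64, "42")))

	// A different seed or block size is expected to share no keys at all.
	fmt.Println("different seed:", sharedPrefix(engine, blockKeys(tokens, 64, "0")))
	fmt.Println("different block size:", sharedPrefix(engine, blockKeys(tokens, 32, "42")))
}
```

Because each key is chained to its parent, a mismatch in either setting affects every block rather than a single one, which would make cache-tracking lookups miss even for prompts the engine already holds.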
