Releases: llm-d/llm-d-inference-scheduler
v0.3.2
In addition to the changes below, these patches include fixes to the kv-cache-manager dependency.
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update prefix-cache-scorer Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs libzmq Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the prefix-cache-scorer plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.2
v0.3.2-rc.1
Small fixes to kv-cache-manager required updated dependencies
v0.3.1
Small patch updating the kv-cache-manager dependency to include support in v0.3.
See the full v0.3 changes here:
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update prefix-cache-scorer Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs libzmq Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the prefix-cache-scorer plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @kfswain made their first contribution in #277
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.1
v0.3.1-rc.1
Full Changelog: v0.3.0...v0.3.1-rc.1
v0.3.0
Image pull example: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.3.0
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update prefix-cache-scorer Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs libzmq Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the prefix-cache-scorer plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @kfswain made their first contribution in #277
- @yankay made their first contribution in #276
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.1...v0.3.0
v0.3.0-rc.2
Image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.3.0-rc.2
v0.3.0-rc.1
Image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.3.0-rc.1
What's Changed
- bump gie version to v0.5.0 by @nirrozenbaum in #256
- Fixes for Steps to Build a Kubernetes Development Environment by @dumb0002 in #259
- bump gie version to v0.5.1 rc1 (patch release) by @nirrozenbaum in #262
- added badges to readme by @nirrozenbaum in #261
- chore: bump gie version by @nirrozenbaum in #263
- Update prefix-cache-scorer Configuration Doc Entry by @vMaroon in #264
- Update Tokenizer Release Version by @vMaroon in #265
- #245 Followup - Makefile Installs libzmq Dependency by @vMaroon in #266
- deps(go): bump the go-dependencies group with 3 updates by @dependabot[bot] in #257
- Add codespell integration for spell checking by @Jooho in #221
- Initial CODEOWNERS file by @elevran in #267
- added issues templates by @nirrozenbaum in #272
- deps(actions): bump crate-ci/typos from 1.34.0 to 1.35.1 by @dependabot[bot] in #275
- small updates to documentation by @kfswain in #277
- Change CI to only create a latest tagged image on releases by @shmuelk in #278
- fix: correct shell command substitution syntax in Makefile by @yankay in #276
- add reference to writing a new plugin by @elevran in #280
- deps(actions): bump crate-ci/typos from 1.35.1 to 1.35.3 by @dependabot[bot] in #282
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #283
- Remove non-existent labels from dependabot configuration by @elevran in #285
- chore: ⬆️ bump components gie to v0.5.1. by @yafengio in #288
- Add workflows to automate aged issues management by @anoruxylene in #289
- update load aware scorer unit tests by @jairuigou in #291
- Add build metadata to Docker image by @carlory in #292
- Add Prow GitHub Actions by @Jooho in #290
- remove setup log by @carlory in #294
- OWNERS_ALIASES is not working with current automation by @nirrozenbaum in #296
- deps(actions): bump crate-ci/typos from 1.35.3 to 1.35.4 by @dependabot[bot] in #299
- deps(actions): bump actions/checkout from 4 to 5 by @dependabot[bot] in #300
- deps(go): bump the go-dependencies group with 2 updates by @dependabot[bot] in #301
- fix: Makefile fixes for MacOS by @shmuelk in #304
- Update comment in prow-pr-automerge.yml by @elevran in #303
- Remove github CODEOWNERS by @elevran in #306
- deps(docker): bump projectquay/golang from 1.24 to 1.25 by @dependabot[bot] in #302
- Support for installation of inference-sim with kv-cache enabled by @mayabar in #305
- add tokenizer directory to simulator deployment yaml by @mayabar in #307
- [feature] Added Active-Request-Scorer by @vMaroon in #297
- deps(actions): bump crate-ci/typos from 1.35.4 to 1.35.5 by @dependabot[bot] in #316
- deps(go): bump github.com/stretchr/testify from 1.10.0 to 1.11.0 in the go-dependencies group by @dependabot[bot] in #314
- chore: Added end to end tests by @shmuelk in #310
- ByLabelSelector filter tests by @elevran in #315
- chore: drop plugin type from types and file by @yyzxw in #308
- fixed a bug where typos check is checking go.mod and go.sum by @nirrozenbaum in #321
- minor tweaks to Makefile by @nirrozenbaum in #318
- Split the prefix-cache-scorer plugins by @vMaroon in #323
- Makefile fixes by @vMaroon in #322
- deps(actions): bump crate-ci/typos from 1.35.5 to 1.35.7 by @dependabot[bot] in #325
- deps(go): bump the go-dependencies group with 9 updates by @dependabot[bot] in #326
- Reduce img size by cleaning dnf cache by @rawagner in #328
- added comment to stale issues by @nirrozenbaum in #327
- fixed missing dependencies in makefile by @nirrozenbaum in #330
- sync with IGW release 1.0.0-rc by @nirrozenbaum in #320
- fix: make env-dev-kind fails after sync with IGW 1.0.0-rc-3 by @shmuelk in #331
New Contributors
- @dumb0002 made their first contribution in #259
- @kfswain made their first contribution in #277
- @yankay made their first contribution in #276
- @yafengio made their first contribution in #288
- @anoruxylene made their first contribution in #289
- @jairuigou made their first contribution in #291
- @yyzxw made their first contribution in #308
- @rawagner made their first contribution in #328
Full Changelog: v0.2.0-rc.2...v0.3.0-rc.1
v0.2.1
Image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.2.1
This patch release is intended to resolve a few bugs.
Justification & breakdown here: kubernetes-sigs/gateway-api-inference-extension#1215
- Helm chart configurability: kubernetes-sigs/gateway-api-inference-extension#1211
- TLS metric scraping: kubernetes-sigs/gateway-api-inference-extension#1190
- Fixing max score picker: kubernetes-sigs/gateway-api-inference-extension#1205
Full Changelog: v0.2.0...v0.2.1
v0.2.1-rc.1
Image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.2.1-rc.1
This patch release is intended to resolve a few bugs.
Justification & breakdown here: kubernetes-sigs/gateway-api-inference-extension#1215
- Helm chart configurability: kubernetes-sigs/gateway-api-inference-extension#1211
- TLS metric scraping: kubernetes-sigs/gateway-api-inference-extension#1190
- Fixing max score picker: kubernetes-sigs/gateway-api-inference-extension#1205
v0.2.0
Overview
The image is available here: docker pull ghcr.io/llm-d/llm-d-inference-scheduler:v0.2.0
Major Highlights
- Enhanced Scheduling Framework: the scheduling framework now includes additional extension points and a well-defined mechanism for communication between scheduler plugins.
- New Config API: a new Config API allows plugins to be configured through a config file without touching core code.
- Helm Charts: the Helm chart was updated to make reusing the Config API easy.
- Plugin Improvements: multiple plugin implementations were improved, and some of the GIE and llm-d plugins were consolidated into a single robust implementation.
What's Changed
- Fixes according to review comments in issue #109 by @mayabar in #129
- update GIE upstream version + code adaptations by @nirrozenbaum in #135
- test: Improve scheduler test by @irar2 in #139
- Wait for storage pods to exist before moving on by @david-martin in #138
- Dependabot configuration for Go, GH acions and Docker by @elevran in #141
- link checker - fails PR if links are broken by @clubanderson in #130
- fix: Cleanup build and kind development environment by @shmuelk in #156
- Replace ubi9 image for ubi-minimal to reduce footprint by @oglok in #153
- [Build]: Update clean-env-dev-kind target in Makefile and fix vllm-sim image version by @kfirtoledo in #155
- remove version.json references from Makefile by @elevran in #137
- simplify config by @nirrozenbaum in #171
- Refactor Redis configuration handling in KV cache scorer by @relyt0925 in #172
- fix: Add MODEL_NAME to kind deploy by @kfirtoledo in #178
- feat: A small set of CI updates by @shmuelk in #176
- DEVELOPMENT.md minor fixes by @d0w in #175
- deps(go): bump the go-dependencies group across 1 directory with 7 updates by @dependabot[bot] in #180
- docs: clarify scorer and filter configuration reference by @kfirtoledo in #185
- Prevent auto updates of GIE by @elevran in #186
- Make URL construction IPv6-compatible by @russellb in #182
- Prefix aware scorer initialization by @mayabar in #143
- update scheduler to use latest GIE by @nirrozenbaum in #179
- fixed by labels link in the docs example markdown by @nirrozenbaum in #194
- removed unused passthrough and random filter/scorer by @nirrozenbaum in #195
- sync with latest GIE after some changes to the scheduling plugins by @nirrozenbaum in #197
- build: add support for development on kubernetes cluster by @kfirtoledo in #190
- chore: Update dependencies to use the latest GIE by @shmuelk in #202
- deps(go): bump github.com/redis/go-redis/v9 from 9.10.0 to 9.11.0 in the go-dependencies group across 1 directory by @dependabot[bot] in #203
- fix: rename kvcache-aware to use underscore to match other src files by @nekomeowww in #205
- Change Prefill and Decode filters to be based on a common filter by @mayabar in #188
- fix: rename pd-profile-handler to use underscore to match other src files by @nekomeowww in #210
- refactor: removes the prefix-aware scorer in favor of the unified GIE prefix scorer by @kfirtoledo in #207
- feat: Add factory functions for all plugins by @shmuelk in #208
- fix: allow prefill when prefixState is unavailable by @kfirtoledo in #211
- Add IDE files to .gitignore by @terrytangyuan in #213
- updated README by @nirrozenbaum in #204
- Fixed some typos by @terrytangyuan in #212
- Fix outdate debug info by @carlory in #216
- Migrate the llm-d-inference-scheduler's configuration to the new text based configuration by @shmuelk in #214
- remove dependency on datastore + updated TypedName of plugins by @nirrozenbaum in #215
- filter rename by @nirrozenbaum in #224
- fixed broken link by @nirrozenbaum in #226
- Add unit tests for session affinity scorer by @sagar0x0 in #222
- nit: fix typo by @Jooho in #232
- prefill header is set in the form of ip:port by @nirrozenbaum in #233
- updated scheduler unit test by @nirrozenbaum in #229
- ignore score in scheduler unit test by @nirrozenbaum in #234
- updated prefill header name to x-prefill-host-port by @nirrozenbaum in #236
- fix: add validation for load-aware scorer to handle invalid queue thresholds by @kfirtoledo in #240
- build: sync with latest GIE v0.0.0-20250715021823 by @kfirtoledo in #239
- Integrate prefix-cache configuration into a single knob by @kfirtoledo in #237
- Clear prefill target header if set in incoming request by @elevran in #244
- bump gie version to latest 0.5.0 rc by @nirrozenbaum in #247
- updating GIE CRD version by @Gregory-Pereira in #248
- bump gie to rc3 by @nirrozenbaum in #250
- Always P/D - Disable Prefix-Cache-Aware Decision Making for P/D by Default by @vMaroon in #253
- build: change epp-config default yamls and image pull policy by @kfirtoledo in #249
- Update Prefix-Cache-Scorer cache_tracking Mode with The v0.2 KVCache.Indexer by @dmitripikus in #245
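Several entries above deal with host and port handling: #182 makes URL construction IPv6-compatible, #233 sets the prefill header in ip:port form, and #236 names that header x-prefill-host-port. The Go sketch below illustrates the general technique only; it is not the project's code. The helpers `hostPort` and `endpointURL`, the sample addresses, the port, and the `/metrics` path are all hypothetical. The point it shows is that joining host and port with the standard library's `net.JoinHostPort` brackets IPv6 literals, whereas naive string concatenation would produce an ambiguous address.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// prefillHeaderName is the header name given in #236.
const prefillHeaderName = "x-prefill-host-port"

// hostPort is a hypothetical helper: it joins a host and a port into
// "host:port" form, adding brackets around IPv6 literals.
func hostPort(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

// endpointURL is a hypothetical helper: it builds a URL whose host part
// is safe for both IPv4 and IPv6 addresses.
func endpointURL(scheme, host string, port int, path string) string {
	u := url.URL{Scheme: scheme, Host: hostPort(host, port), Path: path}
	return u.String()
}

func main() {
	// IPv4: plain ip:port.
	fmt.Printf("%s: %s\n", prefillHeaderName, hostPort("10.0.0.5", 8000))
	// IPv6: the address is bracketed so the port separator is unambiguous.
	fmt.Printf("%s: %s\n", prefillHeaderName, hostPort("fd00::5", 8000))
	fmt.Println(endpointURL("http", "fd00::5", 8000, "/metrics"))
}
```

With these sample inputs, the IPv4 value is `10.0.0.5:8000`, the IPv6 value is `[fd00::5]:8000`, and the URL is `http://[fd00::5]:8000/metrics`.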
New Contributors
- @irar2 made their first contribution in #139
- @david-martin made their first contribution in #138
- @clubanderson made their first contribution in #130
- @kfirtoledo made their first contribution in #155
- @relyt0925 made their first contribution in #172
- @d0w made their first contribution in #175
- @dependabot[bot] made their first contribution in #180
- @russellb made their first contribution in #182
- @nekomeowww made their first contribution in #205
- @terrytangyuan made their first contribution in #213
- @carlory made their first contribution in #216
- @sagar0x0 made their first contribution in #222
- @Jooho made their first contribution in #232
Full Changelog: v0.1.0...v0.2.0