v1.2.0-rc.1
Pre-release
What's Changed
- Add openai api link for request format by @learner0810 in #1757
- Docs: Fix incorrect `stream_options` value in Observability example by @aman4433 in #1758
- Docs: Bumps Quickstart to Use Kgateway v2.2.0-main by @danehans in #1761
- Docs: Updates Latest/Main Quickstart by @danehans in #1747
- Docs: Versioned Quickstart Install All CRDs by @danehans in #1762
- chore: fixed meeting link by @nirrozenbaum in #1734
- Add Produces and Consumes methods to Plugin by @rahulgurnani in #1754
- Docs: Removes Agentgateway Docs by @danehans in #1771
- Record EPP NormalizedTimePerOutputToken metric on streaming mode by @dharaneeshvrd in #1706
- chore(deps): bump github.com/onsi/ginkgo/v2 from 2.26.0 to 2.27.2 by @dependabot[bot] in #1776
- chore(deps): bump github.com/prometheus/prometheus from 0.307.1 to 0.307.2 by @dependabot[bot] in #1774
- fix tracing configuration in helm epp-deployment template by @sallyom in #1777
- Fix for kustomization missing path for inferencepoolimport.yaml. by @bexxmodd in #1782
- fix inferenceobjective api types link by @learner0810 in #1739
- update release quickstart to use v1.1.0 by @nirrozenbaum in #1785
- [metrics]: Allow EPP to register metrics from extension by @JeffLuoo in #1787
- feat (reports): add infrastructure to run NGF conformance tests and i… by @sindhushiv in #1788
- Add Install Gateway section in Getting Started Latest guide by @dharaneeshvrd in #1759
- quickstart cleanup by @nirrozenbaum in #1805
- fix(release): update quickstart guide version automatically by @AvineshTripathi in #1803
- chore(deps): bump github.com/prometheus/prometheus from 0.307.2 to 0.307.3 by @dependabot[bot] in #1809
- chore(deps): bump github.com/prometheus/common from 0.67.1 to 0.67.2 by @dependabot[bot] in #1807
- logging cleanup of scheduler pkg by @nirrozenbaum in #1806
- chore(deps): bump sigs.k8s.io/controller-runtime from 0.22.3 to 0.22.4 by @dependabot[bot] in #1808
- allow overriding the runner's containing executable name by @elevran in #1813
- quickstart numbering by @nirrozenbaum in #1819
- [SLO Routing] Add Latency Predictor sidecars and EPP tools by @BenjaminBraunDev in #1791
- update inferencepool helm chart flags to be map instead of an array by @nirrozenbaum in #1818
- feat: Configure LRUCacheSize using the numGPUBlocks for approximate prefix cache by @zetxqx in #1748
- don't use cluster scope permissions when metrics auth is disabled by @nirrozenbaum in #1804
- Add benchmarking folder by @rlakhtakia in #1689
- Add prompt_cached_tokens metrics from each response. by @zetxqx in #1814
- hotfix to helm chart. missing quotes by @nirrozenbaum in #1825
- Correct the InferencePoolResolvedRefsCondition conformance tests. by @zetxqx in #1756
- Adjust default scorer weights to favor more prefix cache affinity by @liu-cong in #1827
- refactor: Flatten Flow Control queue plugin directory structure by @LukeAVanDrie in #1824
- Update docs on prefix cache plugin related metrics by @liu-cong in #1828
- Add prefix cache aware benchmarking config by @rlakhtakia in #1822
- feat: add validation and fallback for prefix cache config fields by @googs1025 in #1846
- chore(deps): bump github.com/envoyproxy/go-control-plane/envoy from 1.35.0 to 1.36.0 by @dependabot[bot] in #1844
- chore(deps): bump golang.org/x/sync from 0.17.0 to 0.18.0 by @dependabot[bot] in #1845
- Improvements to the E2E Test utilities by @shmuelk in #1853
- Conformance: Adds Data Parallelism Test by @danehans in #1769
- fix incorrect interface input parameter names by @googs1025 in #1865
- docs: Adding the Gateway inference support documentation for Nginx Ga… by @sindhushiv in #1789
- helm support for sidecar injection in EPP by @capri-xiyue in #1821
- Helm: Adds `istio` as a `provider`-scoped value for the inferencepool Chart by @danehans in #1831
- refactor: Improve Flow Control queue contracts for clarity and correctness by @LukeAVanDrie in #1836
- fix training server indentation bug and test yaml to build script by @kaushikmitr in #1854
- Validate datalayer with additional testing by @elevran in #1857
- Add PrepareData and Admission control plugins by @rahulgurnani in #1796
- feat(api): Introduce InferenceModelRewrite API by @zetxqx in #1816
- Add owners files to subsections by @kfswain in #1874
- Additional data layer tests by @irar2 in #1876
- chore(deps): bump the kubernetes group with 6 updates by @dependabot[bot] in #1873
- feat: Extend the text based configuration to include feature flags and the SaturationDetector's configuration by @shmuelk in #1492
- refactor bbr main as a prep for pluggability by @nirrozenbaum in #1867
- use a dispatch ticker to dispatch requests periodly in ShardProcessor… by @delavet in #1850
- feat(conformance): add responseReceived plugin to support verifying destination endpoint. by @zetxqx in #1855
- some cleanup in runner and config loading + deprecation notes by @nirrozenbaum in #1880
- fix bbr dockerfile post build by @nirrozenbaum in #1881
- add shmuelk as code reviewer by @nirrozenbaum in #1882
- SLO Aware Routing Plugins Only by @BenjaminBraunDev in #1849
- Upload prefill and decode heavy benchmarking configs by @rlakhtakia in #1848
- Update outdated documentation for monitoring config of GKE by @JeffLuoo in #1837
- Enable EPP to support endpoint discovery using pod selector by @capri-xiyue in #1833
- Add AutoTune config to prefix scorer to make it explicit when auto vs. manual config is used by @liu-cong in #1888
New Contributors
- @aman4433 made their first contribution in #1758
- @sindhushiv made their first contribution in #1788
- @AvineshTripathi made their first contribution in #1803
- @googs1025 made their first contribution in #1846
Full Changelog: v1.1.0...v1.2.0-rc.1