v1.2.0-rc.1

Pre-release

@nirrozenbaum nirrozenbaum released this 21 Nov 18:27
· 1 commit to main since this release
v1.2.0-rc.1
318cd7c

What's Changed

  • Add openai api link for request format by @learner0810 in #1757
  • Docs: Fix incorrect stream_options value in Observability example by @aman4433 in #1758
  • Docs: Bumps Quickstart to Use Kgateway v2.2.0-main by @danehans in #1761
  • Docs: Updates Latest/Main Quickstart by @danehans in #1747
  • Docs: Versioned Quickstart Install All CRDs by @danehans in #1762
  • chore: fixed meeting link by @nirrozenbaum in #1734
  • Add Produces and Consumes methods to Plugin by @rahulgurnani in #1754
  • Docs: Removes Agentgateway Docs by @danehans in #1771
  • Record EPP NormalizedTimePerOutputToken metric on streaming mode by @dharaneeshvrd in #1706
  • chore(deps): bump github.com/onsi/ginkgo/v2 from 2.26.0 to 2.27.2 by @dependabot[bot] in #1776
  • chore(deps): bump github.com/prometheus/prometheus from 0.307.1 to 0.307.2 by @dependabot[bot] in #1774
  • fix tracing configuration in helm epp-deployment template by @sallyom in #1777
  • Fix for kustomization missing path for inferencepoolimport.yaml. by @bexxmodd in #1782
  • fix inferenceobjective api types link by @learner0810 in #1739
  • update release quickstart to use v1.1.0 by @nirrozenbaum in #1785
  • [metrics]: Allow EPP to register metrics from extension by @JeffLuoo in #1787
  • feat (reports): add infrastructure to run NGF conformance tests and i… by @sindhushiv in #1788
  • Add Install Gateway section in Getting Started Latest guide by @dharaneeshvrd in #1759
  • quickstart cleanup by @nirrozenbaum in #1805
  • fix(release): update quickstart guide version automatically by @AvineshTripathi in #1803
  • chore(deps): bump github.com/prometheus/prometheus from 0.307.2 to 0.307.3 by @dependabot[bot] in #1809
  • chore(deps): bump github.com/prometheus/common from 0.67.1 to 0.67.2 by @dependabot[bot] in #1807
  • logging cleanup of scheduler pkg by @nirrozenbaum in #1806
  • chore(deps): bump sigs.k8s.io/controller-runtime from 0.22.3 to 0.22.4 by @dependabot[bot] in #1808
  • allow overriding the runner's containing executable name by @elevran in #1813
  • quickstart numbering by @nirrozenbaum in #1819
  • [SLO Routing] Add Latency Predictor sidecars and EPP tools by @BenjaminBraunDev in #1791
  • update inferencepool helm chart flags to be map instead of an array by @nirrozenbaum in #1818
  • feat: Configure LRUCacheSize using the numGPUBlocks for approximate prefix cache by @zetxqx in #1748
  • don't use cluster scope permissions when metrics auth is disabled by @nirrozenbaum in #1804
  • Add benchmarking folder by @rlakhtakia in #1689
  • Add prompt_cached_tokens metrics from each response. by @zetxqx in #1814
  • hotfix to helm chart. missing quotes by @nirrozenbaum in #1825
  • Correct the InferencePoolResolvedRefsCondition conformance tests. by @zetxqx in #1756
  • Adjust default scorer weights to favor more prefix cache affinity by @liu-cong in #1827
  • refactor: Flatten Flow Control queue plugin directory structure by @LukeAVanDrie in #1824
  • Update docs on prefix cache plugin related metrics by @liu-cong in #1828
  • Add prefix cache aware benchmarking config by @rlakhtakia in #1822
  • feat: add validation and fallback for prefix cache config fields by @googs1025 in #1846
  • chore(deps): bump github.com/envoyproxy/go-control-plane/envoy from 1.35.0 to 1.36.0 by @dependabot[bot] in #1844
  • chore(deps): bump golang.org/x/sync from 0.17.0 to 0.18.0 by @dependabot[bot] in #1845
  • Improvements to the E2E Test utilities by @shmuelk in #1853
  • Conformance: Adds Data Parallelism Test by @danehans in #1769
  • fix incorrect interface input parameter names by @googs1025 in #1865
  • docs: Adding the Gateway inference support documentation for Nginx Ga… by @sindhushiv in #1789
  • helm support for sidecar injection in EPP by @capri-xiyue in #1821
  • Helm: Adds istio as a provider-scoped value for the inferencepool Chart by @danehans in #1831
  • refactor: Improve Flow Control queue contracts for clarity and correctness by @LukeAVanDrie in #1836
  • fix training server indentation bug and test yaml to build script by @kaushikmitr in #1854
  • Validate datalayer with additional testing by @elevran in #1857
  • Add PrepareData and Admission control plugins by @rahulgurnani in #1796
  • feat(api): Introduce InferenceModelRewrite API by @zetxqx in #1816
  • Add owners files to subsections by @kfswain in #1874
  • Additional data layer tests by @irar2 in #1876
  • chore(deps): bump the kubernetes group with 6 updates by @dependabot[bot] in #1873
  • feat: Extend the text based configuration to include feature flags and the SaturationDetector's configuration by @shmuelk in #1492
  • refactor bbr main as a prep for pluggability by @nirrozenbaum in #1867
  • use a dispatch ticker to dispatch requests periodically in ShardProcessor… by @delavet in #1850
  • feat(conformance): add responseReceived plugin to support verifying destination endpoint. by @zetxqx in #1855
  • some cleanup in runner and config loading + deprecation notes by @nirrozenbaum in #1880
  • fix bbr dockerfile post build by @nirrozenbaum in #1881
  • add shmuelk as code reviewer by @nirrozenbaum in #1882
  • SLO Aware Routing Plugins Only by @BenjaminBraunDev in #1849
  • Upload prefill and decode heavy benchmarking configs by @rlakhtakia in #1848
  • Update outdated documentation for monitoring config of GKE by @JeffLuoo in #1837
  • Enable EPP to support endpoint discovery using pod selector by @capri-xiyue in #1833
  • Add AutoTune config to prefix scorer to make it explicit when auto vs. manual config is used by @liu-cong in #1888

Full Changelog: v1.1.0...v1.2.0-rc.1