## What's Changed
- chore: update convert_tooldef_to_openai_tool to match its usage by @mattf in #4837
- feat!: improve consistency of post-training API endpoints by @eoinfennessy in #4606
- fix: Arbitrary file write via a non-default configuration by @VaishnaviHire in #4844
- chore: reduce uses of models.llama.datatypes by @mattf in #4847
- docs: add technical release steps and improvements to RELEASE_PROCESS.md by @cdoern in #4792
- chore: bump fallback version to 0.5.1 by @cdoern in #4846
- fix: Exclude null 'strict' field in function tools to prevent OpenAI … by @gyliu513 in #4795
- chore(test): add test to verify responses params make it to backend service by @mattf in #4850
- chore: revert "fix: disable together banner (#4517)" by @mattf in #4856
- fix: update together to work with latest api.together.xyz service (circa Feb 2026) by @mattf in #4857
- chore(github-deps): bump astral-sh/setup-uv from 7.2.0 to 7.3.0 by @dependabot[bot] in #4867
- chore(github-deps): bump github/codeql-action from 4.32.0 to 4.32.2 by @dependabot[bot] in #4861
- chore(github-deps): bump actions/cache from 5.0.2 to 5.0.3 by @dependabot[bot] in #4859
- chore(github-deps): bump llamastack/llama-stack from 76bcb66 to c518b35 by @dependabot[bot] in #4858
- fix(ci): ensure oasdiff is available for openai-coverage hook by @EleanorWho in #4835
- fix: Deprecate items when creating a conversation by @gyliu513 in #4765
- chore: refactor chunking to use configurable tiktoken encoding and document tokenizer limits by @mattf in #4870
- chore: prune unused parts of models packages (checkpoint, tokenizer, prompt templates, datatypes) by @mattf in #4871
- chore: prune unused utils from utils.memory.vector_store by @mattf in #4873
- fix: Escape special characters in auto-generated provider documentati… by @gyliu513 in #4822
- chore(docs): Use starter for opentelemetry integration test by @gyliu513 in #4875
- fix: kvstore should call shutdown but not close by @gyliu513 in #4872
- fix: uvicorn log ambiguity by @cdoern in #4522
- chore(github-deps): bump actions/checkout from 4.2.2 to 6.0.2 by @dependabot[bot] in #4865
- chore: cleanup mypy excludes by @mattf in #4876
- feat: add integration test for max_output_tokens by @gyliu513 in #4825
- chore(test): add test to verify responses params make it to backend s… by @gyliu513 in #4852
- ci: add Docker image publishing to release workflow by @cdoern in #4882
- feat: add ProcessFileRequest model to file_processors API by @alinaryan in #4885
- docs: update responses api known limitations doc by @jaideepr97 in #4845
- fix(vector_io): align Protocol signatures with request models by @skamenan7 in #4747
- fix: add _ExceptionTranslatingRoute to prevent keep-alive breakage on Linux by @iamemilio in #4886
- docs: add release notes for version 0.5 by @rhuss in #4855
- fix(ci): disable uv cache cleanup when UV_NO_CACHE is set by @cdoern in #4889
- feat: Add truncation parameter support by @gyliu513 in #4813
- chore(ci): bump pinned action commit hashes in integration-tests.yml by @cdoern in #4895
- docs: Add README for running observability test by @gyliu513 in #4884
- fix: update rerank routing to match params by @mattf in #4900
- feat: Add prompt_cache_key parameter support by @gyliu513 in #4775
- chore: add rerank support to recorder by @mattf in #4903
- feat: add rerank support to vllm inference provider by @mattf in #4902
- fix(inference): use flat response message model for chat/completions by @cdoern in #4891
- feat: add llama cpp server remote inference provider by @Bobbins228 in #4382
- fix: Remove pillow as direct dependency by @VaishnaviHire in #4901
- fix: pre-commit run -a by @mattf in #4907
- fix(ci): Removed kotlin from preview builds by @gyliu513 in #4910
- feat: Add service_tier parameter support by @gyliu513 in #4816
- chore(github-deps): bump github/codeql-action from 4.32.2 to 4.32.3 by @dependabot[bot] in #4918
- chore(github-deps): bump docker/login-action from 3.4.0 to 3.7.0 by @dependabot[bot] in #4916
- chore(github-deps): bump llamastack/llama-stack from c7cdb40 to 4c1b03b by @dependabot[bot] in #4915
- chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.10.0 to 1.11.6 by @dependabot[bot] in #4913
- chore(github-deps): bump docker/build-push-action from 6.15.0 to 6.19.2 by @dependabot[bot] in #4912
- fix(vertexai): raise descriptive error on auth failure instead of silent empty string by @major in #4909
- fix: resolve StorageConfig default env vars at construction time by @major in #4897
- feat: Add incomplete_details response property by @gyliu513 in #4812
- feat(client-sdks): add OpenAPI Generator tooling by @aegeiger in #4874
- fix(vector_io): eliminate duplicate call for vector store registration by @r3v5 in #4925
- test(vertexai): add unit tests for VertexAI inference adapter by @major in #4927
- feat: introduce new how-to blog by @cdoern in #4794
- chore: remove reference to non-existent WeaviateRequestProviderData by @mattf in #4937
- feat: standardized error types with HTTP status codes by @iamemilio in #4878
- feat: add opentelemetry-distro to core dependencies by @Artemon-line in #4935
- feat(ci): Add nightly job for doc build by @gyliu513 in #4911
- fix: Ensure user isolation for stored conversations and responses by @jaideepr97 in #4834
- fix: align chat completion usage schema with OpenAI spec by @cdoern in #4930
- fix: allow conversation item type to be omitted by @mattf in #4948
- feat: Enable inline PyPDF file_processors provider by @alinaryan in #4743
- feat: add support for /responses background parameter by @cdoern in #4824
- feat(vector_io): Implement Contextual Retrieval for improved RAG search quality by @r-bit-rry in #4750
- chore: use SecretStr for x-llamastack-provider-data keys by @mattf in #4939
- chore: remove unused vector store utils by @mattf in #4961
- feat: auto-identify embedding models for vllm by @mattf in #4975
- chore(github-deps): bump llamastack/llama-stack from 4c1b03b to 7d9786b by @dependabot[bot] in #4971
- chore(github-deps): bump actions/checkout from 6.0.1 to 6.0.2 by @dependabot[bot] in #4969
- chore(github-deps): bump actions/cache from 4.2.0 to 5.0.3 by @dependabot[bot] in #4963
- chore(github-deps): bump github/codeql-action from 4.32.3 to 4.32.4 by @dependabot[bot] in #4964
- chore(github-deps): bump actions/stale from 10.1.1 to 10.2.0 by @dependabot[bot] in #4966
- fix: fix connector_id resolution in agent provider by @jaideepr97 in #4853
- build: bump fallback_version to 0.5.2.dev0 post 0.5.1 release by @cdoern in #4959
- fix: pass request objects to Files API in Responses content conversion by @mattf in #4977
- fix: test_prepend_prompt_with_mixed_variables mock by @mattf in #4979
- feat: enforce max upload size for Files and File Processors APIs by @alinaryan in #4956
- feat: add OpenResponses conformance CI job with replay recordings by @cdoern in #4981
- feat(client-sdks): add hierarchical SDK build pipeline by @aegeiger in #4932
- feat: add top_p parameter support to responses API by @EleanorWho in #4820
- fix(docs): Updated llamastack pod metadata by @gyliu513 in #4983
- chore: move parse_data_url to common package by @mattf in #4982
- feat: record and replay provider exceptions in inferencing integration tests by @iamemilio in #4880
- feat: Use Structured Errors in Responses and Conversations API by @iamemilio in #4879
- fix: strip inline:: prefix from model in vector io tests by @mattf in #4993
- refactor: consolidate dynamic provider config parsing by @mattf in #4985
- feat: auto-merge PRs on stable release branches via Mergify + CI gate by @leseb in #4992
- refactor: use OpenAIErrorResponse model for consistent error responses by @iamemilio in #4883
- fix: populate required OpenResponses fields with non-null defaults by @cdoern in #4994
- feat: auto-merge dependabot github-deps PRs via Mergify by @leseb in #4995
- feat: Add top_logprobs parameter support by @gyliu513 in #4814
- feat: add support for 'frequency_penalty' param to Responses API by @nathan-weinberg in #4823
- feat: add support for 'presence_penalty' param to Responses API by @nathan-weinberg in #4830
- fix: correct PYPDF adapter method signature to match FileProcessors protocol by @alinaryan in #4998
- fix(responses): achieve full OpenResponses conformance by @cdoern in #4999
- fix(docs): Updated health check endpoint by @gyliu513 in #5000
- test: Add responses structured output integration tests by @msager27 in #4940
- feat: structured error handling in Responses API streaming by @iamemilio in #4942
- feat(client-sdks): add LlamaStackClient, httpx, and streaming by @aegeiger in #5001
- feat: accept list content blocks in Responses API function_call_output by @mattf in #4978
- refactor(PGVector): wrap gin index creation into a separate function by @r3v5 in #4980
- chore: consolidate backend-forwarded param tests into unified parametrized test by @mattf in #5003
- test: add integration tests for Responses and Conversations API errors by @iamemilio in #4881
- feat: allow stream usage from ollama when telemetry enabled by @mattf in #5011
- feat: allow stream usage from vllm when telemetry enabled by @mattf in #5010
- fix(registry): loosen register() idempotent checks for server restarts by @max-svistunov in #4976
- feat: add integration test for prompt_cache_key with openai client by @gyliu513 in #5016
- chore(github-deps): bump actions/github-script from 7.0.1 to 8.0.0 by @dependabot[bot] in #5025
- chore(github-deps): bump astral-sh/setup-uv from 7.3.0 to 7.3.1 by @dependabot[bot] in #5027
- chore(github-deps): bump actions/setup-java from 4.5.0 to 5.2.0 by @dependabot[bot] in #5019
- ci: add merge_group trigger to all PR-gating workflows by @cdoern in #5017
- feat(ci): automate post-release and pre-release version management by @cdoern in #4938
- test: Add prompt template test cases to the responses integration test… by @msager27 in #4950
- fix(stainless): handle [DONE] SSE terminator in streaming responses by @dtmeadows in #5012
- fix(security): pin google-cloud-aiplatform to >=1.131.0 by @derekhiggins in #5037
- feat(inference): bidirectional reasoning token passthrough for chat completions by @cdoern in #5038
- chore: remove unreachable tool_choice check in vllm adapter by @mattf in #5009
- feat(api): support extra_body pass-through in responses API by @codefromthecrypt in #4893
- docs: additional references to Docker Hub by @nathan-weinberg in #5044
- fix: add missing shutdown method to PyPDF file processor adapter by @alinaryan in #5047
- fix(llama-guard): less strict parsing of safety categories by @asimurka in #5045
- fix: OCI26ai SQL query patches by @rhdedgar in #5046
- fix(conversations): validate conv_ prefix consistently on all endpoints by @iamemilio in #5058
- fix(conversations): add ExceptionTranslatingRoute to conversations router by @iamemilio in #5057
- feat: allow model registration without provider API keys by @NickGagan in #5014
- chore: Rename test_openai_response.py to test_openai_responses.py by @gyliu513 in #5061
- fix(pypdf): Possible infinite loop when loading circular /Prev entries in cross-reference streams by @eoinfennessy in #5063
- chore: bump fallback_version to 0.5.3.dev0 after 0.5.2 release by @cdoern in #5065
- feat: passthrough safety provider for forwarding to downstream /v1/moderations by @skamenan7 in #5004
- feat: add conditional authentication provider configuration by @derekhiggins in #5002
- fix: NLTK Zip Slip Vulnerability by @eoinfennessy in #5062
- fix: use semantic JSON comparison for MCP approval argument matching by @iamemilio in #5080
- feat(vertexai): rewrite provider on google-genai with dynamic model listing by @major in #4951
- ci: temporarily disable CodeQL workflow on pull requests by @leseb in #5079
- chore: fix post-release workflow and remove broken docker image by @cdoern in #5064
- fix: Revert "temporarily disable CodeQL workflow on pull requests" by @cdoern in #5085
- fix: use canonical config loading in backward compat test by @leseb in #5081
- fix!: add content capture via otel by @gyliu513 in #5060
- feat!: add integration test for safety_identifier with openai client by @gyliu513 in #5018
- fix: poll test PyPI before building Docker images to avoid race condition by @cdoern in #5090
- feat: add regex pattern support to access_policy and route_policy by @derekhiggins in #4991
- feat: add integration test for truncation with openai client by @gyliu513 in #5084
- feat: integration test for top_p with openai client by @gyliu513 in #5083
- chore(github-deps): bump oven-sh/setup-bun from 2.1.2 to 2.1.3 by @dependabot[bot] in #5068
- chore(github-deps): bump github/codeql-action from 4.32.4 to 4.32.6 by @dependabot[bot] in #5070
- chore(github-deps): bump llamastack/llama-stack from 7d9786b to 6c700da by @dependabot[bot] in #5075
- chore(github-deps): bump actions/setup-node from 6.2.0 to 6.3.0 by @dependabot[bot] in #5067
- chore(github-deps): bump actions/download-artifact from 7.0.0 to 8.0.0 by @dependabot[bot] in #5020
- fix: process hang on exit with aiosqlite >= 0.22 by @shanemcd in #4589
- fix: treat hallucinated tool names as client-side function calls by @mattf in #5043
- test: add streaming web_search test cases to responses integration test suite by @msager27 in #4960
- feat(PGVector): implement ef_search parameter for HNSW vector index in PGVector by @r3v5 in #4933
- feat: improve inference performance via cached ssl context by @mattf in #4486
- ci: update Mergify config with auto-update, auto-approve, and fix merge method by @leseb in #5091
- feat: Add additional Tool runtime metrics by @gyliu513 in #4904
- feat: Add integration test for parallel_tool_calls with openai client by @gyliu513 in #5093
- feat: Enable Filters in OpenAI Search API by @franciscojavierarceo in #4471
- chore: Move background integration test to test_open_responses.py by @gyliu513 in #5094
- refactor(vertexai): extract network helper functions into utils module by @major in #5095
- feat!: new URL for AWS Bedrock and model list support by @are-ces in #4946
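
Several entries above add OpenAI-compatible request parameters to the Responses API (`top_p` in #4820, `truncation` in #4813, `prompt_cache_key` in #4775, `service_tier` in #4816, `frequency_penalty` in #4823, `presence_penalty` in #4830, `background` in #4824), and #4795 fixed the related problem of null fields leaking into requests. As a minimal sketch of how a client might assemble such a request body, assuming an OpenAI-compatible JSON payload (the `build_responses_request` helper, model name, and values below are illustrative, not part of the release):

```python
def build_responses_request(model, input, **params):
    """Assemble an OpenAI-compatible Responses API request body.

    Parameters left as None are dropped rather than sent as explicit
    nulls, mirroring the null-field exclusion fix in #4795.
    """
    body = {"model": model, "input": input}
    body.update({k: v for k, v in params.items() if v is not None})
    return body


# Hypothetical usage with parameters added in this release:
req = build_responses_request(
    "example-model",
    "Summarize this release in one sentence.",
    top_p=0.9,                # top_p support (#4820)
    truncation="auto",        # truncation parameter (#4813)
    prompt_cache_key="demo",  # prompt_cache_key (#4775)
    service_tier=None,        # unset, so omitted from the request
)
```

Dropping `None`-valued keys keeps the payload compatible with strict backends that reject explicit nulls for optional fields.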
## New Contributors
- @major made their first contribution in #4909
- @aegeiger made their first contribution in #4874
- @Artemon-line made their first contribution in #4935
- @max-svistunov made their first contribution in #4976
- @dtmeadows made their first contribution in #5012
- @NickGagan made their first contribution in #5014
- @shanemcd made their first contribution in #4589
**Full Changelog**: v0.5.2...v0.6.0