v0.6.0

Latest

@cdoern released this 11 Mar 15:01
· 3 commits to release-0.6.x since this release

What's Changed

  • chore: update convert_tooldef_to_openai_tool to match its usage by @mattf in #4837
  • feat!: improve consistency of post-training API endpoints by @eoinfennessy in #4606
  • fix: Arbitrary file write via a non-default configuration by @VaishnaviHire in #4844
  • chore: reduce uses of models.llama.datatypes by @mattf in #4847
  • docs: add technical release steps and improvements to RELEASE_PROCESS.md by @cdoern in #4792
  • chore: bump fallback version to 0.5.1 by @cdoern in #4846
  • fix: Exclude null 'strict' field in function tools to prevent OpenAI … by @gyliu513 in #4795
  • chore(test): add test to verify responses params make it to backend service by @mattf in #4850
  • chore: revert "fix: disable together banner (#4517)" by @mattf in #4856
  • fix: update together to work with latest api.together.xyz service (circa feb 2026) by @mattf in #4857
  • chore(github-deps): bump astral-sh/setup-uv from 7.2.0 to 7.3.0 by @dependabot[bot] in #4867
  • chore(github-deps): bump github/codeql-action from 4.32.0 to 4.32.2 by @dependabot[bot] in #4861
  • chore(github-deps): bump actions/cache from 5.0.2 to 5.0.3 by @dependabot[bot] in #4859
  • chore(github-deps): bump llamastack/llama-stack from 76bcb66 to c518b35 by @dependabot[bot] in #4858
  • fix(ci): ensure oasdiff is available for openai-coverage hook by @EleanorWho in #4835
  • fix: Deprecate items when create conversation by @gyliu513 in #4765
  • chore: refactor chunking to use configurable tiktoken encoding and document tokenizer limits by @mattf in #4870
  • chore: prune unused parts of models packages (checkpoint, tokenizer, prompt templates, datatypes) by @mattf in #4871
  • chore: prune unused utils from utils.memory.vector_store by @mattf in #4873
  • fix: Escape special characters in auto-generated provider documentati… by @gyliu513 in #4822
  • chore(docs): Use starter for opentelemetry integration test by @gyliu513 in #4875
  • fix: kvstore should call shutdown but not close by @gyliu513 in #4872
  • fix: uvicorn log ambiguity by @cdoern in #4522
  • chore(github-deps): bump actions/checkout from 4.2.2 to 6.0.2 by @dependabot[bot] in #4865
  • chore: cleanup mypy excludes by @mattf in #4876
  • feat: add integration test for max_output_tokens by @gyliu513 in #4825
  • chore(test): add test to verify responses params make it to backend s… by @gyliu513 in #4852
  • ci: add Docker image publishing to release workflow by @cdoern in #4882
  • feat: add ProcessFileRequest model to file_processors API by @alinaryan in #4885
  • docs: update responses api known limitations doc by @jaideepr97 in #4845
  • fix(vector_io): align Protocol signatures with request models by @skamenan7 in #4747
  • fix: add _ExceptionTranslatingRoute to prevent keep-alive breakage on Linux by @iamemilio in #4886
  • docs: add release notes for version 0.5 by @rhuss in #4855
  • fix(ci): disable uv cache cleanup when UV_NO_CACHE is set by @cdoern in #4889
  • feat: Add truncation parameter support by @gyliu513 in #4813
  • chore(ci): bump pinned action commit hashes in integration-tests.yml by @cdoern in #4895
  • docs: Add README for running observability test by @gyliu513 in #4884
  • fix: update rerank routing to match params by @mattf in #4900
  • feat: Add prompt_cache_key parameter support by @gyliu513 in #4775
  • chore: add rerank support to recorder by @mattf in #4903
  • feat: add rerank support to vllm inference provider by @mattf in #4902
  • fix(inference): use flat response message model for chat/completions by @cdoern in #4891
  • feat: add llama cpp server remote inference provider by @Bobbins228 in #4382
  • fix: Remove pillow as direct dependency by @VaishnaviHire in #4901
  • fix: pre-commit run -a by @mattf in #4907
  • fix(ci): Removed kotlin from preview builds by @gyliu513 in #4910
  • feat: Add service_tier parameter support by @gyliu513 in #4816
  • chore(github-deps): bump github/codeql-action from 4.32.2 to 4.32.3 by @dependabot[bot] in #4918
  • chore(github-deps): bump docker/login-action from 3.4.0 to 3.7.0 by @dependabot[bot] in #4916
  • chore(github-deps): bump llamastack/llama-stack from c7cdb40 to 4c1b03b by @dependabot[bot] in #4915
  • chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.10.0 to 1.11.6 by @dependabot[bot] in #4913
  • chore(github-deps): bump docker/build-push-action from 6.15.0 to 6.19.2 by @dependabot[bot] in #4912
  • fix(vertexai): raise descriptive error on auth failure instead of silent empty string by @major in #4909
  • fix: resolve StorageConfig default env vars at construction time by @major in #4897
  • feat: Add incomplete_details response property by @gyliu513 in #4812
  • feat(client-sdks): add OpenAPI Generator tooling by @aegeiger in #4874
  • fix(vector_io): eliminate duplicate call for vector store registration by @r3v5 in #4925
  • test(vertexai): add unit tests for VertexAI inference adapter by @major in #4927
  • feat: introduce new how-to blog by @cdoern in #4794
  • chore: remove reference to non-existent WeaviateRequestProviderData by @mattf in #4937
  • feat: standardized error types with HTTP status codes by @iamemilio in #4878
  • feat: add opentelemetry-distro to core dependencies by @Artemon-line in #4935
  • feat(ci): Add nightly job for doc build by @gyliu513 in #4911
  • fix: Ensure user isolation for stored conversations and responses by @jaideepr97 in #4834
  • fix: align chat completion usage schema with OpenAI spec by @cdoern in #4930
  • fix: allow conversation item type to be omitted by @mattf in #4948
  • feat: Enable inline PyPDF file_processors provider by @alinaryan in #4743
  • feat: add support for /responses background parameter by @cdoern in #4824
  • feat(vector_io): Implement Contextual Retrieval for improved RAG search quality by @r-bit-rry in #4750
  • chore: use SecretStr for x-llamastack-provider-data keys by @mattf in #4939
  • chore: remove unused vector store utils by @mattf in #4961
  • feat: auto-identify embedding models for vllm by @mattf in #4975
  • chore(github-deps): bump llamastack/llama-stack from 4c1b03b to 7d9786b by @dependabot[bot] in #4971
  • chore(github-deps): bump actions/checkout from 6.0.1 to 6.0.2 by @dependabot[bot] in #4969
  • chore(github-deps): bump actions/cache from 4.2.0 to 5.0.3 by @dependabot[bot] in #4963
  • chore(github-deps): bump github/codeql-action from 4.32.3 to 4.32.4 by @dependabot[bot] in #4964
  • chore(github-deps): bump actions/stale from 10.1.1 to 10.2.0 by @dependabot[bot] in #4966
  • fix: fix connector_id resolution in agent provider by @jaideepr97 in #4853
  • build: bump fallback_version to 0.5.2.dev0 post 0.5.1 release by @cdoern in #4959
  • fix: pass request objects to Files API in Responses content conversion by @mattf in #4977
  • fix: test_prepend_prompt_with_mixed_variables mock by @mattf in #4979
  • feat: enforce max upload size for Files and File Processors APIs by @alinaryan in #4956
  • feat: add OpenResponses conformance CI job with replay recordings by @cdoern in #4981
  • feat(client-sdks): add hierarchical SDK build pipeline by @aegeiger in #4932
  • feat: add top_p parameter support to responses API by @EleanorWho in #4820
  • fix(docs): Updated llamastack pod metadata by @gyliu513 in #4983
  • chore: move parse_data_url to common package by @mattf in #4982
  • feat: record and replay provider exceptions in inferencing integration tests by @iamemilio in #4880
  • feat: Use Structured Errors in Responses and Conversations API by @iamemilio in #4879
  • fix: strip inline:: prefix from model in vector io tests by @mattf in #4993
  • refactor: consolidate dynamic provider config parsing by @mattf in #4985
  • feat: auto-merge PRs on stable release branches via Mergify + CI gate by @leseb in #4992
  • refactor: use OpenAIErrorResponse model for consistent error responses by @iamemilio in #4883
  • fix: populate required OpenResponses fields with non-null defaults by @cdoern in #4994
  • feat: auto-merge dependabot github-deps PRs via Mergify by @leseb in #4995
  • feat: Add top_logprobs parameter support by @gyliu513 in #4814
  • feat: add support for 'frequency_penalty' param to Responses API by @nathan-weinberg in #4823
  • feat: add support for 'presence_penalty' param to Responses API by @nathan-weinberg in #4830
  • fix: correct PYPDF adapter method signature to match FileProcessors protocol by @alinaryan in #4998
  • fix(responses): achieve full OpenResponses conformance by @cdoern in #4999
  • fix(docs): Updated health check endpoint by @gyliu513 in #5000
  • test: Add responses structured output integration tests by @msager27 in #4940
  • feat: structured error handling in Responses API streaming by @iamemilio in #4942
  • feat(client-sdks): add LlamaStackClient, httpx, and streaming by @aegeiger in #5001
  • feat: accept list content blocks in Responses API function_call_output by @mattf in #4978
  • refactor(PGVector): wrap gin index creation into a separate function by @r3v5 in #4980
  • chore: consolidate backend-forwarded param tests into unified parametrized test by @mattf in #5003
  • test: add integration tests for Responses and Conversations API errors by @iamemilio in #4881
  • feat: allow stream usage from ollama when telemetry enabled by @mattf in #5011
  • feat: allow stream usage from vllm when telemetry enabled by @mattf in #5010
  • fix(registry): loosen register() idempotent checks for server restarts by @max-svistunov in #4976
  • feat: add integration test for prompt_cache_key with openai client by @gyliu513 in #5016
  • chore(github-deps): bump actions/github-script from 7.0.1 to 8.0.0 by @dependabot[bot] in #5025
  • chore(github-deps): bump astral-sh/setup-uv from 7.3.0 to 7.3.1 by @dependabot[bot] in #5027
  • chore(github-deps): bump actions/setup-java from 4.5.0 to 5.2.0 by @dependabot[bot] in #5019
  • ci: add merge_group trigger to all PR-gating workflows by @cdoern in #5017
  • feat(ci): automate post-release and pre-release version management by @cdoern in #4938
  • test: Add prompt template test cases to the responses integration test… by @msager27 in #4950
  • fix(stainless): handle [DONE] SSE terminator in streaming responses by @dtmeadows in #5012
  • fix(security): pin google-cloud-aiplatform to >=1.131.0 by @derekhiggins in #5037
  • feat(inference): bidirectional reasoning token passthrough for chat completions by @cdoern in #5038
  • chore: remove unreachable tool_choice check in vllm adapter by @mattf in #5009
  • feat(api): support extra_body pass-through in responses API by @codefromthecrypt in #4893
  • docs: additional references to Docker Hub by @nathan-weinberg in #5044
  • fix: add missing shutdown method to PyPDF file processor adapter by @alinaryan in #5047
  • fix(llama-guard): less strict parsing of safety categories by @asimurka in #5045
  • fix: OCI26ai sql query patches by @rhdedgar in #5046
  • fix(conversations): validate conv_ prefix consistently on all endpoints by @iamemilio in #5058
  • fix(conversations): add ExceptionTranslatingRoute to conversations router by @iamemilio in #5057
  • feat: allow model registration without provider API keys by @NickGagan in #5014
  • chore: Rename test_openai_response.py to test_openai_responses.py by @gyliu513 in #5061
  • fix: (pypdf) Possible infinite loop when loading circular /Prev entries in cross-reference streams by @eoinfennessy in #5063
  • chore: bump fallback_version to 0.5.3.dev0 after 0.5.2 release by @cdoern in #5065
  • feat: passthrough safety provider for forwarding to downstream /v1/moderations by @skamenan7 in #5004
  • feat: add conditional authentication provider configuration by @derekhiggins in #5002
  • fix: NLTK Zip Slip Vulnerability by @eoinfennessy in #5062
  • fix: use semantic JSON comparison for MCP approval argument matching by @iamemilio in #5080
  • feat(vertexai): rewrite provider on google-genai with dynamic model listing by @major in #4951
  • ci: temporarily disable CodeQL workflow on pull requests by @leseb in #5079
  • chore: fix post-release workflow and remove broken docker image by @cdoern in #5064
  • fix: Revert "temporarily disable CodeQL workflow on pull requests" by @cdoern in #5085
  • fix: use canonical config loading in backward compat test by @leseb in #5081
  • fix!: add content capture via otel by @gyliu513 in #5060
  • feat!: add integration test for safety_identifier with openai client by @gyliu513 in #5018
  • fix: poll test PyPI before building Docker images to avoid race condition by @cdoern in #5090
  • feat: add regex pattern support to access_policy and route_policy by @derekhiggins in #4991
  • feat: add integration test for truncation with openai client by @gyliu513 in #5084
  • feat: integration test for top_p with openai client by @gyliu513 in #5083
  • chore(github-deps): bump oven-sh/setup-bun from 2.1.2 to 2.1.3 by @dependabot[bot] in #5068
  • chore(github-deps): bump github/codeql-action from 4.32.4 to 4.32.6 by @dependabot[bot] in #5070
  • chore(github-deps): bump llamastack/llama-stack from 7d9786b to 6c700da by @dependabot[bot] in #5075
  • chore(github-deps): bump actions/setup-node from 6.2.0 to 6.3.0 by @dependabot[bot] in #5067
  • chore(github-deps): bump actions/download-artifact from 7.0.0 to 8.0.0 by @dependabot[bot] in #5020
  • fix: process hang on exit with aiosqlite >= 0.22 by @shanemcd in #4589
  • fix: treat hallucinated tool names as client-side function calls by @mattf in #5043
  • test: add streaming web_search test cases to responses integration test suite by @msager27 in #4960
  • feat(PGVector): implement ef_search parameter for HNSW vector index in PGVector by @r3v5 in #4933
  • feat: improve inference performance via cached ssl context by @mattf in #4486
  • ci: update Mergify config with auto-update, auto-approve, and fix merge method by @leseb in #5091
  • feat: Add additional Tool runtime metrics by @gyliu513 in #4904
  • feat: Add integration test for parallel_tool_calls with openai client by @gyliu513 in #5093
  • feat: Enable Filters in OpenAI Search API by @franciscojavierarceo in #4471
  • chore: Move background integration test to test_open_responses.py by @gyliu513 in #5094
  • refactor(vertexai): extract network helper functions into utils module by @major in #5095
  • feat!: new URL for AWS Bedrock and model list support by @are-ces in #4946
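
Several entries above add OpenAI-compatible sampling and caching parameters to the Responses API (top_p in #4820, frequency_penalty in #4823, presence_penalty in #4830, truncation in #4813, prompt_cache_key in #4775). A minimal sketch of a request exercising them together — parameter names follow the OpenAI spec, but their availability here is inferred from the PR titles, and the model id and server URL are illustrative placeholders, not confirmed defaults:

```python
# Request payload for the OpenAI-compatible /v1/responses endpoint.
# Parameter support is inferred from the PRs listed in this release,
# not from tested behavior.
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",  # placeholder model id
    "input": "Summarize the v0.6.0 release in one sentence.",
    "top_p": 0.9,               # nucleus sampling (#4820)
    "frequency_penalty": 0.2,   # (#4823)
    "presence_penalty": 0.1,    # (#4830)
    "truncation": "auto",       # (#4813)
    "prompt_cache_key": "release-notes-demo",  # (#4775)
}

# With the openai Python client pointed at a running Llama Stack server,
# the call would look like this (not executed here; URL is an assumption):
#   client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")
#   resp = client.responses.create(**payload)
```

Per #5084, #5083, and related test PRs, these parameters are exercised in the integration suite with the stock openai client, so no Llama-Stack-specific SDK is required to use them.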

New Contributors

Full Changelog: v0.5.2...v0.6.0