forked from llamastack/llama-stack
ntsh #4
     Open
      
      
derekhiggins wants to merge 422 commits into main from ntsh
Conversation
  
    
    
  
  
    
104d6a7 to ace97da (Compare)
  
…lamastack#3438)

Bumps [next](https://github.com/vercel/next.js) from 15.3.3 to 15.5.3.

Release notes (sourced from [next's releases](https://github.com/vercel/next.js/releases)):

**v15.5.3** (backporting bug fixes; does **not** include all pending features/changes on canary)
- fix: validation return types of pages API routes (#83069)
- fix: relative paths in dev in validator.ts (#83073)
- fix: remove `satisfies` keyword from type validation to preserve old TS compatibility (#83071)
- Credits: huge thanks to @bgub for helping!

**v15.5.2** (backporting bug fixes)
- fix: disable unknownatrules lint rule entirely (#83059)
- revert: add ?dpl to fonts in /_next/static/media (#83062)
- Credits: huge thanks to @bgub and @ztanner for helping!

**v15.5.1** (backporting bug fixes)
- fix: aliased navigations should apply scroll handling (#82900)
- Turbopack: fix invalid NFT entry with file behind symlink (#82887)
- fix: typesafe linking to route handlers and pages API routes (#82858)
- fix: change "noUnknownAtRules" to "warn" for Biome (#82974)
- fix: add path normalization to getRelativePath for Windows (#82918)
- feat: add typesafety with config.typedRoutes to redirect() and permanentRedirect() (#82860)
- fix: avoid importing types that will be unused (#82856)
- fix: update the config.api.responseLimit type (#82852)
- fix: update validation return types (#82854)
- Credits: huge thanks to @bgub, @mischnic, and @ztanner for helping!

**v15.5.1-canary.39**
- [metadata] change the metadata routes params to promises (#83560)

… (truncated)

Commits: `07d1cbc` v15.5.3; `db56d77` [backport] fix: validation return types of pages API routes (#83069) (#83580); `7a80623` [backport] fix: relative paths in dev in validator.ts (#83073) (#83190); `fddaeb8` [backport] fix: remove `satisfies` keyword from type validation to preserve o…; `497ec6a` v15.5.2; `bc72f41` [backport] revert: add ?dpl to fonts in `/_next/static/media` (#83062) (#83066); `c8faf68` [backport] fix: disable unknownatrules lint rule entirely (#83059) (#83060); `cc68ced` v15.5.1; `1ce9857` [backport] fix: update validation return types (#82854) (#83027); `b93c894` [backport] fix: update the config.api.responseLimit type (#82852) (#83028); additional commits viewable in the [compare view](https://github.com/vercel/next.js/compare/v15.3.3...v15.5.3).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Dependabot commands and options (comment on this PR): `@dependabot rebase`, `@dependabot recreate`, `@dependabot merge`, `@dependabot squash and merge`, `@dependabot cancel merge`, `@dependabot reopen`, `@dependabot close`, `@dependabot show <dependency name> ignore conditions`, `@dependabot ignore this major version`, `@dependabot ignore this minor version`, `@dependabot ignore this dependency`.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…lama_stack/ui (llamastack#3437)

Bumps [@radix-ui/react-select](https://github.com/radix-ui/primitives) from 2.2.5 to 2.2.6.

Commits: see the full diff in the [compare view](https://github.com/radix-ui/primitives/commits).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. The standard Dependabot commands and options apply.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…lamastack#3442)

# What does this PR do?

The @required_args decorator in openai-python is masking the async nature of the {AsyncCompletions, chat.AsyncCompletions}.create methods; see openai/openai-python#996. This means two things:

0. we cannot use iscoroutine in the recorder to detect async vs non-async
1. our mocks are inappropriately introducing identifiable async behavior

For (0), we update the iscoroutine check with detection of /v1/models, which is the only non-async function we mock and record. For (1), we could leave everything as is and assume (0) will catch errors. To be defensive, we update the unit tests to mock below the create methods, allowing the true openai-python create() methods to be tested.
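To illustrate the masking, here is a minimal, self-contained sketch using a simplified stand-in for the decorator (not openai-python's actual implementation):

```python
import asyncio
import functools
import inspect


def required_args(*names):
    """Simplified stand-in: a sync wrapper that hides the coroutine function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)  # returns a coroutine, but wrapper itself is sync
        return wrapper
    return decorator


class AsyncCompletions:
    @required_args("model")
    async def create(self, model: str) -> str:
        return f"completion for {model}"


# Static inspection sees a plain function, so a recorder can't rely on it...
print(inspect.iscoroutinefunction(AsyncCompletions.create))  # False

# ...even though calling it still produces a coroutine at runtime.
coro = AsyncCompletions().create(model="m")
print(asyncio.iscoroutine(coro))  # True
print(asyncio.run(coro))          # "completion for m"
```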
…nchmark resources in Llama Stack (llamastack#3371)

# What does this PR do?

This PR provides functionality for users to unregister ScoringFn and Benchmark resources for the `scoring` and `eval` APIs.

Closes llamastack#3051

## Test Plan

Updated integration and unit tests via the CI workflow.
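For reference, exercising the new unregister paths from a client would look roughly like this (a hedged sketch that assumes the unregister methods mirror the existing register calls; the IDs are placeholders):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Assumed method names, mirroring register(); IDs are placeholders.
client.scoring_functions.unregister(scoring_fn_id="my-scoring-fn")  # scoring API
client.benchmarks.unregister(benchmark_id="my-benchmark")           # eval API
```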
…tack#3417)

# What does this PR do?

Adds dynamic model support to TGI, and adds a new overwrite_completion_id feature to OpenAIMixin to deal with TGI always returning id="".

## Test Plan

tgi: `docker run --gpus all --shm-size 1g -p 8080:80 -v /data:/data ghcr.io/huggingface/text-generation-inference --model-id Qwen/Qwen3-0.6B`

stack: `TGI_URL=http://localhost:8080 uv run llama stack build --image-type venv --distro ci-tests --run`

test: `./scripts/integration-tests.sh --stack-config http://localhost:8321 --setup tgi --subdirs inference --pattern openai`
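The id workaround boils down to something like this sketch (illustrative names, not the actual OpenAIMixin code):

```python
import uuid


def overwrite_completion_id(response, enabled: bool):
    """TGI returns id="" on completions; synthesize an id so downstream
    consumers that key on it keep working. Illustrative, not the real mixin."""
    if enabled and not response.id:
        response.id = f"chatcmpl-{uuid.uuid4().hex}"
    return response
```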
# What does this PR do?
* use a logger
* update the distro to add the Files API, otherwise it won't start since it is a dependency of vector
* clarify project_id and api_key requirements
* disable openai compatible calls since the endpoint returns 404
* disable text_inference structured format tests
* fixed openai client initialization
## Test Plan
Execute text_inference:
```
WATSONX_API_KEY=... WATSONX_PROJECT_ID=... python -m llama_stack.core.server.server llama_stack/distributions/watsonx/run.yaml
LLAMA_STACK_CONFIG=http://localhost:8321 uv run --group test pytest -vvvv -ra --text-model watsonx/meta-llama/llama-3-3-70b-instruct tests/integration/inference/test_text_inference.py
============================================= test session starts ==============================================
platform darwin -- Python 3.12.8, pytest-8.4.2, pluggy-1.6.0 -- /Users/leseb/Documents/AI/llama-stack/.venv/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.12.8', 'Platform': 'macOS-15.6.1-arm64-arm-64bit', 'Packages': {'pytest': '8.4.2', 'pluggy': '1.6.0'}, 'Plugins': {'anyio': '4.9.0', 'html': '4.1.1', 'socket': '0.7.0', 'asyncio': '1.1.0', 'json-report': '1.5.0', 'timeout': '2.4.0', 'metadata': '3.1.1', 'cov': '6.2.1', 'nbval': '0.11.0', 'hydra-core': '1.3.2'}}
rootdir: /Users/leseb/Documents/AI/llama-stack
configfile: pyproject.toml
plugins: anyio-4.9.0, html-4.1.1, socket-0.7.0, asyncio-1.1.0, json-report-1.5.0, timeout-2.4.0, metadata-3.1.1, cov-6.2.1, nbval-0.11.0, hydra-core-1.3.2
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 20 items
tests/integration/inference/test_text_inference.py::test_text_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [  5%]
tests/integration/inference/test_text_inference.py::test_text_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:sanity] PASSED [ 10%]
tests/integration/inference/test_text_inference.py::test_text_completion_stop_sequence[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:stop_sequence] XFAIL [ 15%]
tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] XFAIL [ 20%]
tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] XFAIL [ 25%]
tests/integration/inference/test_text_inference.py::test_text_completion_structured_output[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:structured_output] SKIPPED (structured output) [ 30%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_01] PASSED [ 35%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_01] PASSED [ 40%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 45%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 50%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_required[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 55%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_none[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling] PASSED [ 60%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_structured_output[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:structured_output] SKIPPED (structured output) [ 65%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling_tools_absent-True] PASSED [ 70%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:text_then_tool] XFAIL [ 75%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:non_streaming_02] PASSED [ 80%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:streaming_02] PASSED [ 85%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_calling_tools_absent-False] PASSED [ 90%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_then_answer] XFAIL [ 95%]
tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:array_parameter] XFAIL [100%]
=========================================== short test summary info ============================================
SKIPPED [2] tests/integration/inference/test_text_inference.py:49: Model watsonx/meta-llama/llama-3-3-70b-instruct hosted by remote::watsonx doesn't support json_schema structured output
XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_stop_sequence[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:stop_sequence] - remote::watsonx doesn't support 'stop' parameter yet
XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_non_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] - remote::watsonx doesn't support log probs yet
XFAIL tests/integration/inference/test_text_inference.py::test_text_completion_log_probs_streaming[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:completion:log_probs] - remote::watsonx doesn't support log probs yet
XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:text_then_tool] - Not tested for non-llama4 models yet
XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:tool_then_answer] - Not tested for non-llama4 models yet
XFAIL tests/integration/inference/test_text_inference.py::test_text_chat_completion_with_multi_turn_tool_calling[txt=watsonx/meta-llama/llama-3-3-70b-instruct-inference:chat_completion:array_parameter] - Not tested for non-llama4 models yet
============================ 12 passed, 2 skipped, 6 xfailed, 14 warnings in 36.88s ============================
```
---------
Signed-off-by: Sébastien Han <[email protected]>
# What does this PR do?

This document outlines the different API stability levels, how to enforce them, and next steps.

## Next Steps

Following the adoption of this document, all existing APIs should follow the enforcement protocol.

Relates to llamastack#3237

Signed-off-by: Charlie Doern <[email protected]>
# What does this PR do?

Pin to the latest pydantic version, 2.11.9, as sometimes we pick up an older version and fail to start the container in GitHub Actions: https://github.com/llamastack/llama-stack-ops/actions/runs/17750263127

Closes llamastack#3461

## Test Plan

Tested locally with the following commands to start a container.

Build the container: `llama stack build --distro starter --image-type container`

Start the container: `docker run -d -p 8321:8321 --name llama-stack-test distribution-starter:0.2.21`

Check health: http://localhost:8321/v1/health

Couldn't repro with the older version (`2.8.2`), but with pydantic `2.11.9` the container starts. Per https://pypi.org/project/pydantic/#history, 2.11.9 is the latest version.
…dapter (llamastack#3458)

# What does this PR do?

Adds embedding and dynamic model support to the Together inference adapter:

- updated to use OpenAIMixin
- workarounds for Together API quirks
- recordings for the together suite when subdirs=inference, pattern=openai

## Test Plan

```
$ TOGETHER_API_KEY=_NONE_ ./scripts/integration-tests.sh --stack-config server:ci-tests --setup together --subdirs inference --pattern openai
...
tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:sanity]
instantiating llama_stack_client
Port 8321 is already in use, assuming server is already running...
llama_stack_client instantiated in 0.121s
PASSED [  2%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_non_streaming_suffix[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:suffix] SKIPPED [  4%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_streaming[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:completion:sanity] PASSED [  6%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-1] SKIPPED [  8%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_guided_choice[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free] SKIPPED [ 10%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_01] PASSED [ 12%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] PASSED [ 14%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] SKIPPED [ 17%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 19%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 21%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming_with_file[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free] SKIPPED [ 23%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 25%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_multiple_strings[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 27%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_float[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 29%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_dimensions[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 31%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_user_parameter[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 34%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_empty_list_error[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 36%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_invalid_model_error[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 38%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_different_inputs_different_outputs[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 40%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_base64[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 42%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_base64_batch_processing[openai_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 44%]
tests/integration/inference/test_openai_completion.py::test_openai_completion_prompt_logprobs[txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-0] SKIPPED [ 46%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_02] PASSED [ 48%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] PASSED [ 51%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] SKIPPED [ 53%]
tests/integration/inference/test_openai_completion.py::test_inference_store[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 55%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[openai_client-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 57%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_single_string[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 59%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_multiple_strings[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 61%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_float[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 63%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_dimensions[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 65%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_user_parameter[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 68%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_empty_list_error[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 70%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_invalid_model_error[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 72%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_different_inputs_different_outputs[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] PASSED [ 74%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_with_encoding_format_base64[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 76%]
tests/integration/inference/test_openai_embeddings.py::test_openai_embeddings_base64_batch_processing[llama_stack_client-emb=together/togethercomputer/m2-bert-80M-32k-retrieval] SKIPPED [ 78%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_01] PASSED [ 80%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] PASSED [ 82%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_01] SKIPPED [ 85%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 87%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-True] PASSED [ 89%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_non_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:non_streaming_02] PASSED [ 91%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] PASSED [ 93%]
tests/integration/inference/test_openai_completion.py::test_openai_chat_completion_streaming_with_n[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-inference:chat_completion:streaming_02] SKIPPED [ 95%]
tests/integration/inference/test_openai_completion.py::test_inference_store[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [ 97%]
tests/integration/inference/test_openai_completion.py::test_inference_store_tool_calls[client_with_models-txt=together/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free-False] PASSED [100%]
============================================ 30 passed, 17 skipped, 50 deselected, 4 warnings in 21.96s =============================================
```
# What does this PR do?

Modified the code in registry.py. The key changes are:

1. Removed the `return False` statement.
2. Added a warning log message that includes the object type, identifier, and provider_id for better debugging.
3. The method now continues with the registration process instead of returning early.

---------

Co-authored-by: Omar Abdelwahab <[email protected]>
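In outline, the change behaves like this sketch (helper and field names are illustrative, not the exact registry.py code):

```python
import logging

logger = logging.getLogger(__name__)


def register_object(registry: dict, obj_type: str, identifier: str, provider_id: str, obj) -> None:
    if (obj_type, identifier) in registry:
        # Before: `return False` here silently dropped the registration.
        # Now: warn with type/identifier/provider_id for debugging, then continue.
        logger.warning(
            "%s '%s' (provider '%s') is already registered; continuing with registration",
            obj_type, identifier, provider_id,
        )
    registry[(obj_type, identifier)] = obj
```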
Add default value for PR_HEAD_REPO to prevent 'unbound variable' error when no PR exists for a branch. Signed-off-by: Derek Higgins <[email protected]>
# What does this PR do?
Fixes this warning in llama stack build:
```bash
WARNING  2025-09-15 15:29:02,197 llama_stack.core.distribution:149 core: Failed to import module prompts: No module named
         'llama_stack.providers.registry.prompts'
```
## Test Plan
Test added
---------
Signed-off-by: Francisco Javier Arceo <[email protected]>
# What does this PR do?

* Updates documentation links from readthedocs to llamastack.github.io

## Test Plan

* Manual testing
…mastack#3472)

# What does this PR do?

When registering a dataset for NVIDIA, the DatasetsRoutingTable expects `nvidia` to be passed via the `provider_id` ([here](https://github.com/llamastack/llama-stack/blob/main/llama_stack/core/routing_tables/datasets.py#L61)). This PR fixes a notebook to correctly use `provider_id`.

Closes llamastack#3308

## Test Plan

Manually execute the notebook steps to verify the dataset is registered.

Co-authored-by: Jash Gulabrai <[email protected]>
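The corrected registration call looks roughly like this (a sketch; the dataset fields are placeholders and the exact register() signature may differ across client versions):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# The DatasetsRoutingTable keys on provider_id, so it must be "nvidia" here.
client.datasets.register(
    dataset_id="my-eval-dataset",               # placeholder
    provider_id="nvidia",
    url={"uri": "hf://datasets/placeholder"},   # placeholder source
)
```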
# What does this PR do?

Updates the qdrant provider's convert_id function to use a FIPS-validated cryptographic hashing function, so that llama-stack is considered to be `Designed for FIPS`.

The standard library `uuid.uuid5()` function uses SHA-1 under the hood, which is not FIPS-validated. This commit uses an approach similar to the one merged in llamastack#3423.

Closes llamastack#3476.

## Test Plan

Unit tests from scripts/unit-tests.sh were run to verify that the tests pass.

A small test script can display the data flow:

```python
import hashlib
import uuid

# Input
_id = "chunk_abc123"
print(_id)

# Step 1: Format and encode
hash_input = f"qdrant_id:{_id}".encode()
print(hash_input)
# Result: b'qdrant_id:chunk_abc123'

# Step 2: SHA-256 hash
sha256_hash = hashlib.sha256(hash_input).hexdigest()
print(sha256_hash)
# Result: "184893a6eafeaac487cb9166351e8625b994d50f3456d8bc6cea32a014a27151"

# Step 3: Create UUID from first 32 chars
uuid_string = str(uuid.UUID(sha256_hash[:32]))
print(uuid_string)
# sha256_hash[:32] = "184893a6eafeaac487cb9166351e8625"
# Final result: "184893a6-eafe-aac4-87cb-9166351e8625"
```

Signed-off-by: Doug Edgar <[email protected]>
…lamastack#3388)

# What does this PR do?

*Add dynamic authentication token forwarding support for the vLLM provider.*

This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:
- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding the new dynamic capabilities.

## Test Plan

```
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Authorization: Bearer my-dynamic-token" \
  -H "X-LlamaStack-Provider-Data: {\"vllm_api_token\": \"Bearer my-dynamic-token\", \"vllm_url\": \"http://dynamic-server:8000\"}" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]}'
```

---------

Signed-off-by: Akram Ben Aissi <[email protected]>
# What does this PR do?

This replaces the static model listing for any provider using OpenAIMixin. Currently:

- anthropic
- azure openai
- gemini
- groq
- llama-api
- nvidia
- openai
- sambanova
- tgi
- vertexai
- vllm
- not changed: together has its own impl

## Test Plan

- new unit tests
- manual for llama-api, openai, groq, gemini:

```
for provider in llama-openai-compat openai groq gemini; do
  uv run llama stack build --image-type venv --providers inference=remote::$provider --run &
  uv run --with llama-stack-client llama-stack-client models list | grep Total
done
```

results (17 sep 2025):
- llama-api: 4
- openai: 86
- groq: 21
- gemini: 66

closes llamastack#3467
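Conceptually, dynamic listing just asks the provider's OpenAI-compatible `/v1/models` endpoint at runtime instead of shipping a hard-coded table; a minimal sketch (endpoint URL and key are placeholders):

```python
from openai import OpenAI

# Any OpenAI-compatible provider endpoint works here; URL and key are placeholders.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

available = [m.id for m in client.models.list()]
print(f"Total: {len(available)} models")
```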
…-compat functions (llamastack#3395)

# What does this PR do?

Update the Ollama inference provider to use OpenAIMixin for the openai-compat endpoints.

## Test Plan

ci
# What does this PR do?
The rag-runtime tool requires the files API as a dependency, but the NVIDIA
distribution was missing the files provider configuration. Thus, when
running:
```
llama stack build --distro nvidia --image-type venv
```
And then:
```
llama stack run {path_to_distribution_config} --image-type venv
```
It would raise an error:
```
RuntimeError: Failed to resolve 'tool_runtime' provider 'rag-runtime' of type 'inline::rag-runtime': required dependency 'files' is not available. Please add a 'files' provider to your configuration or check if the provider is properly configured.
```
This PR fixes the issue by adding the missing files provider to the NVIDIA
distribution.
## Test Plan
N/A
# What does this PR do?

Currently `RemoteProviderSpec` has an `AdapterSpec` embedded in it. Remove `AdapterSpec`, and put its leftover fields into `RemoteProviderSpec`. Additionally, many of the fields were duplicated between `InlineProviderSpec` and `RemoteProviderSpec`; move these to `ProviderSpec` so they are shared. Fix up the distro codegen to use `RemoteProviderSpec` directly rather than `remote_provider_spec`, which took an `AdapterSpec` and returned a full provider spec.

## Test Plan

Existing distro tests should pass.

Signed-off-by: Charlie Doern <[email protected]>
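Schematically, the consolidation looks like this (field names are illustrative, not the exact llama-stack definitions):

```python
from pydantic import BaseModel


class ProviderSpec(BaseModel):
    api: str
    provider_type: str
    pip_packages: list[str] = []  # formerly duplicated on both subclasses


class InlineProviderSpec(ProviderSpec):
    module: str


class RemoteProviderSpec(ProviderSpec):
    # fields that used to live on the embedded AdapterSpec
    adapter_type: str
    module: str
```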
# What does this PR do?

As shown in llamastack#3421, we can scale the stack to handle more RPS with k8s replicas. This PR enables a multi-process stack with `uvicorn --workers` so that we can achieve the same scaling without being in k8s.

To achieve that, we refactor main to split out the app construction logic. This method needs to be non-async. We created a new `Stack` class to house impls and have a `start()` method to be called in lifespan to start background tasks, instead of starting them in the old `construct_stack`. This way we avoid having to manage an event loop manually.

## Test Plan

CI

> uv run --with llama-stack python -m llama_stack.core.server.server benchmarking/k8s-benchmark/stack_run_config.yaml

works.

> LLAMA_STACK_CONFIG=benchmarking/k8s-benchmark/stack_run_config.yaml uv run uvicorn llama_stack.core.server.server:create_app --port 8321 --workers 4

works.
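The shape of the refactor, as a hedged sketch (a synchronous factory each uvicorn worker can call, with background tasks deferred to the lifespan hook; not the actual server code):

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI


class Stack:
    """Houses impls; construction is synchronous so every worker can build it."""

    def __init__(self, config_path: str):
        self.config_path = config_path  # impls would be resolved here, no event loop needed

    async def start(self) -> None:
        """Start background tasks on the worker's own event loop."""


def create_app(config_path: str = "run.yaml") -> FastAPI:
    stack = Stack(config_path)

    @asynccontextmanager
    async def lifespan(app: FastAPI):
        await stack.start()  # background tasks begin once the worker's loop exists
        yield

    return FastAPI(lifespan=lifespan)
```

Because each worker process builds its own app and starts its own background tasks inside lifespan, nothing has to share an event loop across processes.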
# What does this PR do?

This PR fixes a blocking issue in the detailed RAG tutorial where the code fails with a 400 Bad Request error. The root cause is that recent versions of Llama Stack ignore the client-generated vector_db_id and assign a new server-side ID. The tutorial was not updated to reflect this, causing the rag_tool.insert call to fail.

This change updates the code to capture the authoritative ID from the .identifier attribute of the register() method's response. This ensures the tutorial code runs successfully and reflects the current API behavior.

## Test Plan

The fix can be verified by running the Python code snippet from the detailed tutorial page.

Run the original code (before this change): the script fails with a 400 Bad Request error on the rag_tool.insert step.

Run the updated code (after this change): the script runs successfully to completion.

Co-authored-by: Adam Young <[email protected]>
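The essence of the tutorial fix, as a sketch (model name and documents are placeholders):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# The server may ignore a client-generated ID, so capture the authoritative
# one from the registration response instead.
response = client.vector_dbs.register(
    vector_db_id="my_documents",          # may be replaced server-side
    embedding_model="all-MiniLM-L6-v2",   # placeholder
)
vector_db_id = response.identifier        # server-assigned, authoritative

documents: list = []  # placeholder: the tutorial's RAGDocument list
client.tool_runtime.rag_tool.insert(
    documents=documents,
    vector_db_id=vector_db_id,
    chunk_size_in_tokens=512,
)
```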
…_stack/ui (llamastack#3789)

Bumps [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom) from 19.2.0 to 19.2.1.

Commits: see the full diff in the [compare view](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. The standard Dependabot commands and options apply.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…lamastack#3791)

Bumps [eslint](https://github.com/eslint/eslint) from 9.26.0 to 9.37.0.

Release notes (sourced from [eslint's releases](https://github.com/eslint/eslint/releases)):

**v9.37.0**

Features
- `39f7fb4` feat: `preserve-caught-error` should recognize all static "cause" keys (#20163) (Pixel998)
- `f81eabc` feat: support TS syntax in `no-restricted-imports` (#19562) (Nitin Kumar)

Bug Fixes
- `a129cce` fix: correct `no-loss-of-precision` false positives for leading zeros (#20164) (Francesco Trotta)
- `09e04fc` fix: add missing AST token types (#20172) (Pixel998)
- `861c6da` fix: correct `ESLint` typings (#20122) (Pixel998)

Documentation
- `b950359` docs: fix typos across the docs (#20182) (루밀LuMir)
- `42498a2` docs: improve ToC accessibility by hiding non-semantic character (#20181) (Percy Ma)
- `29ea092` docs: Update README (GitHub Actions Bot)
- `5c97a04` docs: show `availableUntil` in deprecated rule banner (#20170) (Pixel998)
- `90a71bf` docs: update `README` files to add badge and instructions (#20115) (루밀LuMir)
- `1603ae1` docs: update references from `master` to `main` (#20153) (루밀LuMir)

Chores
- `afe8a13` chore: update `@eslint/js` dependency to version 9.37.0 (#20183) (Francesco Trotta)
- `abee4ca` chore: package.json update for `@eslint/js` release (Jenkins)
- `fc9381f` chore: fix typos in comments (#20175) (overlookmotel)
- `e1574a2` chore: unpin jiti (#20173) (renovate[bot])
- `e1ac05e` refactor: mark `ESLint.findConfigFile()` as `async`, add missing docs (#20157) (Pixel998)
- `347906d` chore: update eslint (#20149) (renovate[bot])
- `0cb5897` test: remove tmp dir created for circular fixes in multithread mode test (#20146) (Milos Djermanovic)
- `bb99566` ci: pin `jiti` to version 2.5.1 (#20151) (Pixel998)
- `177f669` perf: improve worker count calculation for `"auto"` concurrency (#20067) (Francesco Trotta)
- `448b57b` chore: Mark deprecated formatting rules as available until v11.0.0 (#20144) (Milos Djermanovic)

**v9.36.0**

Features
- `47afcf6` feat: correct `preserve-caught-error` edge cases (#20109) (Francesco Trotta)

Bug Fixes
- `75b74d8` fix: add missing rule option types (#20127) (ntnyq)
- `1c0d850` fix: update `eslint-all.js` to use `Object.freeze` for `rules` object (#20116) (루밀LuMir)
- `7d61b7f` fix: add missing scope types to `Scope.type` (#20110) (Pixel998)
- `7a670c3` fix: correct rule option typings in `rules.d.ts` (#20084) (Pixel998)

Documentation
- `b73ab12` docs: update examples to use `defineConfig` (#20131) (sethamus)
- `31d9392` docs: fix typos (#20118) (Pixel998)
- `c7f861b` docs: Update README (GitHub Actions Bot)
- `6b0c08b` docs: Update README (GitHub Actions Bot)
- `91f97c5` docs: Update README (GitHub Actions Bot)

Chores
- `12411e8` chore: upgrade `@eslint/js`@9.36.0 (#20139) (Milos Djermanovic)
- `488cba6` chore: package.json update for `@eslint/js` release (Jenkins)

… (truncated)

Commits: `d5d1bdf` 9.37.0; `94865ff` Build: changelog update for 9.37.0; `afe8a13` chore: update `@eslint/js` dependency to version 9.37.0 (#20183); `abee4ca` chore: package.json update for `@eslint/js` release; `b950359` docs: fix typos across the docs (#20182); `42498a2` docs: improve ToC accessibility by hiding non-semantic character (#20181); `fc9381f` chore: fix typos in comments (#20175); `e1574a2` chore: unpin jiti (#20173); `29ea092` docs: Update README; `a129cce` fix: correct `no-loss-of-precision` false positives for leading zeros (#20164); additional commits viewable in the [compare view](https://github.com/eslint/eslint/compare/v9.26.0...v9.37.0).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. The standard Dependabot commands and options apply.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…a_stack/ui (llamastack#3792)

Bumps [framer-motion](https://github.com/motiondivision/motion) from 12.23.12 to 12.23.24.

Changelog (sourced from [framer-motion's changelog](https://github.com/motiondivision/motion/blob/main/CHANGELOG.md)):

**[12.23.24] 2025-10-10** — Fixed
- Ensure that when a component remounts, it continues to fire animations even when `initial={false}`.

**[12.23.23] 2025-10-10** — Added
- Exporting `PresenceChild` and `PopChild` type for internal use.

**[12.23.22] 2025-09-25** — Added
- Exporting `HTMLElements` and `useComposedRefs` type for internal use.

**[12.23.21] 2025-09-24** — Fixed
- Fixing main-thread `scroll` with animations that contain `delay`.

**[12.23.20] 2025-09-24** — Fixed
- Suppress non-animatable value warning for instant animations.

**[12.23.19] 2025-09-23** — Fixed
- Remove support for changing `ref` prop.

**[12.23.18] 2025-09-19** — Fixed
- `<motion />` components now support changing `ref` prop.

**[12.23.17] 2025-09-19** — Fixed
- Ensure `animate()` `onComplete` only fires once, when all values are complete.

**[12.23.16] 2025-09-19**

… (truncated)

Commits: `b5df740` v12.23.24; `808ebce` Updating changelog; `237eee2` v12.23.23; `834965c` Updating changelog; `4069086` Update README.md; `6da6b61` Update README.md with new sponsor links; `e366831` Update README.md; `7796f4f` Update Gold section with new links and images; `d1bb937` Update sponsor section in README.md; `97fba16` Update sponsorship logos in README; additional commits viewable in the [compare view](https://github.com/motiondivision/motion/compare/v12.23.12...v12.23.24).

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. The standard Dependabot commands and options apply.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…tack/ui (llamastack#3788) Bumps [lucide-react](https://github.com/lucide-icons/lucide/tree/HEAD/packages/lucide-react) from 0.542.0 to 0.545.0.

Release notes highlights (sourced from [lucide-react's releases](https://github.com/lucide-icons/lucide/releases)):
- 0.545.0: changed the `flame`, `combine`, and `building-2` icons; arcified the `square-m` icon; added a `motorbike` icon; bumped vite from 6.3.5 to 6.3.6 and devalue from 5.1.1 to 5.3.2
- 0.544.0: updated lucide-static documentation about raw string imports; added an `ev-charger` icon
- 0.543.0: changed the `church`, `calendar-cog`, `panel-top-bottom-dashed`, and `message-square-quote` icons plus list/text and derived icons; optimised the `bug` icons; added metadata tags for `messages-square`, `ship`, and `id-card-lanyard`; put the preview-comment x-ray at top when more than 7 icons change; added a new package for Flutter; added a `house-heart` icon; bumped astro from 5.5.2 to 5.13.2

Full changelog: [compare view](https://github.com/lucide-icons/lucide/compare/0.542.0...0.545.0)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…3796) # What does this PR do?

Remove usage of the deprecated `Message` type from the Safety APIs.

## Test Plan

CI
…lamastack#3794) Applies the same pattern from llamastack#3777 to the embeddings and vector_stores.create() endpoints. This should _not_ be a breaking change since (a) our tests were already passing the `extra_body` parameter to the backend, but (b) the backend probably wasn't extracting the parameters correctly. This PR fixes that. Updated APIs: `openai_embeddings()`, `openai_create_vector_store()`, `openai_create_vector_store_file_batch()`

## Test Plan

Added unit tests
Fixed the CI job to check the correct directory for file changes. Artifacts are now stored in multiple directories, not just ./tests/integration/recordings. Signed-off-by: Derek Higgins <[email protected]>
llamastack#3802) # What does this PR do?

Two main changes:
1. Removes the `provider_id` requirement in calls to vector stores, and
2. Removes the "register first embedding model" logic; the embedding model id is now required on vector store creation.

This simplifies the UX for OpenAI to:

```python
vs = client.vector_stores.create(
    name="my_citations_db",
    extra_body={
        "embedding_model": "ollama/nomic-embed-text:latest",
    }
)
```

## Test Plan

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
# What does this PR do?

This commit migrates the authentication system from python-jose to PyJWT to eliminate the dependency on the archived rsa package. The migration includes:

- Refactored OAuth2TokenAuthProvider to use PyJWT's PyJWKClient for clean JWKS handling
- Removed the manual JWKS fetching, caching, and key-extraction logic in favor of PyJWT's built-in functionality

The new implementation is cleaner, more maintainable, and follows PyJWT best practices while maintaining full backward compatibility.

## Test Plan

Unit tests. Auth CI.

---------

Signed-off-by: Sébastien Han <[email protected]>
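For illustration, a minimal sketch (not the PR's exact code) of token validation with PyJWT's `PyJWKClient`; the function name and the JWKS URL, audience, and issuer parameters are assumed stand-ins for the provider's configuration:

```python
import jwt
from jwt import PyJWKClient


def validate_token(token: str, jwks_uri: str, audience: str, issuer: str) -> dict:
    # PyJWKClient fetches and caches the JWKS and selects the signing key
    # matching the token's "kid" header, replacing manual JWKS handling.
    signing_key = PyJWKClient(jwks_uri).get_signing_key_from_jwt(token)
    # jwt.decode verifies the signature and standard claims in one call.
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=audience,
        issuer=issuer,
    )
```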
# What does this PR do?

This PR fixes issues with the WatsonX provider so it works correctly with LiteLLM. The main problem was that WatsonX requests failed because the provider data validator didn't properly handle the API key and project ID. This was fixed by updating the WatsonXProviderDataValidator and ensuring the provider data is loaded correctly. The openai_chat_completion method was also updated to match the behavior of other providers while adding WatsonX-specific fields like project_id. It still calls `await super().openai_chat_completion.__func__(self, params)` to keep the existing setup and tracing logic. After these changes, WatsonX requests run correctly.

## Test Plan

The changes were tested by running chat completion requests and confirming that credentials and project parameters are passed correctly. I tested with my WatsonX credentials using the CLI with `uv run llama-stack-client inference chat-completion --session`.

---------

Signed-off-by: Sébastien Han <[email protected]>
Co-authored-by: Sébastien Han <[email protected]>
…mbed-text-v1.5 in Llama Stack (llamastack#3183) # What does this PR do?

The purpose of this PR is to replace Llama Stack's default embedding model with nomic-embed-text-v1.5. These are the key reasons why the Llama Stack community decided to switch from all-MiniLM-L6-v2 to nomic-embed-text-v1.5:

1. The training data for [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2#training-data) includes many data sets with various licensing terms, so it is tricky to know when/whether it is appropriate to use this model for commercial applications.
2. The model is not particularly competitive on major benchmarks. For example, if you look at the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and click on Miscellaneous/BEIR to see English information retrieval accuracy, you see that the top of the leaderboard is dominated by enormous models, but also that there are many, many models of relatively modest size with much higher Retrieval scores. If you want to look closely at the data, I recommend clicking "Download Table" because it is easier to browse that way.

More discussion can be found [here](llamastack#2418)

Closes llamastack#2418

## Test Plan

1. Run `./scripts/unit-tests.sh`
2. Integration tests via CI workflow

---------

Signed-off-by: Sébastien Han <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Francisco Arceo <[email protected]>
Co-authored-by: Sébastien Han <[email protected]>
…3807) # What does this PR do?

Updates CONTRIBUTING.md with the following changes:
- Use Python 3.12 (and why)
- Use pre-commit==4.3.0
- Recommend running pre-commit with -v to get detailed information about why a hook is failing, as in the sketch after this list
- Instruct users to go to the docs/ directory before rebuilding the docs (the build doesn't work otherwise)

Signed-off-by: Bill Murdock <[email protected]>
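A minimal sketch of the recommended workflow under those assumptions; the exact commands live in CONTRIBUTING.md:

```bash
# Pin the recommended pre-commit version (illustrative; your setup may use uv).
pip install pre-commit==4.3.0

# Run all hooks; -v prints detailed output explaining why a failing hook failed.
pre-commit run --all-files -v
```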
) # What does this PR do?

As discussed on Discord, we do not need to reinvent the wheel for telemetry. Instead we'll lean into the canonical OTEL stack. Logs/traces/metrics will still be sent via OTEL; they just won't be stored on, or queried through, the Stack. This is the first of many PRs to remove the telemetry API from the Stack.

1) Removed webmethod decorators to remove them from the API spec
2) Removed tests, as @iamemilio is adding them on OTEL directly

## Test Plan
…ric embedding models for NVIDIA Inference Provider (llamastack#3804) # What does this PR do?

Previously, the NVIDIA inference provider implemented a custom `openai_embeddings` method with a hardcoded `input_type="query"` parameter, which is required by NVIDIA asymmetric embedding models ([llamastack#3205](https://github.com/llamastack/llama-stack/pull/3205)). Recently, an `extra_body` parameter was added to the embeddings API ([llamastack#3794](https://github.com/llamastack/llama-stack/pull/3794)). This PR therefore updates the NVIDIA inference provider to use the base `OpenAIMixin.openai_embeddings` method instead and pass `input_type` through the `extra_body` parameter for asymmetric embedding models.

## Test Plan

Run the following command for each `embedding_model`: `nvidia/llama-3.2-nv-embedqa-1b-v2`, `nvidia/nv-embedqa-e5-v5`, `nvidia/nv-embedqa-mistral-7b-v2`, and `snowflake/arctic-embed-l`:

```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model={embedding_model} --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com" --inference-mode=record
```
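As a usage illustration (assumed, not taken from the PR): with the generic path, a client can pass `input_type` itself via `extra_body`. The local server URL below is an example, and the model id is one of those listed in the test plan:

```python
from openai import OpenAI

# Assumes a locally running Llama Stack server exposing the OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")

resp = client.embeddings.create(
    model="nvidia/llama-3.2-nv-embedqa-1b-v2",
    input=["What is Llama Stack?"],
    # Asymmetric models use "query" for search queries and "passage" for documents.
    extra_body={"input_type": "query"},
)
print(len(resp.data[0].embedding))
```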
…ck#3803) # What does this PR do?

Enables automatic embedding model detection for vector stores by using a `default_configured` boolean that can be defined in the `run.yaml`.

## Test Plan

- Unit tests
- Integration tests
- Simple example below:

Spin up the stack:

```bash
uv run llama stack build --distro starter --image-type venv --run
```

Then test with OpenAI's client:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")
vs = client.vector_stores.create()
```

Previously you needed:

```python
vs = client.vector_stores.create(
    extra_body={
        "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
        "embedding_dimension": 384,
    }
)
```

The `extra_body` is now unnecessary.

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
…stack#3811) # What does this PR do?

Support reading the embedding model and dimensions from metadata for vector stores.

## Test Plan

Unit tests
…metadata keys (llamastack#3813) # Add support for Google Gemini `gemini-embedding-001` embedding model and correctly register the model type

MR message created with the assistance of Claude-4.5-sonnet

This resolves llamastack#3755

## What does this PR do?

This PR adds support for the `gemini-embedding-001` Google embedding model in the llama-stack Gemini provider. This model provides high-dimensional embeddings (3072 dimensions) compared to the existing `text-embedding-004` model (768 dimensions). Older embedding models (such as text-embedding-004) will be deprecated soon according to Google ([link](https://developers.googleblog.com/en/gemini-embedding-available-gemini-api/)).

## Problem

The Gemini provider only supported the `text-embedding-004` embedding model. The newer `gemini-embedding-001` model, which provides higher-dimensional embeddings for improved semantic representation, was not available through llama-stack.

## Solution

This PR consists of three commits that implement the feature, fix the model registration, and enable embedding generation:

### Commit 1: Initial addition of gemini-embedding-001

Added metadata for `gemini-embedding-001` to the `embedding_model_metadata` dictionary:

```python
embedding_model_metadata: dict[str, dict[str, int]] = {
    "text-embedding-004": {"embedding_dimension": 768, "context_length": 2048},
    "gemini-embedding-001": {"embedding_dimension": 3072, "context_length": 2048},  # NEW
}
```

**Issue discovered:** The model was not being registered correctly because the dictionary keys didn't match the model IDs returned by Gemini's API.

### Commit 2: Fix model ID matching with `models/` prefix

Updated both dictionary keys to include the `models/` prefix to match Gemini's OpenAI-compatible API response format:

```python
embedding_model_metadata: dict[str, dict[str, int]] = {
    "models/text-embedding-004": {"embedding_dimension": 768, "context_length": 2048},  # UPDATED
    "models/gemini-embedding-001": {"embedding_dimension": 3072, "context_length": 2048},  # UPDATED
}
```

**Root cause:** Gemini's OpenAI-compatible API returns model IDs with the `models/` prefix (e.g., `models/text-embedding-004`). The `OpenAIMixin.list_models()` method directly matches these IDs against the `embedding_model_metadata` dictionary keys. Without the prefix, the models were being registered as LLMs instead of embedding models.

### Commit 3: Fix embedding generation for providers without usage stats

Fixed a bug in `OpenAIMixin.openai_embeddings()` that prevented embedding generation for providers (like Gemini) that don't return usage statistics:

```python
# Before (lines 351-354):
usage = OpenAIEmbeddingUsage(
    prompt_tokens=response.usage.prompt_tokens,  # ← crashed with AttributeError
    total_tokens=response.usage.total_tokens,
)

# After (lines 351-362):
if response.usage:
    usage = OpenAIEmbeddingUsage(
        prompt_tokens=response.usage.prompt_tokens,
        total_tokens=response.usage.total_tokens,
    )
else:
    usage = OpenAIEmbeddingUsage(
        prompt_tokens=0,  # default when not provided
        total_tokens=0,   # default when not provided
    )
```

**Impact:** This fix enables embedding generation for **all** Gemini embedding models, not just the newly added one.

## Changes

### Modified Files

**`llama_stack/providers/remote/inference/gemini/gemini.py`**
- Line 17: Updated the `text-embedding-004` key to `models/text-embedding-004`
- Line 18: Added `models/gemini-embedding-001` with correct metadata

**`llama_stack/providers/utils/inference/openai_mixin.py`**
- Lines 351-362: Added a null check for `response.usage` to handle providers without usage statistics

## Key Technical Details

### Model ID Matching Flow

1. `list_provider_model_ids()` calls Gemini's `/v1/models` endpoint
2. The API returns model IDs like `models/text-embedding-004` and `models/gemini-embedding-001`
3. `OpenAIMixin.list_models()` (line 410) checks: `if metadata := self.embedding_model_metadata.get(provider_model_id)`
4. If matched, the model is registered as `model_type: "embedding"` with metadata; otherwise it is registered as `model_type: "llm"`

### Why Both Keys Needed the Prefix

The `text-embedding-004` model was already working because there was likely separate configuration or manual registration handling it. For auto-discovery to work correctly for **both** models, both keys must match the API's model ID format exactly.

## How to test this PR

Verified the changes by:

1. **Model auto-discovery**: Started the llama-stack server and confirmed models are auto-discovered from the Gemini API
2. **Model registration**: Confirmed both embedding models are correctly registered and visible

```bash
curl http://localhost:8325/v1/models | jq '.data[] | select(.provider_id == "gemini" and .model_type == "embedding")'
```

**Results:**
- ✅ `gemini/models/text-embedding-004` - 768 dimensions - `model_type: "embedding"`
- ✅ `gemini/models/gemini-embedding-001` - 3072 dimensions - `model_type: "embedding"`

3. **Before fix (Commit 1)**: Models appeared as `model_type: "llm"` without embedding metadata
4. **After fix (Commit 2)**: Models correctly identified as `model_type: "embedding"` with proper metadata
5. **Generate embeddings**: Verified embedding generation works

```bash
curl -X POST http://localhost:8325/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini/models/gemini-embedding-001", "input": "test"}' | \
  jq '.data[0].embedding | length'
```
…lamastack#3810) This PR updates the Conversation item related types and improves a couple of critical parts of the implementation:

- it creates a streaming output item for the final assistant message output by the model. Until now we only added content parts and included that message in the final response.
- it rewrites the conversation update code completely to account for items other than messages (tool calls, outputs, etc.)

## Test Plan

Used the test script from llamastack/llama-stack-client-python#281 for this:

```
TEST_API_BASE_URL=http://localhost:8321/v1 \
  pytest tests/integration/test_agent_turn_step_events.py::test_client_side_function_tool -xvs
```
…ck#3521) Fixed a KeyError when chunks don't have document_id in metadata or chunk_metadata. Updated logging to safely extract document_id using getattr, and updated RAG memory to handle the different document_id locations. Added a test for missing document_id scenarios.

# What does this PR do?

Fixes a KeyError crash in `/v1/vector-io/insert` when chunks are missing `document_id` fields. The API was failing even though `document_id` is optional according to the schema.

Closes llamastack#3494

## Test Plan

**Before fix:**
- POST to `/v1/vector-io/insert` with chunks → 500 KeyError
- Happened regardless of where `document_id` was placed

**After fix:**
- Same request works fine → 200 OK
- Tested with Postman using the FAISS backend
- Added a unit test covering missing `document_id` scenarios
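A minimal sketch of the safe-extraction pattern the fix describes; the helper name and attribute layout are assumptions, not the PR's exact code:

```python
def extract_document_id(chunk) -> str | None:
    # Prefer chunk_metadata.document_id, then metadata["document_id"];
    # return None instead of raising KeyError when neither is present.
    chunk_meta = getattr(chunk, "chunk_metadata", None)
    doc_id = getattr(chunk_meta, "document_id", None) if chunk_meta else None
    if doc_id is None:
        metadata = getattr(chunk, "metadata", None) or {}
        doc_id = metadata.get("document_id")
    return doc_id
```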
# What does this PR do? ## Test Plan <img width="1506" height="653" alt="image" src="https://github.com/user-attachments/assets/6c28b8e8-effe-41ab-8e31-72482c05662d" />
…ack#3817) We were generating "FunctionToolCall" items even for MCP (and file-search, etc.) server-side calls, which caused ID mismatches and other inconsistencies galore.
…ack#3808) # What does this PR do?

- Removed the sqlite sink from the telemetry config
- Removed related code
- Updated the docs related to telemetry

## Test Plan

CI
…lamastack#3819) Handle a base case when no stored messages exist because no Response call has been made.

## Test Plan

```
./scripts/integration-tests.sh --stack-config server:ci-tests \
   --suite responses --inference-mode record-if-missing --pattern test_conversation_responses
```
# What does this PR do?

Closed the previous PR due to merge conflicts with multiple PRs; this PR addresses all comments from llamastack#3768 (sorry for carrying them over to this one).

## Test Plan

Added unit tests and integration tests
Wanted to re-enable Responses CI, but it seems to hang for some reason due to some interaction with conversations_store or responses_store.

## Test Plan

```
# library client
./scripts/integration-tests.sh --stack-config ci-tests --suite responses

# server
./scripts/integration-tests.sh --stack-config server:ci-tests --suite responses
```
It's unused.

Signed-off-by: Derek Higgins <[email protected]>