
Commit 0899f78

fix: Avoid model_limits KeyError (backport #4060) (#4283)
# What does this PR do?

Avoids a `model_limits` KeyError when trying to list embedding models for Watsonx.

Closes #4059

## Test Plan

Start the server with the watsonx distro:

```bash
llama stack list-deps watsonx | xargs -L1 uv pip install
uv run llama stack run watsonx
```

Then run:

```python
client = LlamaStackClient(base_url=base_url)
client.models.list()
```

Check whether any embedding model is available (currently there is not a single one).

---

This is an automatic backport of pull request #4060 done by [Mergify](https://mergify.com).

Co-authored-by: Wojciech-Rebisz <[email protected]>
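The failure mode and the fix come down to dictionary access. A minimal sketch, assuming a hypothetical Watsonx model spec that omits the `model_limits` key (the spec dict and its values here are illustrative, not the actual API response):

```python
# Hypothetical spec returned without "model_limits"; illustrative only.
model_spec = {"model_id": "example-embedding-model"}

# Before the fix: direct indexing raises an exception when the key is absent.
# embedding_dimension = model_spec["model_limits"]["embedding_dimension"]
# -> KeyError: 'model_limits'

# After the fix: chained .get() calls fall back to an empty dict, then to 0.
embedding_dimension = model_spec.get("model_limits", {}).get("embedding_dimension", 0)
context_length = model_spec.get("model_limits", {}).get("max_sequence_length", 0)

print(embedding_dimension, context_length)  # -> 0 0
```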
1 parent 9b68b38 commit 0899f78

File tree

1 file changed: +2 −6

  • llama_stack/providers/remote/inference/watsonx/watsonx.py


llama_stack/providers/remote/inference/watsonx/watsonx.py

Lines changed: 2 additions & 6 deletions
```diff
@@ -283,8 +283,8 @@ async def list_models(self) -> list[Model] | None:
             # ...
             provider_resource_id = f"{self.__provider_id__}/{model_spec['model_id']}"
             if "embedding" in functions:
-                embedding_dimension = model_spec["model_limits"]["embedding_dimension"]
-                context_length = model_spec["model_limits"]["max_sequence_length"]
+                embedding_dimension = model_spec.get("model_limits", {}).get("embedding_dimension", 0)
+                context_length = model_spec.get("model_limits", {}).get("max_sequence_length", 0)
                 embedding_metadata = {
                     "embedding_dimension": embedding_dimension,
                     "context_length": context_length,
@@ -306,10 +306,6 @@ async def list_models(self) -> list[Model] | None:
                 metadata={},
                 model_type=ModelType.llm,
             )
-            # In theory, I guess it is possible that a model could be both an embedding model and a text chat model.
-            # In that case, the cache will record the generator Model object, and the list which we return will have
-            # both the generator Model object and the text chat Model object. That's fine because the cache is
-            # only used for check_model_availability() anyway.
             self._model_cache[provider_resource_id] = model
             models.append(model)
         return models
```
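One consequence of the new defaults: specs without `model_limits` now report an `embedding_dimension` and `context_length` of 0 in their metadata instead of crashing `list_models`. A rough client-side check, adapted from the test plan above (the `base_url` and the attribute names on the returned models are assumptions based on the llama-stack `Model` schema, so treat this as a sketch):

```python
from llama_stack_client import LlamaStackClient

# Assumes a watsonx distro running locally; adjust base_url as needed.
client = LlamaStackClient(base_url="http://localhost:8321")

# Before the fix, this call could fail server-side with KeyError: 'model_limits'.
for model in client.models.list():
    if model.model_type == "embedding":
        print(model.identifier, model.metadata)
```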
