One of Llama Stack’s core design goals is runtime agnosticism, which is as valuable as it is challenging. Today, LLS runs in at least three distinct environments (please add if I’m missing one):
Developer desktop. Single user; the same person is both admin and consumer on a single node.
SaaS/hosted LLS. Self-managed in the cloud. Often “self-service” (admin=consumer), but can be split. Admin + consumer APIs are usually exposed on the same endpoint.
Kubernetes. Multi-user/tenant orchestration with a clean split between consumer APIs (LLS REST) and admin/config APIs (Kubernetes API with CRDs/ConfigMaps).
The data plane (routing and API behavior) translates well across all three. The real friction starts in the control plane, i.e., configuration and wiring:
SaaS gravitates toward CRUD/imperative admin endpoints living alongside consumer APIs.
Kubernetes embraces declarative intent and reconciliation toward a target state (eventual consistency, async).
Supporting both at the same time is a real challenge, as #3809 shows.
I believe splitting this configuration concern out of the core LLS stack functionality would cleanly separate concerns and let each platform be supported in the way that suits it best.
In fact, the LLS Kubernetes operator, which lives in a separate repository, already moves in that direction: it uses Kubernetes as the configuration store from which the operator generates the one-and-only run.yaml containing the LLS target state.
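For illustration, here is a simplified sketch of the kind of target state a run.yaml might carry (field names are approximate, not the authoritative schema):

```yaml
# Illustrative fragment only; consult the actual Llama Stack
# run.yaml schema for the real field names and structure.
version: 2
apis:
  - inference
providers:
  inference:
    - provider_id: ollama
      provider_type: remote::ollama
      config:
        url: http://localhost:11434
```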
My proposal is to do the same for the SaaS use case: a separate repository that implements the admin REST APIs and ultimately produces the run.yaml file.
The duty of the core LLS instance would then simply be to monitor this file and pick it up whenever it changes (simple, and no distributed database involved).
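To make the "monitor and pick up" step concrete, here is a minimal sketch of a polling file watcher. The names `watch_config` and the `on_change` callback are hypothetical illustrations, not LLS APIs; a real implementation might use inotify or atomic-rename detection instead of mtime polling.

```python
import time
from pathlib import Path

def watch_config(path: str, on_change, interval: float = 2.0, max_polls=None):
    """Poll `path` and invoke `on_change(text)` whenever its mtime advances.

    `max_polls` bounds the loop (handy for tests); None means run forever.
    This is a sketch: production code would also debounce partial writes.
    """
    p = Path(path)
    last_mtime = None
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            mtime = p.stat().st_mtime
        except FileNotFoundError:
            # File not written yet (or mid-replace); try again next tick.
            time.sleep(interval)
            continue
        if last_mtime is None or mtime > last_mtime:
            last_mtime = mtime
            on_change(p.read_text())
        time.sleep(interval)
```

The core instance would call something like `watch_config("run.yaml", reload_stack)`, where `reload_stack` re-applies the target state.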
So I'm opening this discussion: would this approach match the SaaS platform use case?
(edited to be more concise and removing boilerplate)