As of today, the config API defines two main sections: plugins and schedulingProfiles.
In llm-d, we've noticed a use case where more than one plugin uses the same parameter. That parameter currently has to be defined separately in each plugin's configuration, and if the values do not match, the behavior can be unexpected.
An example config file could look like this (the hashBlockSize: 64 parameter is defined twice):
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
plugins:
- type: prefix-cache-scorer
  parameters:
    hashBlockSize: 64
    maxPrefixBlocksToMatch: 256
    lruCapacityPerServer: 31250
- type: max-score-picker
  parameters:
    maxNumOfEndpoints: 1
- type: single-profile-handler
- type: my-plugin
  parameters:
    hashBlockSize: 64
schedulingProfiles:
- name: default
  plugins:
  - pluginRef: prefix-cache-scorer
  - pluginRef: max-score-picker
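The mismatch risk in a config like this can be checked mechanically. As a minimal sketch, assuming the config has already been parsed into plain Python dicts and lists (for example by a YAML loader), the hypothetical helper find_conflicting_parameters below reports parameter names that appear under more than one plugin with different values. It is illustrative only and not part of the config API, and it assumes plugin types are unique within the file.

```python
from collections import defaultdict


def find_conflicting_parameters(config):
    """Return {parameter_name: {plugin_type: value}} for every parameter
    that is set by more than one plugin with values that are not all equal.

    Hypothetical helper for illustration; assumes each plugin entry has a
    unique "type" and an optional "parameters" mapping.
    """
    seen = defaultdict(dict)
    for plugin in config.get("plugins", []):
        for key, value in (plugin.get("parameters") or {}).items():
            seen[key][plugin["type"]] = value

    conflicts = {}
    for key, by_plugin in seen.items():
        values = list(by_plugin.values())
        # A conflict needs at least two plugins and at least one differing value.
        if len(values) > 1 and any(v != values[0] for v in values[1:]):
            conflicts[key] = dict(by_plugin)
    return conflicts


# Example: the same parameter drifts apart between two plugins.
example = {
    "plugins": [
        {"type": "prefix-cache-scorer", "parameters": {"hashBlockSize": 64}},
        {"type": "single-profile-handler"},
        {"type": "my-plugin", "parameters": {"hashBlockSize": 32}},
    ]
}
print(find_conflicting_parameters(example))
```

A check like this only detects drift after the fact; it does not remove the duplication itself.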
As an enhancement, I would like to propose adding a top-level parameters section together with a parameterRef field, which could work the same way as pluginRef. The intention is to allow only simple key-value pairs, not complex structs (similar to environment variables).
With that, the config above could be defined as follows:
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
parameters:
- name: hashBlockSize
  value: 64
plugins:
- type: prefix-cache-scorer
  parameters:
    parameterRef: hashBlockSize
    maxPrefixBlocksToMatch: 256
    lruCapacityPerServer: 31250
- type: max-score-picker
  parameters:
    maxNumOfEndpoints: 1
- type: single-profile-handler
- type: my-plugin
  parameters:
    parameterRef: hashBlockSize
schedulingProfiles:
- name: default
  plugins:
  - pluginRef: prefix-cache-scorer
  - pluginRef: max-score-picker
This gives the look and feel of environment variables that can be read by different plugins.
It reduces the risk of configuration inconsistencies when parameters need to be shared across plugins.
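To make the proposed semantics concrete, here is a minimal sketch of how a loader might expand parameterRef entries before plugins are instantiated. It is an illustration under stated assumptions, not the actual implementation: the function name resolve_parameter_refs is hypothetical, the config is assumed to be already parsed into Python dicts and lists, each plugin is assumed to carry at most one parameterRef (a string), and a reference is assumed to expand to a key named after the shared parameter.

```python
import copy


def resolve_parameter_refs(config):
    """Return a copy of config in which each plugin's parameterRef is
    replaced by the referenced shared parameter (name -> value).

    Hypothetical sketch: assumes a top-level "parameters" list of
    {"name": ..., "value": ...} entries and one string-valued
    parameterRef per plugin.
    """
    shared = {p["name"]: p["value"] for p in config.get("parameters", [])}
    resolved = copy.deepcopy(config)  # leave the caller's config untouched

    for plugin in resolved.get("plugins", []):
        params = plugin.get("parameters")
        if not params or "parameterRef" not in params:
            continue
        ref = params.pop("parameterRef")
        if ref not in shared:
            raise KeyError(
                f"plugin {plugin['type']!r} references unknown parameter {ref!r}"
            )
        if ref in params:
            raise ValueError(
                f"plugin {plugin['type']!r} sets {ref!r} both directly and by reference"
            )
        params[ref] = shared[ref]

    # The shared section has been fully expanded, so drop it from the result.
    resolved.pop("parameters", None)
    return resolved


example = {
    "parameters": [{"name": "hashBlockSize", "value": 64}],
    "plugins": [
        {
            "type": "prefix-cache-scorer",
            "parameters": {"parameterRef": "hashBlockSize", "maxPrefixBlocksToMatch": 256},
        },
        {"type": "my-plugin", "parameters": {"parameterRef": "hashBlockSize"}},
    ],
}
print(resolve_parameter_refs(example)["plugins"])
```

Failing fast on an unknown reference, or on a parameter that is set both directly and by reference, keeps the single-source-of-truth property the proposal is after. A plugin that needs several shared parameters would require a different shape, such as a list of references, which this sketch does not cover.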