-
Notifications
You must be signed in to change notification settings - Fork 118
Models extractor #553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Models extractor #553
Conversation
Signed-off-by: irar2 <[email protected]>
| // ModelInfo defines model's data returned from /v1/models API | ||
| type ModelInfo struct { | ||
| ID string `json:"id"` | ||
| Parent string `json:"parent,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parent field is not part of OpenAI standardization.
it's specific to vllm and might not work with other model servers.
I also don't think it's used (or should be used) anywhere.
I recommend removing this field.
OpenAI standard here:
https://platform.openai.com/docs/api-reference/models/list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments
- If not present, the
omitemptykicks in so I don't see the downside of having it. - For use cases that need the parent information for Base/LoRA relations, if it is not provided by model extraction then one must assume the base model name is provided elsewhere. There is currently no other source of truth...
I think it is fine to rely on vLLM specific for that.
- It can be treated as part of the "contract" (same as the case when other model servers are expected to provide the MSP metrics even if by a different name).
- configuration of data sources is per EPP so you can always not enable this for other model servers . This is valid usage as long as we use homogeneous model server in a pool (other code breaks as well when this is not the case...)
Signed-off-by: irar2 <[email protected]>
|
/hold |
Signed-off-by: Ira Rosen <[email protected]>
Signed-off-by: Ira Rosen <[email protected]>
Signed-off-by: Ira Rosen <[email protected]>
Signed-off-by: irar2 <[email protected]>
| } | ||
|
|
||
| // NewModelExtractor returns a new model extractor. | ||
| func NewModelExtractor() (*ModelExtractor, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: at least in theory, the plugin could have a name...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ModelExtractor is a plugin. A plugin has a type and an optional name.
The code does not support setting a plugin name and it should.
| } | ||
| } | ||
|
|
||
| ds := http.NewHTTPDataSource(cfg.Scheme, cfg.Path, cfg.InsecureSkipVerify, ModelsDataSourceType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q; does NewHTTPDataSource validate the scheme?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is only a check if it's https
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we use the scheme passed in by the user it should at least sanitize it to ensure it's one one of a known set of acceptable values (e.g., "http" and "https").
Can be in this PR or separate adding scheme validation to the HTTPDataSource
|
/lgtm overall looks good. minor comments left so placing a hold. Leaving to your discretion if you want to amend or cancel the hold to allow merging as-is |
Signed-off-by: irar2 <[email protected]>
This PR adds an ability to collect information from /v1/models and store it in endpoint's attributes.
Closes #466