Introduce action as indicator to indicate what one wants to do with the model #300
Other issues and pull requests have proposed that the model class should be a simple struct that only holds scalar values and has no effect on business logic through inheritance. So far, inheritance has had that effect for embeddings, completions, and other API calls such as Whisper audio and DALL-E images. Right now this makes it difficult to share instances of the model class across providers, even though many providers offer the same models and therefore most likely the same capabilities. Turning the model into a plain struct will eventually reduce the amount of code needed to add new providers that replicate existing APIs with a well-known subset of models, and it makes it easier to support custom models such as OpenAI fine-tuned models or Ollama's self-built models.
I see this action parameter / enum as an addition to the model. Action and model relate to each other like method and path in HTTP: the same resource can be used in several different ways. Once the model no longer affects business logic through inheritance, the model alone cannot tell you what you want to do with it. Some models differentiate between chat and chat completion, and right now that differentiation is made through model class inheritance. Other tasks, such as embeddings for document indexing, have been an implicit expectation placed on the model that is passed into the indexer. The indexer could not explicitly request an embedding from the model. It had to trust that the platform and model would answer with an embedding result. Code this unpredictable is difficult to maintain and to work with. The same applies to the usage of Whisper and DALL-E, which may need specific actions later as well.
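To make the method/path analogy concrete, here is a minimal sketch in Python rather than in the project's own language. Everything in it is an assumption made for illustration: the `Action` members, the `Model` and `Platform` classes, the `request` signature, and the model name are hypothetical and do not reflect the code in this pull request.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical set of actions; the real enum may differ."""
    CHAT = "chat"
    COMPLETE_CHAT = "complete_chat"
    CALCULATE_EMBEDDINGS = "calculate_embeddings"


@dataclass(frozen=True)
class Model:
    """Plain value object: scalar data only, no behaviour via inheritance."""
    name: str


class Platform:
    """Hypothetical provider that declares which actions each model supports."""

    def __init__(self, supported):
        # Maps a model name to the set of actions this provider accepts for it.
        self._supported = supported

    def request(self, model, action, payload):
        # Like an HTTP router: the (action, model) pair decides what happens,
        # similar to how (method, path) does.
        allowed = self._supported.get(model.name, frozenset())
        if action not in allowed:
            raise ValueError(
                f"{model.name!r} does not support {action.value!r} on this platform"
            )
        # A real platform would call the provider API here; this only echoes.
        return {"model": model.name, "action": action.value, "input": payload}


if __name__ == "__main__":
    embedder = Model("example-embedding-model")  # made-up model name
    capabilities = {embedder.name: frozenset({Action.CALCULATE_EMBEDDINGS})}

    # The same Model instance is shared by two independent providers.
    provider_a = Platform(capabilities)
    provider_b = Platform(capabilities)

    print(provider_a.request(embedder, Action.CALCULATE_EMBEDDINGS, "some text"))
    print(provider_b.request(embedder, Action.CALCULATE_EMBEDDINGS, "some text"))
```

In this sketch an indexer would pass `Action.CALCULATE_EMBEDDINGS` explicitly instead of trusting that the model it received happens to return embeddings, and an unsupported combination fails early with a clear error.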