|
1 | 1 | # Model |
2 | 2 |
|
| 3 | +In the ML cube Platform, a Model is a representation of the actual artificial intelligence model used for making predictions. The data used |
| 4 | +for its training usually represent the reference data distribution, while production data comprises the data on which the model |
| 5 | +performs inference. |
| 6 | +For more information about reference and production data see the [Data] page. |
3 | 7 |
|
| 8 | +A Model is uniquely associated with a [Task] and can be created through either the WebApp or the Python SDK. |
| 9 | +Currently, we support only one model per Task. |
4 | 10 |
|
| 11 | +A Model is defined by a name and a version. The version is updated whenever the model is retrained, which makes it possible to |
| 12 | +track the latest version of the model and the data used for its training. When predictions are uploaded to the platform, |
| 13 | +the model version must be specified, following the guidelines in the [Data Schema] page, to ensure that the |
| 14 | +predictions are associated with the correct model version. |
5 | 15 |
|
6 | | -[//]: # () |
7 | | -[//]: # () |
8 | | -[//]: # (What is additional probabilistic output?) |
| 16 | +!!! note |
| 17 | + You don't need to upload the **real** model to the Platform. We only require its training data and predictions. |
| 18 | + The entity you create on the Platform serves more as a placeholder for the actual model. For this reason, |
| 19 | + the ML cube Platform is considered *model agnostic*. |
9 | 20 |
|
10 | | -[//]: # () |
11 | | -[//]: # (What is metric?) |
12 | 21 |
|
13 | | -[//]: # () |
14 | | -[//]: # (What is suggestion type?) |
| 22 | +### RAG Model |
15 | 23 |
|
16 | | -[//]: # () |
17 | | -[//]: # (What is retraining cost?) |
| 24 | +RAG Tasks are an exception to the model framework presented above. In this type of Task, the model |
| 25 | +is a Large Language Model (LLM) that is used to generate responses to user queries. The model is not trained on a specific dataset |
| 26 | +but is rather a pre-trained model, sometimes fine-tuned on custom domain data, which means that the classic process of training and |
| 27 | +retraining does not apply. |
18 | 28 |
|
19 | | -[//]: # () |
20 | | -[//]: # (What is retraining trigger?) |
| 29 | +To maintain a coherent Model definition across task types, the RAG model is also represented as a Model, |
| 30 | +but an update of its version represents an update of the reference data distribution and not necessarily |
| 31 | +a retraining of the model itself. Moreover, most of the attributes described in the following sections |
| 32 | +are not applicable, as they relate to the Retraining Module, which is not available for RAG tasks. |
| 33 | + |
| 34 | +### Probabilistic output |
| 35 | + |
| 36 | +When creating a model, you can specify whether you also want to provide the probabilistic output of the model along with the predictions. |
| 37 | +The probabilistic output represents the probability or confidence score associated with the model's predictions. If provided, |
| 38 | +the ML cube Platform will use this information to compute additional metrics and insights. |
| 39 | + |
| 40 | +It is optional and currently supported only for Classification and RAG tasks. If specified, the probabilistic output must be provided |
| 41 | +as a new column in the predictions file, following the guidelines in the [Data Schema] page. |
| 42 | + |
| 43 | +!!! example |
| 44 | + For example, a Logistic Regression classification model provides both the probability of belonging to the positive class and the predicted class, obtained by applying a threshold to that probability. |
| 45 | + In this case, you can upload to the ML cube Platform the predicted class as the principal prediction and the probability as the probabilistic output. |
| 46 | + |
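The Logistic Regression example above can be sketched in plain Python. This is only an illustration: the column names `sample_id`, `prediction`, and `probability`, the 0.5 threshold, and the helper functions are assumptions made for the sketch, not names defined by the ML cube Platform. The actual column roles are described in the [Data Schema] page.

```python
import csv
import io
import math

THRESHOLD = 0.5  # assumed decision threshold, not a Platform default


def sigmoid(z):
    """Map a linear score to the probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-z))


def build_prediction_rows(linear_scores, threshold=THRESHOLD):
    """Return one row per sample with the predicted class and its probability."""
    rows = []
    for sample_id, z in linear_scores:
        probability = sigmoid(z)
        rows.append({
            "sample_id": sample_id,                        # hypothetical identifier column
            "prediction": int(probability >= threshold),   # principal prediction
            "probability": round(probability, 4),          # probabilistic output
        })
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, with the probability as an extra column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["sample_id", "prediction", "probability"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_prediction_rows([("s1", 0.0), ("s2", -2.0), ("s3", 3.0)])
csv_text = rows_to_csv(rows)
```

Here the predicted class and the probability come from the same score, so the probabilistic output is just one more column next to the principal prediction.
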
| 47 | +### Model Metric |
| 48 | + |
| 49 | +A Model Metric represents the evaluation metric used to assess the performance of the model. |
| 50 | +It can represent either a performance score or an error. The chosen metric will be used in the various views of the WebApp to |
| 51 | +provide insights on the model's performance and in the [Performance View](modules/retraining.md#performance-view) section |
| 52 | +of the Retraining Module. |
| 53 | + |
| 54 | +!!! note |
| 55 | + Model metrics can only be computed when target data are available. |
| 56 | + |
| 57 | +The available options are: |
| 58 | + |
| 59 | +| Metric | Task Type | |
| 60 | +|-------------------|----------------------------| |
| 61 | +| Accuracy | Classification tasks | |
| 62 | +| RMSE | Regression tasks | |
| 63 | +| R2 | Regression tasks | |
| 64 | +| Average Precision | Object Detection tasks |
| 65 | + |
| 66 | +RAG tasks have no metric, as in that case the model is an LLM for which classic definitions of metrics are not applicable. |
| 67 | + |
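As a reminder of what the first three metrics in the table measure, here is a minimal sketch using their textbook definitions. It assumes the targets and predictions are aligned lists of equal length. It is not the Platform's own implementation, which this page does not describe, and Average Precision is left out because it depends on detection-matching details.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the target (a performance score)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error (an error: lower is better)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

All three functions need the targets `y_true`, which is why no Model Metric can be computed before target data are available.
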
| 68 | +!!! warning |
| 69 | + Model Metrics should not be confused with [Monitoring Metrics](monitoring/index.md#monitoring-metrics), which are |
| 70 | + entities monitored by the ML cube Platform and not necessarily related to a Model. |
| 71 | + |
| 72 | +### Suggestion Type |
| 73 | + |
| 74 | +The Suggestion Type represents the type of suggestion that the ML cube Platform should provide when computing the |
| 75 | +[Retraining Dataset](modules/retraining.md#retraining-dataset). The available options are provided in the related section. |
| 76 | + |
| 77 | + |
| 78 | +### Retraining Cost |
| 79 | + |
| 80 | +The Retraining Cost represents the cost associated with retraining the model. This information is used by the Retraining Module |
| 81 | +to provide gain-cost analysis and insights on the retraining process. The cost is expressed in the same currency as the one used |
| 82 | +in the Task cost information. The default value is 0.0, which means that the cost is negligible. |
| 83 | + |
| 84 | +### Retrain Trigger |
| 85 | + |
| 86 | +You can associate a [Retrain Trigger] with your Model to enable the automatic initiation of your retraining pipeline |
| 87 | +from the ML cube Platform. More information on how to set up a retrain trigger can be found in the related section. |
| 88 | + |
| 89 | + |
| 90 | +[Task]: task.md |
| 91 | +[Data Schema]: data_schema.md#subrole |
| 92 | +[Retrain Trigger]: integrations/retrain_trigger.md |
| 93 | +[Data]: data.md |