# FAQ

Here we try to answer some of the most common questions about drift detection and the Frouros library.

## What is the difference between *concept drift* and *data drift*?

Concept drift refers to changes in the underlying concept being modeled, such as changes in the relationship between
the input features and the target variable. It can be caused by changes in the conditional probability $P(y|X)$ with or
without a change in $P(X)$. Data drift, on the other hand, refers to changes in the distribution of the input features
$P(X)$, such as changes in the feature distributions over time. It focuses on detecting when the incoming data no longer
resembles the data the model was trained on, potentially leading to decreased performance or reliability.
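The distinction can be made concrete with a small synthetic sketch (plain Python, not Frouros; the distributions and decision rules below are invented for illustration). Pure data drift shifts $P(X)$ while the labeling rule stays the same; concept drift keeps $P(X)$ fixed but changes $P(y|X)$:

```python
import random

random.seed(0)

def mean(values):
    return sum(values) / len(values)

def accuracy(xs, ys):
    # The "model" trained on the reference concept: predict 1 iff x > 0.
    return mean([1 if (x > 0) == (y == 1) else 0 for x, y in zip(xs, ys)])

# Reference period: X ~ N(0, 1), concept y = 1 iff x > 0.
x_ref = [random.gauss(0.0, 1.0) for _ in range(5000)]

# Data drift: P(X) shifts to N(1.5, 1); the concept y = 1 iff x > 0 is unchanged.
x_data = [random.gauss(1.5, 1.0) for _ in range(5000)]
y_data = [1 if x > 0 else 0 for x in x_data]

# Concept drift: P(X) is unchanged, but P(y|X) flips to y = 1 iff x < 0.
x_concept = [random.gauss(0.0, 1.0) for _ in range(5000)]
y_concept = [1 if x < 0 else 0 for x in x_concept]

print(f"feature mean, reference:     {mean(x_ref):+.2f}")      # ~ 0.0
print(f"feature mean, data drift:    {mean(x_data):+.2f}")     # ~ +1.5 -> P(X) changed
print(f"feature mean, concept drift: {mean(x_concept):+.2f}")  # ~ 0.0 -> P(X) unchanged
print(f"accuracy under data drift:    {accuracy(x_data, y_data):.2f}")        # 1.00
print(f"accuracy under concept drift: {accuracy(x_concept, y_concept):.2f}")  # 0.00
```

In this toy setup the model happens to stay accurate under pure data drift because the labeling rule is unchanged; in practice data drift can still degrade performance, e.g. when the model extrapolates poorly into the newly populated regions of the feature space.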
| 12 | + |
| 13 | +## What is the difference between *out-of-distribution* detection and *data drift* detection? |
| 14 | + |
| 15 | +Out-of-distribution detection focuses on identifying samples that fall outside the training distribution, often used |
| 16 | +to detect anomalies or novel data. It aims to detect instances that differ significantly from the data the model was |
| 17 | +trained on. Data drift detection, on the other hand, is concerned with identifying shifts or changes in the |
| 18 | +distribution of the data over time. |
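A rough way to see the difference: out-of-distribution detection makes a per-sample decision, while drift detection makes a population-level decision over a window of samples. A minimal sketch in plain Python (the z-score and mean-shift thresholds are arbitrary choices for illustration, not part of any library):

```python
import random
import statistics

random.seed(42)

# Training-data summary for a single feature.
train = [random.gauss(0.0, 1.0) for _ in range(2000)]
mu = statistics.mean(train)
sigma = statistics.stdev(train)

def is_out_of_distribution(x, z_threshold=5.0):
    """Per-sample decision: is this single instance far from the training data?"""
    return abs(x - mu) / sigma > z_threshold

def window_has_drifted(window, shift_threshold=0.5):
    """Population-level decision: has the incoming distribution shifted?"""
    return abs(statistics.mean(window) - mu) > shift_threshold

# A single extreme value is out-of-distribution, but one sample says nothing
# about drift.
print(is_out_of_distribution(8.0))  # True

# A window of unremarkable samples whose mean has shifted: typically no single
# sample is an outlier, yet the distribution as a whole has drifted.
window = [random.gauss(1.0, 1.0) for _ in range(500)]
print(window_has_drifted(window))   # True
```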
| 19 | + |
| 20 | +## How can I detect *concept drift* without having access to the ground truth labels at inference time? |
| 21 | + |
| 22 | +In cases where ground truth labels are not available at inference time or the verification latency is high, it may not |
| 23 | +be possible to directly detect concept drift using traditional methods. In such cases, it may be necessary to use |
| 24 | +alternative techniques, such as data drift detection, to monitor changes in the feature distributions and identify |
| 25 | +potential drift. By monitoring the feature distributions, it may be possible to detect when the incoming data no |
| 26 | +longer resembles the data the model was trained on, even in the absence of ground truth labels. |
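For example, a reference window of training-time features can be compared against a recent window using a two-sample statistic such as Kolmogorov-Smirnov, with no labels required. A self-contained sketch in plain Python (Frouros ships ready-made data drift detectors based on such statistical tests, so this is only to show the idea):

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical distance
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past ties so the CDFs are compared fairly.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

random.seed(7)
reference = [random.gauss(0.0, 1.0) for _ in range(1000)]  # training-time features
no_drift = [random.gauss(0.0, 1.0) for _ in range(1000)]   # same distribution
drifted = [random.gauss(2.0, 1.0) for _ in range(1000)]    # mean has shifted

print(f"KS vs. no-drift window: {ks_statistic(reference, no_drift):.3f}")  # small
print(f"KS vs. drifted window:  {ks_statistic(reference, drifted):.3f}")   # large
```

A large statistic against the recent window is an early warning that predictions may no longer be trustworthy, even though no labels were used.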
| 27 | + |
| 28 | +## Why do I need to use a *drift* detector? |
| 29 | + |
| 30 | +One of the main mistakes when deploying a machine learning model for consumption is to assume that the data used for |
| 31 | +inference will come from the same distribution as the data on which the model was trained, i.e., that the data will be |
| 32 | +stationary. It may also be the case that the data used at inference time is still similar to those used for training, |
| 33 | +but the concept of what was learned in the first instance has changed over time, making the model obsolete in terms of |
| 34 | +performance. |
| 35 | + |
| 36 | +Drift detectors make it possible to monitor model performance or feature distributions to detect significant deviations |
| 37 | +that can cause model performance decay. By using them, it is possible to know when it is necessary to replace the |
| 38 | +current model with a new one trained on more recent data. |
| 39 | + |
| 40 | +## Is *model drift* the same as *concept drift*? |
| 41 | + |
| 42 | +Model drift is a term used to describe the degradation of a model's performance over time. This can be caused by a |
| 43 | +variety of factors, including concept drift, data drift, or other issues such as model aging. Concept drift, on the |
| 44 | +other hand, refers specifically to changes in the underlying concept being modeled, such as changes in the relationship |
| 45 | +between the input features and the target variable. While concept drift can lead to model drift, model drift can also be |
| 46 | +caused by other factors and may not always be directly related to changes in the underlying concept. |
| 47 | + |
| 48 | +## What actions should I take if *drift* is detected in my model? |
| 49 | + |
| 50 | +If drift is detected in your model, it is important to take action to address the underlying cause of the drift. |
| 51 | +This may involve retraining the model on more recent data, updating the model's features or architecture, or taking |
| 52 | +other steps to ensure that the model remains accurate and reliable. In some cases, it may also be necessary to |
| 53 | +re-evaluate the model's performance and consider whether it is still suitable for its intended use case. |
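As a sketch of the retraining path, the following toy loop keeps a window of recent labeled data and retrains once the error rate rises. The one-feature threshold classifier and the rolling error-rate trigger are invented stand-ins for a real model and a real drift detector:

```python
import random
from collections import deque

random.seed(1)

def train(data):
    """Hypothetical training step: fit a one-feature threshold classifier
    whose cut-off is the midpoint between the class means."""
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0

def make_stream(true_cut, n):
    """Labeled samples whose concept is: y = 1 iff x > true_cut."""
    return [(x, 1 if x > true_cut else 0)
            for x in (random.uniform(-3.0, 3.0) for _ in range(n))]

# Concept drift: the true decision boundary moves from 0.0 to 1.5.
stream = make_stream(0.0, 1000) + make_stream(1.5, 1000)

window = deque(maxlen=300)        # recent labeled data, kept for retraining
recent_errors = deque(maxlen=50)  # rolling error-rate "detector"
model = train(stream[:300])
retrained_at = None

for i, (x, y) in enumerate(stream[300:], start=300):
    recent_errors.append(int(model(x) != y))
    window.append((x, y))
    drift = len(recent_errors) == 50 and sum(recent_errors) / 50 > 0.2
    if retrained_at is None and drift:
        model = train(list(window))  # respond to drift: retrain on recent data
        retrained_at = i

print(f"retrained at sample {retrained_at}")  # shortly after the change at 1000
```

Retraining on a sliding window is only one possible response; depending on the use case it may be preferable to retrain from scratch on freshly labeled data or to revisit the feature set entirely.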
| 54 | + |
| 55 | +## Can Frouros be integrated with popular machine learning frameworks such as TensorFlow or PyTorch? |
| 56 | + |
| 57 | +Yes, Frouros is designed to be compatible with any machine learning frameworks such as TensorFlow or PyTorch. It is |
| 58 | +framework-agnostic and can be used with any machine learning model or pipeline. |
| 59 | + |
| 60 | +For instance, we provide an [example](./examples/data_drift/MMD_advance.html) that shows how to integrate Frouros with a PyTorch model to detect data |
| 61 | +drift for a computer vision use case. In addition, there is an [example](./examples/concept_drift/DDM_advance.html) that shows how to integrate Frouros with |
| 62 | +scikit-learn to detect concept drift in a streaming manner. |
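The integration pattern is the same regardless of framework, because a streaming detector only consumes plain numbers (here, a 0/1 error per sample). Below is a sketch with a toy rolling-error-rate detector standing in for a real one such as DDM; `predict` is a placeholder that could wrap a PyTorch, scikit-learn, or TensorFlow model equally well:

```python
from collections import deque

class RollingErrorRate:
    """Toy streaming detector, standing in for a real one such as DDM:
    flags drift once the rolling error rate exceeds a threshold."""

    def __init__(self, window=100, threshold=0.3):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def update(self, error):
        self.errors.append(error)
        full = len(self.errors) == self.errors.maxlen
        return full and sum(self.errors) / len(self.errors) > self.threshold

def monitor(stream, predict, detector):
    """Framework-agnostic glue: the detector only ever sees 0/1 errors,
    so `predict` can be any callable from any framework."""
    for i, (x, y) in enumerate(stream):
        if detector.update(int(predict(x) != y)):
            return i  # index at which drift was flagged
    return None

# Toy usage: a "model" that predicts the parity of x, on a stream whose
# labels stop following parity halfway through.
stream = [(x, x % 2) for x in range(500)] + [(x, 1 - x % 2) for x in range(500)]
drift_index = monitor(stream, lambda x: x % 2, RollingErrorRate())
print(drift_index)  # -> 530: 31 errors in the 100-sample window pushes the rate past 0.3
```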
| 63 | + |
| 64 | +## How frequently should I run *drift* detection checks in my machine learning pipeline? |
| 65 | + |
| 66 | +The frequency of drift detection checks will depend on the specific use case and the nature of the data being |
| 67 | +processed. In general, it is a good practice to run drift detection checks regularly, such as after each batch of |
| 68 | +data or at regular intervals, to ensure that any drift is detected and addressed in a timely manner. |
| 69 | + |
| 70 | +## What are some common causes of *drift* in machine learning models? |
| 71 | + |
| 72 | +Drift in machine learning models can be caused by a variety of factors, including changes in the underlying concept |
| 73 | +being modeled, changes in the distribution of the input features, changes in the relationship between the input |
| 74 | +features and the target variable, and other issues such as model aging or degradation. It is important to monitor |
| 75 | +models for drift and take action to address any detected drift to maintain model accuracy and reliability. |
| 76 | + |
| 77 | +## How can I contribute to the development of Frouros or report issues? |
| 78 | + |
| 79 | +The [contribute section](./contribute.html#how-to-contribute) provides information on how to contribute to the development of Frouros, |
| 80 | +including guidelines for reporting issues, submitting feature requests, and contributing code or documentation. |
| 81 | + |
| 82 | +## Does Frouros provide visualization tools for *drift* detection results? |
| 83 | + |
| 84 | +Frouros does not currently provide built-in visualization tools for drift detection results, but it is planned to |
| 85 | +include them in future releases. |