Our second report is released! Now, you can use Evidently to explore the changes in your target function and model predictions.
The Data Drift report preset we released earlier helps you detect the change in model features. Similarly, you can look at how your model predictions and target evolve.
When you know the ground truth (i.e. actual labels or values), the Target Drift report helps you explore the changes in the target function and understand how to adapt. If the ground truth is not available, you can use this new report to detect model decay in advance by evaluating prediction drift.
Basically, it is a way to answer the questions quickly:
If anything changes, you can further explore:
You are reading a blog about an early Evidently release. This functionality has since been improved, and additional Report types are available. You can browse available reports and check out the current documentation for details.
Once again, you need to prepare two Pandas DataFrames. The first one is your Reference dataset. Usually, this is the data you use to train your model. The second is the most recent Current data.
These data frames can include only the predicted values, only actual values, or both.
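As a minimal sketch of this preparation step, here is one way to split a single scoring log into the two frames with pandas. The column names `target` and `prediction`, the toy values, and the cut-off date are illustrative assumptions, not names the tool requires.

```python
import pandas as pd

# Toy scoring log; column names and values are illustrative assumptions.
log = pd.DataFrame(
    {
        "date": pd.date_range("2021-01-01", periods=8, freq="D"),
        "feature_a": [1.0, 1.2, 0.9, 1.1, 2.0, 2.2, 1.9, 2.1],
        "target": [10, 12, 9, 11, 20, 22, 19, 21],
        "prediction": [11, 12, 10, 11, 15, 16, 15, 16],
    }
).set_index("date")

# Reference: the period the model was trained on.
# Current: the most recent data.
cutoff = pd.Timestamp("2021-01-05")
reference = log[log.index < cutoff]
current = log[log.index >= cutoff]

# If the ground truth has not arrived yet, keep only the predictions.
current_without_labels = current.drop(columns=["target"])
```

Keeping a DateTime index is optional, but it makes time-based plots easier to read.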
Based on what you predict, you can apply the report to:
Evidently will automatically parse the target type, and apply the appropriate statistical tests.
Simply call the Target Drift preset for your data, and all plots and insights will be served!
For Numerical Target Drift, your report will first show the comparison of target distributions:
If no drift is detected, most likely you are good to go. Nothing major is going on.
In this example, the tool has detected distribution drift (based on the two-sample Kolmogorov-Smirnov test at a 0.95 confidence level).
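To make the test less of a black box, here is a rough, standard-library-only sketch of a two-sample Kolmogorov-Smirnov check. It is not the tool's own implementation: the function name `ks_two_sample` and the toy values are assumptions, and the p-value uses the large-sample approximation. In practice, a library routine such as `scipy.stats.ks_2samp` is the more robust choice.

```python
import math
from bisect import bisect_right


def ks_two_sample(reference, current):
    """Return the KS statistic and an approximate two-sided p-value."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    # The statistic is the largest vertical gap between the two empirical CDFs.
    statistic = max(
        abs(bisect_right(ref, x) / n - bisect_right(cur, x) / m)
        for x in ref + cur
    )
    # Asymptotic Kolmogorov distribution: a large-sample approximation.
    lam = statistic * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return statistic, 1.0  # the series below is unstable here; p is ~1
    series = sum(
        (-1) ** (k - 1) * math.exp(-2 * (k * lam) ** 2) for k in range(1, 101)
    )
    return statistic, min(1.0, max(0.0, 2 * series))


# Toy target values (illustrative): same shape, shifted by 50.
reference_target = list(range(100))
current_target = [value + 50 for value in reference_target]

statistic, p_value = ks_two_sample(reference_target, current_target)
drift_detected = p_value < 0.05  # corresponds to the 0.95 confidence level
```

With these toy values, half of the reference observations lie below every current observation, so the test flags drift.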
When this happens, you'd want to see the main changes quickly. The tool does just that. It plots the target values from the Reference and Current datasets for visual comparison, against the index or DateTime if available.
Next, the report shows the change in correlations between individual features and the target. By default, it will use the Pearson correlation coefficient. Even though not all relationships are linear, it can be helpful. A significant shift in a correlation coefficient can point to the source of change.
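The idea behind the correlation comparison can be sketched in a few lines of plain Python. This is an illustration only, not the report's code: the helper names `pearson` and `correlation_shift`, the feature names, and the toy values are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_shift(reference, current, target="target"):
    """Change in each feature's correlation with the target, largest first."""
    shifts = {}
    for name in reference:
        if name == target:
            continue
        ref_corr = pearson(reference[name], reference[target])
        cur_corr = pearson(current[name], current[target])
        shifts[name] = cur_corr - ref_corr
    return dict(sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True))


# Toy columns (illustrative): "tax" flips its relationship with the target.
reference = {"rooms": [1, 2, 3, 4], "tax": [1, 2, 3, 4], "target": [10, 20, 30, 40]}
current = {"rooms": [1, 2, 3, 4], "tax": [4, 3, 2, 1], "target": [10, 20, 30, 40]}

shifts = correlation_shift(reference, current)
```

Sorting by the absolute shift puts the features most likely to explain the change at the top.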
To dig deeper, it gives an overview of the behavior of each feature. This helps understand if they can explain the target shift.
If you have a reasonable number of features, you might want to go through them one by one.
In other cases, you might look:
This way, you can visually discover new data segments in specific features, and see how they associate with the target values.
For example, in the Boston house pricing dataset, you can see a new segment with TAX values above 600 but a low value of the target (house price).
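If you want to check such a segment numerically rather than by eye, a small sketch like the one below can help. It flags Current rows whose feature value exceeds anything seen in Reference and summarizes their target. The helper name `new_segment` and the numbers are made up for illustration. They are not actual rows from the Boston dataset.

```python
from statistics import mean


def new_segment(reference, current, feature, target="target"):
    """Summarize Current rows whose feature value exceeds the Reference maximum."""
    ref_max = max(row[feature] for row in reference)
    segment = [row for row in current if row[feature] > ref_max]
    return {
        "reference_max": ref_max,
        "rows": len(segment),
        "mean_target": mean(row[target] for row in segment) if segment else None,
    }


# Made-up rows for illustration only.
reference = [
    {"TAX": 300, "target": 24.0},
    {"TAX": 400, "target": 21.0},
    {"TAX": 450, "target": 19.0},
]
current = [
    {"TAX": 350, "target": 23.0},
    {"TAX": 666, "target": 10.0},
    {"TAX": 700, "target": 12.0},
]

summary = new_segment(reference, current, feature="TAX")
```

A check like this only catches values above the Reference range. Segments that appear inside the known range still need the visual inspection described above.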
When you notice the change, you can interpret why it is happening, using your domain knowledge. Sometimes, there is an actual real-world change due to data and concept drift.
In other cases, what looks like a "new segment" can be a result of data quality issues, like measurement errors.
For Categorical Target Drift, the report looks slightly different.
First, it visually compares the target distributions and performs the test to detect drift. Since the target is categorical, for smaller samples it will use the chi-squared test.
For a classification problem with three classes, it can look like this.
This is an extreme case of the target shift. In the Reference dataset, we most frequently observe Target class = 0. In the Current dataset, the most popular Target class = 2.
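For intuition, here is a standard-library sketch of a chi-squared test on class counts that mirrors the shift described above. It treats Reference and Current as two rows of a contingency table, which is an assumption about the formulation. The report may compute it differently. The function names and the toy labels are illustrative, and `scipy.stats.chi2_contingency` is a sturdier option in practice.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution for integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term = total = math.exp(-x / 2)
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return total
    total = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def chi2_target_drift(reference, current):
    """Chi-squared test of homogeneity on the class counts of two samples."""
    ref_counts, cur_counts = Counter(reference), Counter(current)
    classes = set(ref_counts) | set(cur_counts)
    n_ref, n_cur = len(reference), len(current)
    statistic = 0.0
    for label in classes:
        pooled = ref_counts[label] + cur_counts[label]
        for observed, size in ((ref_counts[label], n_ref), (cur_counts[label], n_cur)):
            expected = pooled * size / (n_ref + n_cur)
            statistic += (observed - expected) ** 2 / expected
    return statistic, chi2_sf(statistic, len(classes) - 1)


# Toy labels (illustrative): class 0 dominates Reference, class 2 dominates Current.
reference_labels = [0] * 60 + [1] * 30 + [2] * 10
current_labels = [0] * 10 + [1] * 30 + [2] * 60

statistic, p_value = chi2_target_drift(reference_labels, current_labels)
drift_detected = p_value < 0.05
```

Class 1 contributes nothing to the statistic here because its share is identical in both samples. The drift signal comes entirely from classes 0 and 2.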
When this drift is detected, you can again dig deeper into the behavior of the individual features. In this example, we took the well-known Iris dataset, so the feature list is rather short.
The visualisations show the feature values we tend to observe for each target class, and how they change over time.
In this case, you can quickly understand that we deal with the classic example of data drift. In Current data, we face a new input distribution, different from the Reference dataset.
The "old" classes we already know behave the same way. But a new prevalent class comes with a new feature space. For several features, we see new values visibly aligned with the label "2". In the image above, these are the observations with a petal width of more than 2 cm.
Training the model on this new dataset will likely help solve the issue.
Here are our suggestions on when to use it, best combined with our Data Drift report.
1. Before model retraining.
Before feeding fresh data into the model, you might want to verify whether it even makes sense.
When nothing changes, most likely an update will not improve the performance.
When things change too much, blind retraining might not be enough.
If you observe specific changes in the data or target function, you might want to:
The report will help you decide if certain features or segments need attention.
2. When you are debugging the model decay.
If you already observe a drop in performance, this report can help you with debugging. It quickly shows you what has changed so that you know how to explain and address it.
3. When you are flying blind, and no ground truth is available.
Not having immediate feedback is no reason to ignore your model. You can run this report every time you generate batch predictions or otherwise schedule some checks.
By observing what exactly changes, you can anticipate data and concept drift. In some cases, you might decide to pause your model. For example, if you suspect a significant change or face a data quality issue.
Go to GitHub, and explore the tool in action using sample notebooks.
For the most recent update on the functionality, explore the docs and examples folder on GitHub.
If you like it, spread the word.
Sign up to the User newsletter to get updates on new features, integrations and code tutorials. No spam, just good old release notes.
If you have any questions, contact us at hello@evidentlyai.com. This is an early release, so send any bugs that way, or open an issue on GitHub.