Evidently 0.1.35: Customize it! Choose the statistical tests, metrics, and plots to evaluate data drift and ML performance.

Last updated: January 23, 2025

Published: December 13, 2021

We are excited to announce our latest release!

TL;DR: You can now easily customize the pre-built Evidently reports with a bit of Python code: add your own metrics and statistical tests, or change the look of the dashboards.

Why do I need this?

Evidently provides a set of default reports to evaluate different aspects of production model performance, such as data drift.

Great defaults are helpful. They make it easy to start and run the tool when you don't want to customize every single thing from scratch. We'll continue improving them!

However, once you get used to the tool, you often want to change a thing or two to adapt it to your use case and workflow.

Evidently is an open-source Python library, so you could always change anything. Previously, though, that meant reading the codebase and implementing the change entirely on your own.

Now, we've made it much easier! Some frequently requested customization options are available directly in the tool. For others, you still need to add some minimal Python code, but you can follow simple instructions and examples.

Let's have a quick look at what you can do now!

You are reading a blog about an early Evidently release. This functionality has since been improved and simplified. You can read more about migrating to a single Reports object instead of Dashboards and JSON profiles and check out the current documentation for details.

Configure the default reports

Let's take the Data Drift report as an example. Now you can directly set the following options:

confidence to set the confidence level for the statistical tests.
drift_share to define the share of drifting features as a condition for the Dataset Drift.
nbinsx to define the number of bins in a histogram.
xbins to define the specific bin size.
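To make the roles of confidence and drift_share concrete, here is a minimal sketch of the decision logic they are assumed to control. It is not Evidently's code: the function names, the default values, the per-feature p-values, and the exact comparison rules (p-value below 1 - confidence, drifted share at or above drift_share) are illustrative assumptions.

```python
from typing import Dict


def feature_drifted(p_value: float, confidence: float = 0.95) -> bool:
    """Flag a feature when its p-value falls below 1 - confidence (assumed rule)."""
    return p_value < 1.0 - confidence


def dataset_drifted(p_values: Dict[str, float],
                    confidence: float = 0.95,
                    drift_share: float = 0.5) -> bool:
    """Flag the dataset when the share of drifted features reaches drift_share (assumed rule)."""
    if not p_values:
        return False
    drifted = sum(feature_drifted(p, confidence) for p in p_values.values())
    return drifted / len(p_values) >= drift_share


# Hypothetical p-values, one per feature, as a statistical test might return them.
p_values = {"MedInc": 0.002, "HouseAge": 0.30, "AveRooms": 0.04}

# Two of three features fall below the 0.05 threshold, so the share is above 0.5.
print(dataset_drifted(p_values, confidence=0.95, drift_share=0.5))
# With a stricter confidence level only one feature is flagged.
print(dataset_drifted(p_values, confidence=0.99, drift_share=0.5))
```

Raising confidence makes each per-feature test harder to trigger, while raising drift_share requires more features to drift before the whole dataset is flagged.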

You can also override the default statistical tests for data drift detection that Evidently runs based on the feature type. There are two ways to do that:

stattest_func lets you specify a statistical test that will be applied to all the features.
feature_stattest_func lets you define a statistical test applied to an individual feature.

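With a couple of lines of code, you can add a function to run your own statistical test. Here is an example that passes a user-defined anderson_stat_test function, which wraps the Anderson test, through the num_target_stattest_func option instead of relying on the Evidently defaults: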

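# Custom test passed via num_target_stattest_func, 99% confidence, per-feature histogram bins
options = DataDriftOptions(num_target_stattest_func=anderson_stat_test, confidence=0.99, nbinsx={'MedInc': 15, 'HouseAge': 25, 'AveRooms': 20})
# The same options object is handed to the dashboard together with the tabs
dashboard = Dashboard(tabs=[DataDriftTab(), NumTargetDriftTab()], options=[options])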

We suggest using statistical tests from scipy or statsmodels or implementing your own tests.
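The example above refers to anderson_stat_test without defining it. A minimal sketch of such a function, built on scipy.stats.anderson_ksamp, might look like this. It assumes that a custom test receives the reference and current values of one column and returns a p-value; check that contract against the documentation for your Evidently version.

```python
from typing import Sequence

from scipy.stats import anderson_ksamp


def anderson_stat_test(reference_data: Sequence[float],
                       current_data: Sequence[float]) -> float:
    """Return the p-value of the k-sample Anderson-Darling test for two samples."""
    result = anderson_ksamp([list(reference_data), list(current_data)])
    # The third field of the result is the approximate p-value. SciPy clips it
    # to the range [0.001, 0.25] and warns when the clipping happens.
    return float(result[2])


# Synthetic data for illustration only.
reference = [float(x) for x in range(100)]
shifted = [x + 200.0 for x in reference]

print(anderson_stat_test(reference, shifted))    # small p-value: the samples differ
print(anderson_stat_test(reference, reference))  # large p-value: no evidence of drift
```

Because the p-value is clipped, this test is best used for a yes/no drift decision rather than for ranking features by how strongly they drifted.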

There are similar customization options for the target drift report. Have a look at the complete documentation for more details. The options for the performance reports are on the way!

Choose between the short or long report

For each default Evidently report, we added a "short" version that includes only a limited set of widgets.

In some cases, the complete version of the report is too long, and you only want to get a quick summary. For example, you might not need to see the Error Bias widget in the regression performance report each time.

You can now get this shorter version by setting verbose_level to 0.

dashboard = Dashboard(tabs=[RegressionPerformanceTab(verbose_level=0)])

List and order the widgets to display

While the previous option is a new (shorter) variation of the default report, this one gives you more flexibility!

You can list the widgets you want to include from a particular report, and they will appear in the order you specify.

dashboard = Dashboard(tabs=[RegressionPerformanceTab(verbose_level=0, include_widgets=[
    "Regression Model Performance Report.",
    "Reference: Error Distribution",
    "Current: Error Distribution",
])])
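How verbose_level and include_widgets might interact can be sketched with a small stand-alone model. Nothing here is Evidently's implementation: the verbosity numbers attached to each widget, the "Error Bias" title, and the rule that an explicit include_widgets list takes precedence are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (widget title, minimum verbose_level at which it is shown); the levels are hypothetical.
AVAILABLE_WIDGETS: List[Tuple[str, int]] = [
    ("Regression Model Performance Report.", 0),
    ("Reference: Error Distribution", 0),
    ("Current: Error Distribution", 0),
    ("Error Bias", 1),  # placeholder title, imagined as part of the full report only
]


def select_widgets(verbose_level: int = 1,
                   include_widgets: Optional[List[str]] = None) -> List[str]:
    """Return the widget titles to display, in display order."""
    known = {title for title, _ in AVAILABLE_WIDGETS}
    if include_widgets:
        unknown = [title for title in include_widgets if title not in known]
        if unknown:
            raise ValueError(f"Unknown widgets: {unknown}")
        return list(include_widgets)  # keep the caller's order
    # Without an explicit list, fall back to filtering by verbosity.
    return [title for title, level in AVAILABLE_WIDGETS if level <= verbose_level]


print(select_widgets(verbose_level=0))
print(select_widgets(include_widgets=["Current: Error Distribution",
                                      "Reference: Error Distribution"]))
```

Rejecting unknown titles early is a design choice of this sketch: widget names are plain strings, so a typo would otherwise silently drop a widget from the report.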

Add your own widget

You can also add your implementation of any other test, metric, or visualization that you'd like to see on the dashboard. To do that, you should create a new Widget.

We suggest Plotly for visualization, since this is what Evidently uses: it keeps the look consistent and ensures interoperability.

The easiest way to create a custom Widget is to take an existing one as a reference and edit it.

Image: a custom Widget in Evidently. How about Evidently in green?

Here are a few cases where you might want to add a custom Widget:

To make and "save" edits to a default Evidently widget. Maybe you want to change the color of the dashboard? Move or add a line? Add a column to the table to show which statistical test was used? (Feel free to send us a Pull Request with this one!) As long as you can edit the Plotly code, everything is possible.
To add a custom metric or plot. There is a virtually unlimited list of machine learning metrics and visualizations one can build. You might want to keep using the Evidently reports, but add one or two plots that were missing in the defaults.
To add business metrics and KPIs. If you can calculate the business metrics your model affects, it is a great idea to add them to the performance reports. However, such metrics are typically calculated in a custom manner or imported directly. Now, you can create a slot in the report to visualize them.
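Since a Widget is drawn with Plotly, much of the customization happens in the figure itself. The sketch below builds a Plotly-style figure for a business KPI as a plain dictionary and colors it green. The function name, the KPI values, and the choice of a bar chart are illustrative; how the figure is wrapped into a Widget class is not shown because that interface is not described here.

```python
import json
from typing import Dict, Mapping


def kpi_bar_figure(kpi_by_period: Mapping[str, float],
                   title: str,
                   color: str = "green") -> Dict:
    """Build a bar chart as a dict that follows Plotly's figure structure (data + layout)."""
    return {
        "data": [{
            "type": "bar",
            "x": list(kpi_by_period.keys()),
            "y": list(kpi_by_period.values()),
            "marker": {"color": color},
        }],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": "Period"}},
            "yaxis": {"title": {"text": "Value"}},
        },
    }


# Hypothetical business metric, for example revenue attributed to the model.
figure = kpi_bar_figure({"2021-10": 120.0, "2021-11": 135.5, "2021-12": 128.2},
                        title="Revenue per month")

# The dict is JSON-serializable; with Plotly installed,
# plotly.graph_objects.Figure(figure) turns it into a figure object.
print(json.dumps(figure["data"][0]["marker"]))
```

Keeping the figure as plain data makes it easy to tweak a single property, such as the marker color, without touching the rest of the plot.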

Create your own report

Finally, you can now create your own Tab. That is the ultimate degree of freedom: you are no longer limited to the default Evidently reports. You can create your own report template that includes the exact combination of metrics, tests, and plots you want to see and reuse.

This functionality lets you create new Tabs (the dashboards displayed inside the Python notebook) and also export them as HTML reports. We will soon make it easy to add custom JSON profiles as well.

To add a custom Tab in Evidently, you need to create a new Tab class, give it a name, and list the widgets you want to include. You can list the existing Evidently widgets, the custom ones you have created, or both.
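The pattern described above (a named Tab that lists its widgets) can be modeled without Evidently. In this sketch, Widget and Tab are simplified stand-ins written for illustration, not the library's classes, and the widget titles and the two summary functions are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List, Sequence


@dataclass
class Widget:
    """Stand-in for a widget: a title plus a function of (reference, current) data."""
    title: str
    compute: Callable[[Sequence[float], Sequence[float]], str]


@dataclass
class Tab:
    """Stand-in for a tab: a name and an ordered list of widgets."""
    name: str
    widgets: List[Widget] = field(default_factory=list)

    def render(self, reference: Sequence[float], current: Sequence[float]) -> List[str]:
        # Every widget receives the same standardized inputs.
        return [f"{w.title}: {w.compute(reference, current)}" for w in self.widgets]


def mean_shift(reference: Sequence[float], current: Sequence[float]) -> str:
    return f"{mean(current) - mean(reference):+.2f}"


def size_change(reference: Sequence[float], current: Sequence[float]) -> str:
    return f"{len(reference)} -> {len(current)} rows"


my_tab = Tab(name="My Custom Tab",
             widgets=[Widget("Mean shift", mean_shift), Widget("Sample size", size_change)])

for line in my_tab.render(reference=[1.0, 2.0, 3.0], current=[2.0, 3.0, 4.0, 5.0]):
    print(line)
```

The point of the pattern is that the Tab owns only the list and its order, so existing and custom widgets can be mixed freely as long as they accept the same inputs.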

When might you want to create a custom Tab?

To use your custom Widgets as a part of the Evidently report. After creating a new Widget, you can add it to the existing components in the chosen Evidently report and save it under a new name. For example, you can add a couple more Widgets to analyze residuals to the Regression performance report and save it for regular use as a new Tab.
To combine Evidently widgets from different Tabs. For example, you might want to generate Data Drift and Prediction Drift dashboards together. You can now do that if generating them sequentially is not convenient for some reason.
To create a custom dashboard to explore a new aspect of the model performance. You can make your report on literally anything! Maybe you want to beat us to creating an Evidently data quality report? You can easily build your own version.

Even if you create an entirely new report using custom visualizations, it still makes sense to build it on Evidently, because you benefit from the underlying framework. You'll be able to generate your custom report with a couple of lines of code based on the same standardized data inputs, save it as an HTML file, display it in different notebook environments, and treat the pre-built report as a debuggable object.

Show us what you'll build with Evidently!

A call for contributions: if you create your own custom Widgets and Tabs that you want to share with the world, come and tell us about it!

We are now working on the list of Community examples and will be thrilled to showcase the best reports others can benefit from.

What about JSON profiles?

Technically, you can already create a custom JSON Profile. However, it will exist as a separate object, disconnected from the visual Tabs.

We will soon add the functionality to convert a new custom Tab into a JSON profile. We suggest waiting a bit to enjoy the full functionality, once all custom Tabs and JSON profiles are paired just like the default ones.

But if you are impatient, feel free to test it already!

Has anything else changed?

Evidently got a bunch of new tests and got rid of a bunch of the old bugs.

Also, we changed the implementation of the column_mapping.

The reason is that some of the options you can now add to the reports were previously available in the column_mapping. Now, we have two separate configurations. Any settings related to the data inputs, such as defining feature types, belong to the column_mapping. Any configurations specific to the given dashboards belong to the options.