
7 new features at Evidently: ranking metrics, data drift on Spark, and more

Last updated:
November 27, 2024

Did you miss some of the latest updates to the Evidently open-source Python library? We summarized a few recently shipped features in one blog post.

All these features are available in Evidently 0.4.11 and above.

We also send open-source release notes like this in the newsletter every couple of months. Sign up here.

🛒 Ranking and RecSys metrics

You can now evaluate and monitor your ranking and recommendation models in Evidently.

Monitor ranking and recommendation models with Evidently AI

What’s cool about it?

We covered not only standard metrics like Normalized Discounted Cumulative Gain (NDCG) or Recall at top-K but also behavioral metrics like Serendipity or Popularity Bias.
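
To make the headline metric concrete, here is a minimal standalone sketch of the NDCG@k formula in plain Python. This illustrates the metric itself and is not Evidently's implementation:

```python
import math

def ndcg_at_k(relevances, k):
    """NDCG@k for one ranked list of graded relevance scores.

    `relevances` is ordered by the model's ranking; the ideal DCG
    uses the same scores sorted in descending order.
    """
    def dcg(scores):
        # Discounted cumulative gain: later positions count less.
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A perfect ranking scores 1.0; a reversed ranking scores lower.
print(ndcg_at_k([3, 2, 1, 0], k=4))  # 1.0
print(ndcg_at_k([0, 1, 2, 3], k=4))  # between 0 and 1
```

In Evidently, you get this and the other ranking metrics computed for you across the whole dataset, rather than per list.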

🚦 Warnings in Test Suites 

You can set Warnings for non-critical Tests in a Test Suite. If you want a Test to return a “Warning” instead of a “Fail,” set its “is_critical” parameter to False.

Test criticality for Evidently Test Suites

What’s cool about it?

You can flexibly design alerting and logging workflows by splitting the Tests into groups: for example, set alerts only on critical failures and treat the rest as informational reports. 
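
The mapping from criticality to outcome can be sketched like this. The class and helper below are an illustrative model of the behavior, not the Evidently API:

```python
from dataclasses import dataclass

@dataclass
class TestResult:
    """Toy model of a single test outcome (illustrative, not Evidently's class)."""
    name: str
    passed: bool
    is_critical: bool = True

    @property
    def status(self):
        if self.passed:
            return "SUCCESS"
        # Non-critical failures downgrade to a warning.
        return "FAIL" if self.is_critical else "WARNING"

def should_alert(results):
    # Alert only on critical failures; warnings stay informational.
    return any(r.status == "FAIL" for r in results)

results = [
    TestResult("share_of_drifted_columns", passed=False),                   # critical
    TestResult("share_of_missing_values", passed=False, is_critical=False), # warning
]
print([r.status for r in results])  # ['FAIL', 'WARNING']
print(should_alert(results))        # True
```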

📈 Monitoring test outcomes 

Are you computing Test Suites on a cadence? You can now add a new type of monitoring panel to track the results of each Test Suite over time in the Evidently UI.

Monitoring test outcomes with Evidently AI

This is in addition to all the panels that help visualize the metric values. You can also choose which subset of tests to show together using tags. Say, you can add one monitoring panel to track failed data quality checks, another for data drift, and so on.    

What’s cool about it?

You can choose a detailed view option. It will show not just the combined results but also a granular breakdown of all tests, such as which exact features drifted. 

🏗 Near real-time monitoring 

You can deploy an Evidently collector service to integrate with your ML service.

Near real-time ML monitoring with Evidently AI

In this scenario, you can POST your input data and model predictions directly from your ML service. The Evidently service will collect online events into batches, create Reports or Test Suites over them, and save them as snapshots you can later visualize in the monitoring UI.
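
The collection pattern described above — buffer online events, then compute over each batch — can be sketched as follows. The class and callback names are hypothetical, not Evidently's internals:

```python
class EventCollector:
    """Toy sketch of batching online events for periodic evaluation."""

    def __init__(self, batch_size, on_batch):
        self.batch_size = batch_size
        self.on_batch = on_batch  # e.g. compute a Report or Test Suite over the batch
        self._buffer = []

    def post(self, event):
        # Called for each incoming prediction event.
        self._buffer.append(event)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Hand the accumulated batch to the evaluation callback.
        if self._buffer:
            self.on_batch(list(self._buffer))
            self._buffer.clear()

snapshots = []
collector = EventCollector(batch_size=3, on_batch=snapshots.append)
for i in range(7):
    collector.post({"prediction": i})
collector.flush()  # flush the partial final batch
print([len(b) for b in snapshots])  # [3, 3, 1]
```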

What’s cool about it?

No need to write Python code or manage monitoring jobs: you can define the monitoring setup via a configuration file.  

🛢️ Data drift on Spark

You can finally run data drift calculations on Spark.

Data drift calculation on Spark with Evidently AI

We currently support only some of the drift detection methods on Spark, but we’ll be adding more over time. Which metrics would you like to see on Spark next? Open an issue on GitHub and tell us.
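
For context on what a drift calculation computes, here is a minimal standalone sketch of one common drift statistic, the Population Stability Index (PSI), in plain Python — shown for illustration only, independent of Evidently's Spark implementation:

```python
import math

def psi(reference, current, bins=10):
    """Population Stability Index between two numeric samples.

    Bins both samples over a shared range and compares bin shares;
    larger values mean a bigger distribution shift.
    """
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / (hi - lo) * bins), bins - 1) if hi > lo else 0
            counts[idx] += 1
        # Smooth empty bins to keep the logarithm finite.
        return [(c + 1e-6) / (len(sample) + 1e-6 * bins) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

same = list(range(100))
shifted = [x + 50 for x in same]
print(psi(same, same))          # 0.0: identical distributions
print(psi(same, shifted) > 0.2)  # True: clear shift (0.2 is a common alert threshold)
```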

What’s cool about it?

If you work with large datasets, your life just got much easier!

📊 Monitoring UI updates

Our Monitoring UI is getting better day by day!

Evidently ML Monitoring UI

You can now browse tags in the interface as you look for individual Reports or Test Suites, easily switch between different monitoring periods, view metadata, and more!

🔢 Feature importance in drift detection

You can show the feature importances on the Data Drift dashboard. 

Feature importance in drift detection with Evidently AI

This will help sort features by importance when viewing the data drift results.

What’s cool about it?

You can pass the feature importances as a list. If you don’t, Evidently can train a background model and derive the importances from it.
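
The effect of supplying importances is essentially a sort of the drift results. The feature names and values below are made up for illustration; this is not Evidently's code:

```python
# Hypothetical drift scores and importances per feature (illustrative values).
drift_results = {"age": 0.8, "income": 0.3, "city": 0.9, "tenure": 0.1}
importances = {"age": 0.45, "income": 0.30, "city": 0.05, "tenure": 0.20}

# Most important features first, so their drift status is read first.
ranked = sorted(drift_results, key=lambda f: importances.get(f, 0.0), reverse=True)
print(ranked)  # ['age', 'income', 'tenure', 'city']
```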

☁️ Evidently Cloud is in private beta

Evidently Cloud

Do you want to use the Evidently Monitoring UI without self-hosting? Evidently Cloud is currently in private beta. Sign up here to join our early tester program.

Want to stay in the loop?
Sign up for the User newsletter to get updates on new features, integrations, and code tutorials. No spam, just good old release notes.

Subscribe ⟶
