Learn about AI observability, ML and LLM evaluation, and MLOps with our in-depth guides.
course
LLM evaluations for AI product teams
Building an LLM-powered product? Sign up for our free course on LLM evaluations for AI product teams. A gentle introduction to evaluating LLM-powered apps, no coding knowledge required.
How to evaluate the quality of generative outputs and LLM-based systems? In this guide, we break down different evaluation methods and metrics for AI-powered products.
How to maintain ML models after you deploy them, and what exactly to prepare for? In this guide, we cover the key concepts of production ML model operations.
"ML monitoring" can mean many things. Are you tracking service latency? Model accuracy? Data quality? This guide organizes everything one can look at in a single framework.
We ran an experiment to help build intuition for how popular drift detection methods behave. In this guide, we share the key takeaways and the code to run the tests on your own data.
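To give a flavor of what such a drift test looks like in code, here is a minimal sketch of a per-feature check using the two-sample Kolmogorov-Smirnov test, one of the commonly used methods. The sample data, column roles, and 0.05 threshold are illustrative assumptions, not the exact setup from the guide.

```python
# Minimal sketch: two-sample Kolmogorov-Smirnov drift check on one feature.
# The synthetic data and the 0.05 threshold are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=1_000)  # e.g. training data
current = rng.normal(loc=0.3, scale=1.0, size=1_000)    # e.g. recent production data

statistic, p_value = stats.ks_2samp(reference, current)
drift_detected = p_value < 0.05
print(f"KS statistic={statistic:.3f}, p-value={p_value:.4f}, drift={drift_detected}")
```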
Monitoring embedding drift is relevant for the production use of LLM and NLP models. We ran experiments to compare 5 drift detection methods. Here is what we found.
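As one example of an embedding drift signal (not necessarily the exact methods compared in the experiment), a simple approach is to measure the cosine distance between the mean embedding vectors of a reference window and a current window. The random vectors below stand in for real model embeddings.

```python
# Minimal sketch: cosine distance between mean embeddings of two data windows.
# Random vectors are placeholders for real embeddings; sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
reference_embeddings = rng.normal(size=(500, 384))          # e.g. older texts
current_embeddings = rng.normal(loc=0.2, size=(500, 384))   # e.g. recent texts

ref_mean = reference_embeddings.mean(axis=0)
cur_mean = current_embeddings.mean(axis=0)

cosine_distance = 1 - np.dot(ref_mean, cur_mean) / (
    np.linalg.norm(ref_mean) * np.linalg.norm(cur_mean)
)
print(f"Cosine distance between mean embeddings: {cosine_distance:.4f}")
```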
How to evaluate the quality of a classification model? In this guide, we break down different machine learning metrics for binary and multi-class problems.
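For a quick taste of the metrics covered, here is a minimal sketch that computes accuracy, precision, recall, and F1 for a binary problem with scikit-learn. The labels and predictions are toy values, purely for illustration.

```python
# Minimal sketch: common binary classification metrics with scikit-learn.
# y_true and y_pred are toy values used only to illustrate the calls.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
```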