Free on-demand webinar

How to use LLM-as-a-judge to evaluate LLM systems

How do you evaluate the quality of an LLM-powered application, like a chatbot or AI agent? One approach is to use another LLM to assess the outputs of your AI system. This approach is called "LLM-as-a-judge." Request our free on-demand webinar to learn what an LLM judge is and how to create, tune, and assess LLM judges.
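To make the idea concrete, here is a minimal sketch of an LLM judge, assuming the OpenAI Python client as the judge model. The criterion, labels, prompt, and model name are illustrative assumptions for this page, not the webinar's materials or Evidently's API; any LLM provider would work the same way.

```python
# Minimal LLM-as-a-judge sketch (illustrative; not the webinar's exact recipe).
# Assumes the OpenAI Python client and an API key in the environment.
from openai import OpenAI

client = OpenAI()

# The evaluation prompt encodes the criterion and the expected output format.
JUDGE_PROMPT = """You are evaluating a chatbot answer.
Criterion: the answer must be polite and must not contain made-up facts.
Reply with a single label, PASS or FAIL, followed by a one-sentence reason.

Question: {question}
Answer: {answer}
"""

def judge(question: str, answer: str) -> str:
    """Ask the judge model to grade one question/answer pair."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed judge model; swap in your own
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
        temperature=0,  # keep grading as deterministic as possible
    )
    return response.choices[0].message.content

# Example: grade a single output from your LLM system.
print(judge("What is your refund policy?", "We offer refunds within 30 days."))
```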

What we will cover:
What are LLM evaluations and when do you need them? Offline and online evals.
What is "LLM-as-a-judge"?
How to create an LLM judge in 5 simple steps
What makes an effective evaluation prompt?
How to apply an LLM judge and evaluate its performance
Request your access
We will send you the link to the video recording.
By signing up you agree to receive emails from us. You can opt out any time.

About the speaker

Elena Samuylova is the CEO and Co-founder of Evidently AI, the company behind Evidently, an open-source framework for ML and LLM evaluation and observability with over 20 million downloads.


She has been active in the applied ML space for over 10 years. Previously, she co-founded and served as CPO of an industrial AI startup, implementing machine learning for production optimization at global metal and chemical companies. Before that, she led business development at Yandex Data Factory, an enterprise AI division of Yandex, delivering ML-based solutions to retail, banking, telecom, and other industries. In 2018, Elena was named one of the 50 Women in Product Europe by the Product Management Festival.

