Many challenges hinder the creation of high-quality, production-grade LLM applications, including:
📄️ Quick Start
In this walkthrough, you will evaluate a chain over a dataset of examples. To do so, you will:
Datasets are collections of examples that can be used to evaluate or otherwise improve a chain, agent, or model. Examples are rows in the dataset, each containing the inputs and (optionally) expected outputs for a given interaction. Below we will go over the current types of datasets as well as different ways to create them.
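As a rough sketch of the shape described above, each example pairs an `inputs` mapping with an optional `outputs` mapping. The field names and values here are illustrative, not the LangSmith API itself:

```python
# Illustrative dataset: a list of example rows, each with inputs and
# (optionally) expected outputs for a given interaction.
dataset = [
    {"inputs": {"question": "What is 2 + 2?"}, "outputs": {"answer": "4"}},
    # Expected outputs are optional; None marks an example without a reference.
    {"inputs": {"question": "Name a primary color."}, "outputs": None},
]

for example in dataset:
    print(example["inputs"], "->", example["outputs"])
```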
📄️ LangChain Evaluators
LangChain's evaluation module provides evaluators you can use as-is for common evaluation scenarios.
📄️ Custom Evaluators
In this guide, you will create a custom string evaluator for your agent. You can choose to use LangChain components or write your own custom evaluator from scratch.
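A custom string evaluator written from scratch can be as simple as a class that compares a prediction against a reference and returns a score. This is a minimal, self-contained sketch; the class name and method signature are illustrative rather than LangChain's actual evaluator interface:

```python
class ExactMatchEvaluator:
    """Illustrative custom string evaluator: scores a prediction
    1.0 on an exact (whitespace-insensitive) match with the
    reference answer, else 0.0."""

    def evaluate_strings(self, *, prediction: str, reference: str) -> dict:
        # Normalize surrounding whitespace before comparing.
        score = 1.0 if prediction.strip() == reference.strip() else 0.0
        return {"score": score}


evaluator = ExactMatchEvaluator()
result = evaluator.evaluate_strings(prediction="4", reference="4")
print(result)  # {'score': 1.0}
```

The same pattern extends to fuzzier scoring (embedding distance, LLM-as-judge) by swapping out the comparison inside `evaluate_strings`.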
This guide will walk you through feedback in LangSmith. For more end-to-end examples incorporating feedback into a workflow, see the LangSmith Cookbook.
🗃️ Additional Resources