How to evaluate on a specific dataset version

Recommended reading

Before diving into this content, it might be helpful to read the guides on versioning datasets and fetching examples.

Because evaluate accepts an iterable of examples as its data argument, you can evaluate on a particular version of a dataset. Use list_examples / listExamples with as_of / asOf to fetch the examples from a particular version tag, then pass them to evaluate.

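from langsmith import Client, evaluate

client = Client()

# toxic_dataset_name, label_text, and correct_label are assumed to be defined earlier.
# Fetch the examples as they exist at the "latest" version tag.
latest_data = client.list_examples(dataset_name=toxic_dataset_name, as_of="latest")

results = evaluate(
    lambda inputs: label_text(inputs["text"]),
    data=latest_data,
    evaluators=[correct_label],
    experiment_prefix="Toxic Queries",
)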
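If you want to run several experiments against the same version, it may help to materialize the examples into a list first, since list_examples returns a lazy iterator. The sketch below is one way to do that. The helper name fetch_examples_as_of is hypothetical and not part of the SDK, and it assumes that as_of accepts either a version tag or a datetime, and that treating a naive datetime as UTC is acceptable for your use case.

```python
from datetime import datetime, timezone
from typing import Any, List, Union


def fetch_examples_as_of(
    client: Any, dataset_name: str, version: Union[str, datetime]
) -> List[Any]:
    """Return the examples of `dataset_name` as of `version`, as a list.

    `version` is forwarded to `as_of`, so it can be a version tag such as
    "latest" or a datetime (hypothetical helper, not an SDK function).
    """
    if isinstance(version, datetime) and version.tzinfo is None:
        # Assumption: interpret naive datetimes as UTC to avoid ambiguity.
        version = version.replace(tzinfo=timezone.utc)
    # Materialize the iterator so the same examples can be reused across runs.
    return list(client.list_examples(dataset_name=dataset_name, as_of=version))


# Usage sketch (needs the langsmith package and API credentials; the date
# below is an arbitrary placeholder):
#
#   from langsmith import Client, evaluate
#   pinned = fetch_examples_as_of(Client(), toxic_dataset_name, datetime(2024, 1, 1))
#   evaluate(lambda inputs: label_text(inputs["text"]), data=pinned,
#            evaluators=[correct_label], experiment_prefix="Toxic Queries")
```

Passing the resulting list as data works the same way as passing the iterator, because evaluate only needs an iterable of examples.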
