Skip to main content

How to version datasets


Any time you add, update, or delete examples in your dataset, a new version of your dataset is created. This allows you to track changes to your dataset over time and to understand how your dataset has evolved.

By default, the version is defined by the timestamp of the change. When you click on a particular version of a dataset (by timestamp) in the "Examples" tab, you can see the state of the dataset at that point in time.

Version Datasets

Note that examples are read-only when viewing a past version of the dataset. You will also see the operations that were between this version of the dataset and the "latest" version of the dataset. Also, by default the latest version of the dataset is shown in the "Examples" tab and experiments from all versions are shown in the "Tests" tab.

In the "Tests" tab, you can see the results of tests run on the dataset at different versions.

Version Datasets

Tagging versions

You can also tag versions of your dataset to give them a more human-readable name. This can be useful for marking important milestones in your dataset's history.

For example, you might tag a version of your dataset as "prod" and use it to run tests against your LLM pipeline.

Tagging can be done in the UI by clicking on "+ Tag this version" in the "Examples" tab.

Tagging Datasets

You can also tag versions of your dataset using the SDK. Here's an example of how to tag a version of a dataset using the python SDK:

from langsmith import Client
fromt datetime import datetime

client = Client()

initial_time = datetime(2024, 1, 1, 0, 0, 0) # The timestamp of the version you want to tag

# You can tag a specific dataset version with a semantic name, like "prod"
dataset_name=toxic_dataset_name, as_of=initial_time, tag="prod"

Running experiments on specific versions of datasets

You can execute an experiment on a specific version of a dataset in the sdk by using the as_of parameter in list_examples. as_of

Here is an example of how to run an experiment on a specific version of a dataset using the python SDK:

from langsmith.evaluation import evaluate
from langsmith import Client

client = Client()

result = evaluate(
lambda inputs: label_query(**inputs), # Your target to evaluate, [defined elsewhere]
data=client.list_examples(dataset_name="my_dataset", as_of="prod"),
evaluators=[correct_label], # Your evaluators, [defined elsewhere]
experiment_prefix="dataset versioning example",

Was this page helpful?

You can leave detailed feedback on GitHub.