Data Purging for Compliance
This guide covers the features that help you meet your privacy goals once data has reached LangSmith Cloud servers.
Data Retention
LangSmith provides automatic data retention capabilities to help with compliance and storage management. Data retention policies can be configured at the organization and project levels.
For detailed information about data retention configuration and management, please refer to the Data Retention concepts documentation.
Trace Deletes
You can delete traces through the API. The API supports two methods for deleting traces:
- By trace IDs and session ID: Delete specific traces by providing a list of trace IDs and their corresponding session ID (up to 1000 traces per request)
- By metadata: Delete traces across a workspace that match any of the specified metadata key-value pairs
For more details, refer to the API spec.
Every trace deletion also deletes related entities such as feedback, aggregations, and stats across all data stores.
Deletion Timeline
Trace deletions are not instant. They are processed during off-peak hours and usually complete within a few hours. There is no confirmation of deletion, so you'll need to query the data again to verify that it has been removed.
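Because no confirmation is sent, verification comes down to re-querying until the data is gone. The following is a minimal Python sketch of that loop, not an official client. The `still_exists` callable is hypothetical: it stands in for whatever query you already use to look up the trace and is not part of the LangSmith API. The interval and timeout defaults are illustrative assumptions.

```python
import time
from typing import Callable


def wait_for_deletion(
    still_exists: Callable[[], bool],
    interval_seconds: float = 900.0,
    timeout_seconds: float = 24 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-check until the data is gone or the timeout elapses.

    `still_exists` is a hypothetical, caller-supplied function that returns
    True while the trace can still be queried. Returns True once the data is
    no longer found and False if the timeout is reached first.
    """
    waited = 0.0
    while True:
        if not still_exists():
            return True
        if waited >= timeout_seconds:
            return False
        sleep(interval_seconds)
        waited += interval_seconds
```

Since deletions usually take hours rather than seconds, a long interval keeps the number of verification queries small.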
Delete Specific Traces
To delete specific traces by their trace IDs from a single session:
curl -X POST "https://api.smith.langchain.com/api/v1/runs/delete" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"run_ids": ["trace-id-1", "trace-id-2", "trace-id-3"],
"session_id": "session-id-1"
}'
Delete by Metadata
When deleting by metadata:
- Accepts a metadata object of key-value pairs. Key-value matching uses an OR condition: a trace matches if it has any of the key-value pairs specified in metadata (not all).
- You don't need to specify a session ID when deleting by metadata. Deletes apply across the workspace.
To delete traces based on metadata across a workspace (matches any of the metadata key-value pairs):
curl -X POST "https://api.smith.langchain.com/api/v1/runs/delete" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"metadata": {
"user_id": "user123",
"environment": "staging"
}
}'
This will delete traces that have either user_id: "user123" or environment: "staging" in their metadata.
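The OR semantics are easy to misread, so here is a small local illustration in Python. It models the matching rule described in this guide and is not the server's implementation. The function name `matches_any` and the exact-equality comparison are assumptions made for the sketch.

```python
from typing import Any, Mapping


def matches_any(
    trace_metadata: Mapping[str, Any], filter_metadata: Mapping[str, Any]
) -> bool:
    """Return True if at least one filter key-value pair appears in the trace metadata."""
    return any(
        key in trace_metadata and trace_metadata[key] == value
        for key, value in filter_metadata.items()
    )


delete_filter = {"user_id": "user123", "environment": "staging"}

# Matches on environment only, so this trace would be selected for deletion.
print(matches_any({"user_id": "user999", "environment": "staging"}, delete_filter))
# Shares no key-value pair with the filter, so this trace would be kept.
print(matches_any({"user_id": "user999", "environment": "production"}, delete_filter))
```

Because the pairs are combined with OR, adding more pairs to the metadata object widens the set of deleted traces rather than narrowing it.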
Remember that you can schedule at most 1000 traces per session per request. For larger deletions, you'll need to make multiple requests.
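For deletions by trace ID that exceed the limit, the ID list can be split into batches, one request body per batch. This is a minimal Python sketch under the assumption that the body fields mirror the curl example in the Delete Specific Traces section. The helper name `build_delete_payloads` is hypothetical, and sending each body to the API is left to your HTTP client.

```python
from typing import Any, Dict, Iterator, List

MAX_TRACES_PER_REQUEST = 1000  # per-request limit stated in this guide


def build_delete_payloads(
    trace_ids: List[str],
    session_id: str,
    batch_size: int = MAX_TRACES_PER_REQUEST,
) -> Iterator[Dict[str, Any]]:
    """Yield one request body per batch of at most `batch_size` trace IDs.

    Field names are copied from the curl example for deleting specific traces.
    """
    if not 1 <= batch_size <= MAX_TRACES_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 1000")
    for start in range(0, len(trace_ids), batch_size):
        yield {
            "run_ids": trace_ids[start:start + batch_size],
            "session_id": session_id,
        }


# 2500 IDs from one session become three request bodies.
ids = [f"trace-id-{i}" for i in range(2500)]
for payload in build_delete_payloads(ids, "session-id-1"):
    print(len(payload["run_ids"]))
```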
Example Deletes
You can delete dataset examples yourself through the API, which supports both soft and hard deletion depending on your data retention needs.
Hard deletes will permanently remove inputs, outputs, and metadata from ALL versions of the specified examples across the entire dataset history.
Deleting Examples is a Two-Step Process
For bulk operations, example deletion follows a two-step process:
1. Search for Examples by Metadata
Find all examples with matching metadata across all datasets in a workspace.
- as_of must be explicitly specified as a timestamp. Only examples created before the as_of date will be returned.
curl -X GET "https://api.smith.langchain.com/api/v1/examples?as_of=2024-01-01T00:00:00Z" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"metadata": {
"user_id": "user123",
"environment": "staging"
}
}'
This will return examples that have either user_id: "user123" or environment: "staging" in their metadata across all datasets in your workspace.
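Since as_of has to be supplied explicitly, it can help to generate it from a timezone-aware datetime rather than typing it by hand. The sketch below builds the search URL in the same timestamp format as the curl example. The helper name `build_search_url` is hypothetical, and formatting the timestamp as UTC with a trailing Z is an assumption based on that example.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

EXAMPLES_URL = "https://api.smith.langchain.com/api/v1/examples"


def build_search_url(as_of: datetime) -> str:
    """Return the examples search URL with `as_of` as a UTC ISO 8601 timestamp."""
    if as_of.tzinfo is None:
        # A naive datetime is ambiguous, so refuse it instead of guessing a zone.
        raise ValueError("as_of must be timezone-aware")
    stamp = as_of.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{EXAMPLES_URL}?{urlencode({'as_of': stamp})}"


print(build_search_url(datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

`urlencode` percent-encodes the colons in the timestamp, which is the standard URL encoding of the same value.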
2. Hard Delete Examples
Once you have the example IDs, send a delete request. This will zero out the inputs, outputs, and metadata from all versions of the dataset for that example.
- Specify the example IDs and add hard_delete=true to the query parameters of the request
curl -X DELETE "https://api.smith.langchain.com/api/v1/examples?hard_delete=true" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"example_ids": ["example-id-1", "example-id-2", "example-id-3"]
}'
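The same request can be assembled in Python with the standard library. This sketch only mirrors the curl example above (URL, headers, and body are copied from it) and is not an official client. The helper name `build_hard_delete_request` is hypothetical. The request is built but not sent, so that the irreversible step stays explicit.

```python
import json
import urllib.request
from typing import List

EXAMPLES_URL = "https://api.smith.langchain.com/api/v1/examples"


def build_hard_delete_request(
    example_ids: List[str], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) the hard-delete request from the curl example."""
    body = json.dumps({"example_ids": example_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{EXAMPLES_URL}?hard_delete=true",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="DELETE",
    )


request = build_hard_delete_request(
    ["example-id-1", "example-id-2", "example-id-3"], "YOUR_API_KEY"
)
print(request.get_method(), request.full_url)

# Hard deletes are permanent, so sending is a separate, deliberate step:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```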
Deletion Types
Soft Delete (Default)
- Creates tombstoned entries with NULL inputs/outputs in the dataset
- Preserves historical data and maintains dataset versioning
- Only affects the current version of the dataset
Hard Delete
- Permanently removes inputs, outputs, and metadata from ALL dataset versions
- Use when compliance requires complete removal, with data zeroed out across all versions
- Add hard_delete=true to the query parameters
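The difference between the two deletion types can be pictured with a toy in-memory model of one example's version history. This is purely illustrative and does not reflect how LangSmith stores data. The dictionary layout and function names are assumptions, as is the choice to leave metadata in place on a soft-delete tombstone (the description above only mentions NULL inputs and outputs).

```python
from typing import Any, Dict, List, Optional

Version = Dict[str, Optional[Any]]


def soft_delete(versions: List[Version]) -> List[Version]:
    """Add a tombstone as the new current version; earlier versions are untouched."""
    tombstone: Version = {
        "inputs": None,
        "outputs": None,
        # Assumption: metadata is carried over, since only inputs/outputs are nulled.
        "metadata": versions[-1]["metadata"],
    }
    return versions + [tombstone]


def hard_delete(versions: List[Version]) -> List[Version]:
    """Zero out inputs, outputs, and metadata in every version."""
    return [{"inputs": None, "outputs": None, "metadata": None} for _ in versions]


history: List[Version] = [
    {"inputs": {"q": "v1"}, "outputs": {"a": "v1"}, "metadata": {"user_id": "user123"}},
    {"inputs": {"q": "v2"}, "outputs": {"a": "v2"}, "metadata": {"user_id": "user123"}},
]

print(soft_delete(history)[0]["inputs"])  # older version still holds its data
print(hard_delete(history)[0]["inputs"])  # data is gone from every version
```

In this model, a soft delete leaves user data readable in older versions, which is why the hard delete is the option to reach for when the data itself must be removed.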
For more details, refer to the API spec.