

In this guide we will go over some of the concepts that are important to understand when thinking about production logging and automations in LangSmith. Many of these concepts build on tracing concepts, so it is recommended to read the Tracing Concepts documentation first.


Runs

A Run is a span representing a single unit of work or operation within your LLM application. This could be anything from a single call to an LLM or a chain, to a prompt formatting call, to a runnable lambda invocation. If you are familiar with OpenTelemetry, you can think of a run as a span.


Traces

A Trace is a collection of runs that are related to a single operation. For example, if you have a user request that triggers a chain, and that chain makes a call to an LLM, then to an output parser, and so on, all of these runs would be part of the same trace. If you are familiar with OpenTelemetry, you can think of a LangSmith trace as a collection of spans. Runs are bound to a trace by a unique trace ID.
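The relationship between runs and traces can be sketched in plain Python. The field names below (id, name, trace_id, parent_id) are illustrative, not the LangSmith API; the point is that every run carries the ID of the trace it belongs to, so a trace is simply the set of runs sharing that ID.

```python
from collections import defaultdict

# Hypothetical run records from the chain example above: a root chain run,
# an LLM call, and an output parser, all sharing one trace ID.
runs = [
    {"id": "run-1", "name": "AgentChain", "trace_id": "trace-a", "parent_id": None},
    {"id": "run-2", "name": "ChatOpenAI", "trace_id": "trace-a", "parent_id": "run-1"},
    {"id": "run-3", "name": "OutputParser", "trace_id": "trace-a", "parent_id": "run-1"},
    {"id": "run-4", "name": "AgentChain", "trace_id": "trace-b", "parent_id": None},
]

# Group runs into traces by their shared trace ID.
traces = defaultdict(list)
for run in runs:
    traces[run["trace_id"]].append(run["id"])

print(dict(traces))
# {'trace-a': ['run-1', 'run-2', 'run-3'], 'trace-b': ['run-4']}
```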


Filters

A Filter is a rule that applies to all runs in a project. The default filter you see on the main page selects only top-level runs (IsRoot is true), but you can filter for sub-runs as well.

You can even filter for runs that are part of a trace with a particular attribute, for example filtering for runs with name "ChatOpenAI" that are part of a trace with user_score equal to 0.
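That example filter can be modeled as a predicate over an in-memory collection of runs. This is a toy sketch with made-up field names, not LangSmith's filter query language: it selects runs named "ChatOpenAI" whose enclosing trace has a user_score of 0.

```python
# Illustrative run records and per-trace feedback scores.
runs = [
    {"id": "r1", "name": "ChatOpenAI", "trace_id": "t1", "is_root": False},
    {"id": "r2", "name": "ChatOpenAI", "trace_id": "t2", "is_root": False},
    {"id": "r3", "name": "Retriever", "trace_id": "t1", "is_root": False},
]
trace_feedback = {"t1": {"user_score": 0}, "t2": {"user_score": 1}}

# Keep runs named "ChatOpenAI" whose trace has user_score equal to 0.
matches = [
    r["id"]
    for r in runs
    if r["name"] == "ChatOpenAI"
    and trace_feedback.get(r["trace_id"], {}).get("user_score") == 0
]
print(matches)  # ['r1']
```

Note that the run-level condition (name) and the trace-level condition (user_score) are evaluated against different objects, which is what makes trace-scoped filters more powerful than plain run filters.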


Threads

A Thread is a sequence of traces representing a conversation. Each response is represented as its own trace, but these traces are linked together by being part of the same conversation.

You can track threads by attaching a special metadata key to runs (one of session_id, thread_id or conversation_id).

See this documentation for how to add metadata keys to a trace.
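Conceptually, threading is just grouping traces by whichever of those special metadata keys is present. A minimal sketch, with invented trace records (the key names session_id, thread_id, and conversation_id come from the text above):

```python
from collections import defaultdict

THREAD_KEYS = ("session_id", "thread_id", "conversation_id")

# Hypothetical traces, each tagged with one of the special metadata keys.
traces = [
    {"id": "t1", "metadata": {"session_id": "conv-42"}},
    {"id": "t2", "metadata": {"thread_id": "conv-42"}},
    {"id": "t3", "metadata": {"conversation_id": "conv-7"}},
]

def thread_key(trace):
    # Use whichever of the special metadata keys is present on the trace.
    for key in THREAD_KEYS:
        if key in trace["metadata"]:
            return trace["metadata"][key]
    return None

threads = defaultdict(list)
for trace in traces:
    threads[thread_key(trace)].append(trace["id"])

print(dict(threads))  # {'conv-42': ['t1', 't2'], 'conv-7': ['t3']}
```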

Monitoring Dashboards

Monitoring dashboards display key metrics over time for a configurable time window. We track LLM-related statistics like latency, feedback, time-to-first-token, cost, and more. You can see these dashboards by going to the Monitors tab.
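A time-windowed metric like average latency reduces to bucketing runs by their start time and aggregating within each bucket. A minimal sketch with made-up timings (seconds) and 60-second windows:

```python
from statistics import mean

# Hypothetical run timings, in seconds since an arbitrary origin.
runs = [
    {"start": 5, "end": 7},
    {"start": 30, "end": 31},
    {"start": 70, "end": 74},
]

# Bucket runs into 60-second windows and average latency per window.
buckets = {}
for run in runs:
    window = run["start"] // 60
    buckets.setdefault(window, []).append(run["end"] - run["start"])

avg_latency = {w: mean(latencies) for w, latencies in buckets.items()}
print(avg_latency)  # {0: 1.5, 1: 4}
```

The same shape applies to the other dashboard statistics: swap the latency computation for feedback scores, time-to-first-token, or cost per run.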


Rules

Rules represent actions that are taken on a set of runs. These are configured by specifying a filter, a sampling rate, and an action to take. Currently, the supported actions are adding to a dataset, adding to an annotation queue, and sending to online evaluation.

An example rule, in plain English: "Run a 'vagueness' evaluator on 70% of root runs with a feedback score of 0 for the feedback key user_score."
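The three parts of a rule (filter, sampling rate, action) can be sketched as a small function. This is an illustration of the concept, not how LangSmith executes rules server-side; the run fields and the stand-in action are invented for the example.

```python
import random

def apply_rule(runs, filter_fn, sampling_rate, action, rng):
    """Apply an action to a sampled subset of the runs matching a filter."""
    for run in runs:
        if filter_fn(run) and rng.random() < sampling_rate:
            action(run)

# The plain-English rule above: act on 70% of root runs whose
# user_score feedback is 0 (field names are illustrative).
runs = [
    {"id": f"r{i}", "is_root": True, "feedback": {"user_score": 0}}
    for i in range(10)
]
selected = []
apply_rule(
    runs,
    filter_fn=lambda r: r["is_root"] and r["feedback"].get("user_score") == 0,
    sampling_rate=0.7,
    action=lambda r: selected.append(r["id"]),  # stand-in for "run evaluator"
    rng=random.Random(0),  # seeded only so the sketch is reproducible
)
print(len(selected))
```

Sampling is per-run, so roughly (not exactly) 70% of the matching runs receive the action.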


Datasets

Datasets are a way to collect examples, which are input-output pairs. You can use datasets for evaluation, as well as fine-tuning and few-shot prompting. For more information, see here.

Annotation Queues

Annotation Queues provide a user-friendly way to annotate a large number of traces.

Online Evaluation

Online evaluation uses an LLM to assign feedback to particular runs as they arrive.
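The shape of an online evaluator can be sketched as a judge function that scores a run's output and attaches the score as feedback. Here a trivial keyword heuristic stands in for the real LLM call, and the run fields are illustrative:

```python
def vagueness_judge(run_output: str) -> int:
    # Placeholder heuristic standing in for an LLM grader: flag outputs
    # containing hedging words as "vague".
    vague_words = {"maybe", "possibly", "somewhat"}
    return 1 if any(w in run_output.lower() for w in vague_words) else 0

def evaluate(run: dict) -> dict:
    # Score the run's output and record it as feedback on the run.
    score = vagueness_judge(run["output"])
    run.setdefault("feedback", {})["vagueness"] = score
    return run

run = {"id": "r1", "output": "It could possibly work."}
print(evaluate(run)["feedback"])  # {'vagueness': 1}
```

In practice this is the action end of a rule: the rule's filter and sampling rate decide which runs reach the evaluator, and the evaluator writes its score back as feedback on the run.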
