unit

langsmith.testing._internal.unit(*args: Any, **kwargs: Any) → Callable

Trace a pytest test case in LangSmith.

This decorator is used to trace a pytest test to LangSmith. It ensures that the necessary example data is created and associated with the test function. The decorated function will be executed as a test case, and the results will be recorded and reported by LangSmith.

Parameters:
  • id – A unique identifier for the test case. If not provided, an ID will be generated based on the test function’s module and name.

  • output_keys – A list of keys to be considered as the output keys for the test case. These keys will be extracted from the test function’s inputs and stored as the expected outputs.

  • client – An instance of the LangSmith client to be used for communication with the LangSmith service. If not provided, a default client will be used.

  • test_suite_name – The name of the test suite to which the test case belongs. If not provided, the test suite name will be determined based on the environment or the package name.

  • args (Any)

  • kwargs (Any)

Returns:

The decorated test function.

Return type:

Callable
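
For example, options such as client and test_suite_name can be passed through the @pytest.mark.langsmith marker (a minimal sketch; the explicit client and the suite name "my-test-suite" are illustrative assumptions, not requirements):

import pytest
from langsmith import Client

# Illustrative: an explicitly constructed client (reads LANGSMITH_API_KEY
# from the environment) and a custom test suite name.
client = Client()


@pytest.mark.langsmith(client=client, test_suite_name="my-test-suite")
def test_with_explicit_options():
    assert 2 + 2 == 4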

Environment:
  • LANGSMITH_TEST_CACHE: If set, API calls will be cached to disk to save time and costs during testing. Recommended to commit the cache files to your repository for faster CI/CD runs. Requires the 'langsmith[vcr]' package to be installed. See the snippet below.

  • LANGSMITH_TEST_TRACKING: Set this variable to the path of a directory to enable caching of test results. This is useful for re-running tests without re-executing the code. Requires the 'langsmith[vcr]' package.
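
For example, you can turn on the request cache at the top of your test setup (a minimal sketch; the cache directory path here is just an illustration):

import os

# Cache LLM API calls on disk. Requires `pip install -U "langsmith[vcr]"`.
os.environ["LANGSMITH_TEST_CACHE"] = "tests/cassettes"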

Example

For basic usage, simply decorate a test function with @pytest.mark.langsmith. Under the hood, the marker applies the test decorator to the function:

import pytest


# Equivalently can decorate with `test` directly:
# from langsmith import test
# @test
@pytest.mark.langsmith
def test_addition():
    assert 3 + 4 == 7

Any code that is traced (such as code decorated with @traceable or wrapped with a wrap_* function) will be traced within the test case for improved visibility and debugging.

import pytest
from langsmith import traceable


@traceable
def generate_numbers():
    return 3, 4


@pytest.mark.langsmith
def test_nested():
    # Traced code will be included in the test case
    a, b = generate_numbers()
    assert a + b == 7

LLM calls are expensive! Cache requests by setting LANGSMITH_TEST_CACHE=path/to/cache. Check in these files to speed up CI/CD pipelines, so your results only change when your prompt or requested model changes.

Note that this will require that you install langsmith with the vcr extra:

pip install -U "langsmith[vcr]"

Caching is faster if you install libyaml. See https://vcrpy.readthedocs.io/en/latest/installation.html#speed for more details.

# os.environ["LANGSMITH_TEST_CACHE"] = "tests/cassettes"
import openai
import pytest
from langsmith import wrappers

oai_client = wrappers.wrap_openai(openai.Client())


@pytest.mark.langsmith
def test_openai_says_hello():
    # Traced code will be included in the test case
    response = oai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello!"},
        ],
    )
    assert "hello" in response.choices[0].message.content.lower()

LLMs are stochastic. Naive assertions are flaky. You can use langsmith’s expect to score and make approximate assertions on your results.

import pytest
from langsmith import expect


@pytest.mark.langsmith
def test_output_semantically_close():
    response = oai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello!"},
        ],
    )
    # The embedding_distance call logs the embedding distance to LangSmith
    expect.embedding_distance(
        prediction=response.choices[0].message.content,
        reference="Hello!",
        # The following optional assertion logs a
        # pass/fail score to LangSmith
        # and raises an AssertionError if the assertion fails.
    ).to_be_less_than(1.0)
    # Compute the Damerau-Levenshtein distance
    expect.edit_distance(
        prediction=response.choices[0].message.content,
        reference="Hello!",
        # And then log a pass/fail score to LangSmith
    ).to_be_less_than(1.0)

The @test decorator works natively with pytest fixtures. The fixture values will populate the “inputs” of the corresponding example in LangSmith.

import pytest


@pytest.fixture
def some_input():
    return "Some input"


@pytest.mark.langsmith
def test_with_fixture(some_input: str):
    assert "input" in some_input

You can still use pytest.mark.parametrize() as usual to run multiple test cases using the same test function.

import pytest


@pytest.mark.langsmith(output_keys=["expected"])
@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),
        (3, 4, 7),
    ],
)
def test_addition_with_multiple_inputs(a: int, b: int, expected: int):
    assert a + b == expected

By default, each test case will be assigned a consistent, unique identifier based on the function name and module. You can also provide a custom identifier using the id argument:

import pytest
import uuid

example_id = uuid.uuid4()


@pytest.mark.langsmith(id=str(example_id))
def test_multiplication():
    assert 3 * 4 == 12

By default, all test inputs are saved as “inputs” to a dataset. You can specify the output_keys argument to persist those keys within the dataset’s “outputs” fields.

import pytest


@pytest.fixture
def expected_output():
    return "input"


@pytest.mark.langsmith(output_keys=["expected_output"])
def test_with_expected_output(some_input: str, expected_output: str):
    assert expected_output in some_input

To run these tests, use the pytest CLI, or call the test functions directly:

test_output_semantically_close()
test_addition()
test_nested()
test_with_fixture("Some input")
test_with_expected_output("Some input", "Some")
test_multiplication()
test_openai_says_hello()
test_addition_with_multiple_inputs(1, 2, 3)
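
Alternatively, invoke them through the pytest CLI as usual (a typical invocation; substitute the path to your own test directory):

pytest tests/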