Self-hosting LangSmith on Kubernetes
Self-hosting LangSmith is an add-on to the Enterprise Plan designed for our largest, most security-conscious customers. See our pricing page for more detail, and contact us at sales@langchain.dev if you want to get a license key to trial LangSmith in your environment.
This guide will walk you through the process of deploying LangSmith to a Kubernetes cluster. We will use Helm to install LangSmith and its dependencies.
We've successfully tested LangSmith on the following Kubernetes distributions:
- Google Kubernetes Engine (GKE)
- Amazon Elastic Kubernetes Service (EKS)
- Azure Kubernetes Service (AKS)
- OpenShift
- Minikube and Kind (for development purposes)
Prerequisites
Ensure you have the following tools/items ready. Some items are marked optional.
- A working Kubernetes cluster that you can access via kubectl. Your cluster should have the following minimum requirements:
  - Recommended: At least 4 vCPUs, 16GB Memory available
    - You may need to tune resource requests/limits for all of our different services based off of organization size/usage
  - Valid Dynamic PV provisioner or PVs available on your cluster. You can verify this by running:
    kubectl get storageclass
    The output should show at least one storage class with a provisioner that supports dynamic provisioning. For example:
    NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
    gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  161d
- Helm
  - brew install helm
- LangSmith License Key
  - You can get this from your LangChain representative. Contact us at sales@langchain.dev for more information.
- SSL (optional)
  - The SSL certificate should be attachable to a load balancer that will be provisioned by your cloud provider. It will be used for the frontend service.
- OpenAI API Key (optional)
  - Used for the natural language search feature (beta). You can also specify an OpenAI key in the browser for the playground feature.
- OAuth Configuration (optional)
  - You can configure OAuth using the values.yaml file. You will need to provide a client_id and client_issuer_url for your OAuth provider.
  - Note: we rely on the OIDC Authorization Code with PKCE flow. We currently support almost any OIDC-compliant provider; however, Google does not support this flow.
  - Without OAuth, you will not be able to create users or organizations.
- External Postgres (optional)
  - You can configure external Postgres using the values.yaml file. You will need to provide connection parameters for your Postgres instance.
  - If using a schema other than public, ensure that you do not have any other schemas with the pgcrypto extension enabled, or you must include that in your search path.
  - Note: We only officially support Postgres versions >= 14.
- External Redis (optional)
  - You can configure external Redis using the values.yaml file. You will need to provide a connection URL for your Redis instance.
  - Currently, we do not support using Redis with TLS. We will be supporting this shortly.
  - We only officially support Redis versions >= 6.
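The tooling and storage prerequisites above can be checked in one pass before going further. The following is a minimal sketch, not an official check: the function name check_prerequisites is made up for this illustration, and it only confirms that kubectl and helm are on the PATH and that the cluster answers a storage class query.

```shell
# Hypothetical helper: verify the command-line tools and list storage classes.
check_prerequisites() {
  for tool in kubectl helm; do
    # command -v succeeds only if the tool can be resolved by the shell.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done

  # Fails if the cluster is unreachable; otherwise prints the storage classes
  # so you can confirm that a dynamic provisioner is available.
  kubectl get storageclass
}
```

Call check_prerequisites from a shell that is already pointed at the target cluster.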
Configure your Helm Charts:
- Create a new file called langsmith_config.yaml. This should have a similar structure to the values.yaml file in the LangSmith Helm Chart repository. Only include the values you want to override to avoid having to update the file every time the chart is updated.
- Override any values in the file. Refer to the documentation for the LangSmith Helm Chart to see all configurable values. Some values we recommend setting:
  - Resources
  - SSL (if on EKS or some other cloud provider)
    - Add an annotation to the frontend.service object to tell your cloud provider to provision a load balancer with said certificate attached
  - OpenAI API Key
  - Images
  - OAuth
An example bare minimum config file, langsmith_config.yaml:
config:
  langsmithLicenseKey: ""
You can also see some example configurations in the examples directory.
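If you would rather not paste the license key into the file by hand, the minimal file can be generated from the shell. This is only a sketch: the function name write_langsmith_config and the LANGSMITH_LICENSE_KEY environment variable are hypothetical names chosen for this illustration, and the only chart value used is config.langsmithLicenseKey from the example above.

```shell
# Hypothetical helper: write a minimal override file.
# Usage: write_langsmith_config [output-file]
write_langsmith_config() {
  out_file="${1:-langsmith_config.yaml}"

  # LANGSMITH_LICENSE_KEY is an assumed variable name, not something the chart reads.
  cat > "$out_file" <<EOF
config:
  langsmithLicenseKey: "${LANGSMITH_LICENSE_KEY:-}"
EOF
}
```

Running write_langsmith_config with no arguments writes langsmith_config.yaml in the current directory. Keep the generated file out of version control, since it contains the key.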
Deploying to Kubernetes:
- Verify that you can connect to your Kubernetes cluster (note: we highly suggest installing into an empty namespace)
  - Run
    kubectl get pods
    Output should look something like:
    No resources found in default namespace.
- Ensure you have the LangChain Helm repo added (skip this step if you are using local charts):
  helm repo add langchain https://langchain-ai.github.io/helm/
  Output should look something like:
  "langchain" has been added to your repositories
- Run
  helm install langsmith langchain/langsmith --values langsmith_config.yaml --namespace <your-namespace> --version <version>
  Output should look something like:
  NAME: langsmith
  LAST DEPLOYED: Fri Sep 17 21:08:47 2021
  NAMESPACE: langsmith
  STATUS: deployed
  REVISION: 1
  TEST SUITE: None
- Run
  kubectl get pods
  Output should now look something like:
  langsmith-backend-6ff46c99c4-wz22d       1/1   Running   0   3h2m
  langsmith-frontend-6bbb94c5df-8xrlr      1/1   Running   0   3h2m
  langsmith-hub-backend-5cc68c888c-vppjj   1/1   Running   0   3h2m
  langsmith-playground-6d95fd8dc6-x2d9b    1/1   Running   0   3h2m
  langsmith-postgres-0                     1/1   Running   0   9h
  langsmith-queue-5898b9d566-tv6q8         1/1   Running   0   3h2m
  langsmith-redis-0                        1/1   Running   0   9h
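The install steps above can be wrapped in a small shell function. This is a sketch under stated assumptions rather than an official install script: the function name deploy_langsmith is made up here, and the --create-namespace and --wait flags are standard Helm options added for convenience, not something this guide requires.

```shell
# Hypothetical helper: install LangSmith into a dedicated namespace.
# Usage: deploy_langsmith <namespace> <chart-version>
deploy_langsmith() {
  ls_namespace="$1"
  ls_version="$2"

  # Refresh the local chart index so the requested version can be resolved.
  helm repo update

  # --create-namespace creates the namespace if it does not exist yet;
  # --wait blocks until the release's resources are ready or Helm times out.
  helm install langsmith langchain/langsmith \
    --values langsmith_config.yaml \
    --namespace "$ls_namespace" \
    --create-namespace \
    --version "$ls_version" \
    --wait

  # List the pods in the namespace the chart was installed into.
  kubectl get pods --namespace "$ls_namespace"
}
```

To see which chart versions are available before choosing one, helm search repo langchain/langsmith --versions lists them.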
Validate your deployment:
- Run
  kubectl get services
  Output should look something like:
  NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)        AGE
  langsmith-backend       ClusterIP      172.20.140.77    <none>          1984/TCP       35h
  langsmith-frontend      LoadBalancer   172.20.253.251   <external ip>   80:31591/TCP   35h
  langsmith-hub-backend   ClusterIP      172.20.112.234   <none>          1985/TCP       35h
  langsmith-playground    ClusterIP      172.20.153.194   <none>          3001/TCP       9h
  langsmith-postgres      ClusterIP      172.20.244.82    <none>          5432/TCP       35h
  langsmith-redis         ClusterIP      172.20.81.217    <none>          6379/TCP       35h
- Curl the external ip of the langsmith-frontend service:
  curl <external ip>/api/tenants
  [{"id":"00000000-0000-0000-0000-000000000000","has_waitlist_access":true,"created_at":"2023-09-13T18:25:10.488407","display_name":"Personal","config":{"is_personal":true,"max_identities":1},"tenant_handle":"default"}]
- Visit the external ip for the langsmith-frontend service in your browser
  The LangSmith UI should be visible/operational
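The service lookup and curl check above can be combined so the external address does not have to be copied by hand. This is a minimal sketch: the function name check_langsmith is hypothetical, and it assumes the load balancer reports either an IP or a hostname in the service status, which depends on your cloud provider.

```shell
# Hypothetical helper: find the frontend's external address and query /api/tenants.
# Usage: check_langsmith <namespace>
check_langsmith() {
  ls_namespace="$1"

  # Load balancers report either an IP or a hostname. With kubectl's default
  # handling of missing template keys, the absent field renders as empty,
  # so concatenating both yields whichever one is set.
  address="$(kubectl get service langsmith-frontend --namespace "$ls_namespace" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')"

  if [ -z "$address" ]; then
    echo "langsmith-frontend has no external address yet" >&2
    return 1
  fi

  # --fail makes curl exit non-zero on an HTTP error status.
  curl --fail --silent --show-error "http://${address}/api/tenants"
}
```

A JSON list of tenants like the one shown above indicates that the frontend can reach the backend.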
Using LangSmith
Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the Self-Hosted Usage Guide.
Frequently Asked Questions:
How can we upgrade our application?
- We plan to release new minor versions of the LangSmith application every 6 weeks. Each release will include release notes, and all changes should be backwards compatible. To upgrade, follow the upgrade instructions in the Helm README and run:
  helm upgrade langsmith langchain/langsmith --values <values file> --namespace <your-namespace> --version <version>
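An upgrade along these lines might be scripted as follows. Treat it as a sketch and follow the Helm README for anything version-specific: the function name upgrade_langsmith is made up for this illustration, and the chart reference langchain/langsmith is taken from the install command earlier in this guide.

```shell
# Hypothetical helper: upgrade an existing release to a given chart version.
# Usage: upgrade_langsmith <namespace> <chart-version> <values-file>
upgrade_langsmith() {
  ls_namespace="$1"
  ls_version="$2"
  ls_values="$3"

  # Pull the latest chart index so the new version is known locally.
  helm repo update

  # helm upgrade takes the release name followed by the chart reference.
  helm upgrade langsmith langchain/langsmith \
    --values "$ls_values" \
    --namespace "$ls_namespace" \
    --version "$ls_version"
}
```

If an upgrade misbehaves, helm history langsmith --namespace <your-namespace> shows the recorded revisions of the release.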
How can we back up our application?
- Currently, we rely on PVCs/PVs to power storage for our application. We strongly encourage setting up Persistent Volume backups or moving to a managed service for Postgres to support disaster recovery.
How does load balancing/ingress work?
- Currently, our application spins up one load balancer using a k8s service of type LoadBalancer for our frontend. If you do not want to set up a load balancer, you can simply port-forward the frontend service and use the forwarded local address in place of the external ip. We also have an option for the chart to provision an ingress resource for the application.
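A port-forward to the frontend could look like the sketch below. The function name forward_langsmith_frontend and the local port 8080 are arbitrary choices for this illustration; the service name langsmith-frontend and its port 80 come from the kubectl get services output shown earlier.

```shell
# Hypothetical helper: forward local port 8080 to port 80 of the frontend service.
# Usage: forward_langsmith_frontend <namespace>
forward_langsmith_frontend() {
  # Runs in the foreground; stop it with Ctrl-C.
  kubectl port-forward service/langsmith-frontend 8080:80 --namespace "$1"
}
```

While the forward is running, the UI should be reachable at http://localhost:8080.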
How can we authenticate to the application?
- Currently, our self-hosted solution supports OAuth for authentication. We do offer a no-auth option, but we highly recommend setting up OAuth before moving into production.
How can I use External Postgres or Redis?
- You can configure external Postgres or Redis using the external sections in the values.yaml file. You will need to provide the connection URL/params for the database/Redis instance. See the prerequisites above for more information.
What networking configuration is needed for the application?
Our deployment only needs egress for a few things:
- Fetching images (If mirroring your images, this may not be needed)
- Talking to any LLMs
- Talking to any external services you may have configured
- Fetching OAuth information
Your VPC can set up rules to limit any other access. Note: We require the X-Tenant-Id header to be passed through to the backend service. It is used to determine which tenant the request is for.
What resources should we allocate to the application?
- We recommend at least 4 vCPUs and 16GB of memory for our application.
- We have some default resources set in our values.yaml file. You can override these values to tune resource usage for your organization.
- If the metrics server is enabled in your cluster, we also recommend enabling autoscaling on all deployments.