LangChain is a framework for developing applications powered by large language models (LLMs). It provides a comprehensive suite of tools and integrations that streamline the entire lifecycle of LLM applications, from development to deployment and monitoring.
Modular Components: Offers a range of building blocks, including chains, agents, prompt templates, and memory modules, that can be composed to create complex LLM applications (see the sketch after this list)
Extensive Integrations: Supports integrations with various LLM providers, embedding models, vector stores, and external tools, facilitating seamless connectivity within the AI ecosystem
Production-Ready Tools: Includes LangGraph for building stateful agents and LangSmith for monitoring and evaluating applications, ensuring robust deployment and maintenance of LLM-powered solutions
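As a minimal sketch of how these building blocks compose, the snippet below pipes a prompt template into a chat model and an output parser using the LangChain Expression Language. The model name and API key shown here are placeholders for illustration, not TrueFoundry-specific values.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# Building blocks: a prompt template, a chat model, and an output parser.
# The model name and key below are placeholders for illustration only.
prompt = ChatPromptTemplate.from_template("Summarize this text in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o", api_key="...")
parser = StrOutputParser()

# Compose the components into a chain with the LangChain Expression Language (LCEL).
chain = prompt | llm | parser
print(chain.invoke({"text": "LangChain provides composable building blocks for LLM apps."}))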
Connect to TrueFoundry by pointing LangChain's ChatOpenAI client at your TrueFoundry gateway:
from langchain_openai import ChatOpenAI

TRUEFOUNDRY_PAT = "..."       # Your TrueFoundry Personal Access Token
TRUEFOUNDRY_BASE_URL = "..."  # Your TrueFoundry unified endpoint

llm = ChatOpenAI(
    api_key=TRUEFOUNDRY_PAT,
    base_url=TRUEFOUNDRY_BASE_URL,
    model="openai-main/gpt-4o",  # similarly, you can call any model from any provider, e.g. Anthropic or Gemini
)

llm.invoke("What is the meaning of life, universe and everything?")
The request is routed through your TrueFoundry gateway to the specified model provider. TrueFoundry automatically handles authentication, load balancing, and logging.
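Because routing is handled by the gateway, targeting a different provider is just a matter of changing the model string on the same client. The sketch below assumes an Anthropic model is registered on your gateway; the identifier "anthropic-main/claude-3-5-sonnet" is illustrative, so substitute the exact name from your gateway's model catalog.

from langchain_openai import ChatOpenAI

TRUEFOUNDRY_PAT = "..."       # Your TrueFoundry Personal Access Token
TRUEFOUNDRY_BASE_URL = "..."  # Your TrueFoundry unified endpoint

# The same ChatOpenAI client can reach a different provider behind the gateway
# by changing the model identifier. "anthropic-main/claude-3-5-sonnet" is an
# illustrative name; use the identifier your gateway actually exposes.
claude = ChatOpenAI(
    api_key=TRUEFOUNDRY_PAT,
    base_url=TRUEFOUNDRY_BASE_URL,
    model="anthropic-main/claude-3-5-sonnet",
)

print(claude.invoke("Give me one sentence on load balancing.").content)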
Monitor your LangChain applications through TrueFoundry’s metrics tab:
With TrueFoundry’s AI gateway, you can monitor and analyze:
Performance Metrics: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) with P99, P90, and P50 percentiles
Cost and Token Usage: Gain visibility into your application’s costs with detailed breakdowns of input/output tokens and the associated expenses for each model
Usage Patterns: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
Rate Limiting and Load Balancing: Set up rate limiting, load balancing, and fallbacks for your models