Langfuse is an open source LLM engineering platform that helps teams trace LLM calls, monitor performance, and debug issues in their AI applications. TrueFoundry’s AI Gateway integrates seamlessly with Langfuse, providing comprehensive observability for all your LLM interactions.

What is TrueFoundry AI Gateway?

TrueFoundry AI Gateway is a unified interface that provides access to multiple AI models with advanced features for control, visibility, security, and cost optimization in your Generative AI applications. It offers seamless integration with popular observability tools like Langfuse.

What is Langfuse?

Langfuse is an open source LLM engineering platform that provides:
  • LLM Tracing: Detailed execution traces for debugging and monitoring
  • Performance Analytics: Token usage, latency metrics, and cost tracking
  • Prompt Management: Version control and optimization for prompts
  • Evaluation Tools: LLM-as-a-judge evaluations and custom metrics

Prerequisites

Before integrating Langfuse with TrueFoundry, ensure you have:
  1. TrueFoundry Account: Create a TrueFoundry account with at least one model provider configured, and generate a Personal Access Token by following the instructions in Generating Tokens
  2. Langfuse Account: Sign up for a free Langfuse Cloud account or self-host Langfuse

Integration Guide

Step 1: Install Dependencies

Install the required packages for TrueFoundry and Langfuse integration:
pip install openai langfuse

Step 2: Set Up Environment Variables

Configure your Langfuse API keys. Get these keys from your Langfuse project settings:
import os

# Langfuse Configuration
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..." 
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"  # 🇪🇺 EU region
# os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com"  # 🇺🇸 US region

# TrueFoundry Configuration
os.environ["TRUEFOUNDRY_API_KEY"] = "your-truefoundry-token"
os.environ["TRUEFOUNDRY_BASE_URL"] = "https://your-control-plane.truefoundry.cloud/api/llm"
Verify your Langfuse connection:
from langfuse import get_client

# Test Langfuse authentication (auth_check() returns True when keys and host are valid)
assert get_client().auth_check(), "Langfuse authentication failed - check your keys and host"

Step 3: Configure Langfuse OpenAI Drop-in Replacement

First, get the base URL and model name from your TrueFoundry AI Gateway:
  1. Navigate to AI Gateway Playground: Go to your TrueFoundry AI Gateway playground
  2. Access Unified Code Snippet: Open the unified code snippet for your chosen model
  3. Copy Base URL: Copy the base URL shown in the snippet
  4. Copy Model Name: Copy the model name from the same snippet (use it exactly as written)
Figure: TrueFoundry playground unified code snippet, showing the base URL and model name

Use Langfuse’s OpenAI-compatible client to automatically trace all requests sent through TrueFoundry’s AI Gateway:
from langfuse.openai import OpenAI
import os

# Initialize OpenAI client with TrueFoundry Gateway
client = OpenAI(
    api_key=os.environ["TRUEFOUNDRY_API_KEY"],
    base_url=os.environ["TRUEFOUNDRY_BASE_URL"]  # Base URL from unified code snippet
)

Step 4: Run an Example

Execute a sample request to test the integration:
# Make a request through TrueFoundry Gateway with Langfuse tracing
response = client.chat.completions.create(
    model="openai-main/gpt-4o",  # Paste the model ID you copied from TrueFoundry Gateway
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant specialized in explaining AI concepts."},
        {"role": "user", "content": "Why does an AI gateway help enterprises?"},
    ],
    max_tokens=500,
    temperature=0.7
)

print(response.choices[0].message.content)

# Ensure all traces are sent to Langfuse before the script exits
from langfuse import get_client

langfuse = get_client()
langfuse.flush()
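
The same response object carries the token accounting that Langfuse records on the trace, so you can sanity-check it locally (field names are from the standard OpenAI Python SDK):
# Token usage reported through the gateway (Langfuse stores the same figures)
usage = response.usage
print(f"prompt={usage.prompt_tokens}, completion={usage.completion_tokens}, total={usage.total_tokens}")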

Step 5: View Traces in Langfuse

After running your code, log in to your Langfuse dashboard to view detailed traces including:
  • Request Parameters: Model, temperature, max tokens, and other configuration
  • Response Content: Full response text and metadata
  • Performance Metrics: Token usage, latency, and cost information
  • Gateway Information: TrueFoundry-specific routing and processing details
Figure: Langfuse trace dashboard showing LLM request details and performance metrics
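
Besides the dashboard, traces can also be read back programmatically. This is a minimal sketch assuming a recent Langfuse Python SDK, where the client exposes the REST API under an api attribute:
from langfuse import get_client

langfuse = get_client()

# Assumption: recent Langfuse Python SDKs expose the REST API as `langfuse.api`
traces = langfuse.api.trace.list(limit=5)
for trace in traces.data:
    print(trace.id, trace.name)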

Advanced Integration with Langfuse Python SDK

Enhance your observability by combining the automatic tracing with additional Langfuse features.

Using the @observe Decorator

The @observe() decorator automatically wraps your function in a trace; inside it, you can attach custom attributes via update_current_trace():
from langfuse import observe, get_client

langfuse = get_client()

@observe()
def analyze_customer_query(query, customer_id):
    """Analyze customer query using TrueFoundry Gateway with full observability"""
    
    response = client.chat.completions.create(
        model="openai-main/gpt-4o",
        messages=[
            {"role": "system", "content": "You are a customer service AI assistant."},
            {"role": "user", "content": query},
        ],
        temperature=0.3
    )
    
    result = response.choices[0].message.content
    
    # Add custom metadata to the trace
    langfuse.update_current_trace(
        input={"query": query, "customer_id": customer_id},
        output={"response": result},
        user_id=customer_id,
        session_id=f"session_{customer_id}",
        tags=["customer-service", "truefoundry-gateway"],
        metadata={
            "model_used": "openai-main/gpt-4o",
            "gateway": "truefoundry",
            "query_type": "customer_support"
        },
        version="1.0.0"
    )
    
    return result

# Usage
result = analyze_customer_query("How do I reset my password?", "customer_123")

Using Context Manager

For more granular control, use context managers to wrap specific code sections:
from langfuse import get_client

langfuse = get_client()

def process_batch_requests(queries):
    """Process multiple queries with detailed tracing"""
    
    with langfuse.start_as_current_span(name="batch-processing") as span:
        results = []
        
        for query in queries:
            # Process each query through TrueFoundry Gateway
            response = client.chat.completions.create(
                model="openai-main/gpt-4o",  # Use the same model ID copied from your gateway
                messages=[{"role": "user", "content": query}],
                temperature=0.5
            )
            
            results.append(response.choices[0].message.content)
        
        # Update the span with batch processing metadata
        span.update_trace(
            input={"queries": queries, "batch_size": len(queries)},
            output={"results": results},
            tags=["batch-processing", "truefoundry"],
            metadata={
                "total_queries": len(queries),
                "gateway": "truefoundry",
                "processing_mode": "batch"
            }
        )
        
        return results

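# Example usage with sample queries (illustrative only)
batch_results = process_batch_requests([
    "What is an AI gateway?",
    "How does LLM tracing work?",
])
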
# Ensure traces are sent
langfuse.flush()

Troubleshooting

Common Issues

  • Authentication Errors: Verify your TrueFoundry API key and Langfuse credentials
  • Missing Traces: Ensure langfuse.flush() is called in short-lived applications
  • Model Not Found: Check that the model is available in your TrueFoundry Gateway
  • Network Issues: Verify your TrueFoundry base URL is correctly formatted
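
When diagnosing these, it helps to catch the openai SDK's typed exceptions around a gateway call so the failure mode is explicit. A minimal sketch, reusing the client from Step 3:
import openai

try:
    response = client.chat.completions.create(
        model="openai-main/gpt-4o",
        messages=[{"role": "user", "content": "ping"}],
    )
except openai.AuthenticationError:
    print("Authentication failed - check TRUEFOUNDRY_API_KEY and your Langfuse keys")
except openai.NotFoundError:
    print("Model or route not found - check the model ID from the unified code snippet")
except openai.APIConnectionError:
    print("Could not reach the gateway - check TRUEFOUNDRY_BASE_URL")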

Debug Mode

Enable debug logging for troubleshooting:
import logging
logging.basicConfig(level=logging.DEBUG)
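
Depending on your Langfuse SDK version, you may also be able to enable the SDK's own verbose logging via an environment variable; treat this as an assumption and confirm against your SDK's documentation:
import os

# Assumption: the Langfuse Python SDK honors LANGFUSE_DEBUG for verbose SDK logs
os.environ["LANGFUSE_DEBUG"] = "True"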

Next Steps

Your TrueFoundry AI Gateway is now fully integrated with Langfuse, giving you comprehensive observability for every LLM interaction. From here, explore Langfuse's prompt management, evaluation tools, and analytics to keep optimizing your application.