OpenTelemetry (OTEL) Support

AI Gateway is OpenTelemetry (OTEL) compliant, making it easy to integrate with modern observability tools and platforms. Both Tracing and Metrics are supported for deep observability and monitoring.


Tracing

OpenTelemetry tracing allows you to capture detailed traces of requests as they flow through the AI Gateway. This enables debugging, performance analysis, and end-to-end visibility.

How to Enable Tracing

Set the following environment variables:

  • ENABLE_OTEL_TRACING: Set to "true" to enable OTEL tracing.
  • OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: The OTEL traces exporter endpoint (e.g., your collector or backend).
  • OTEL_EXPORTER_OTLP_TRACES_HEADERS: Any required headers for authentication/configuration.
  • OTEL_SERVICE_NAME: The name of your service as recognized by OTEL.

Example for TrueFoundry Tracing Project

ENABLE_OTEL_TRACING="true"
OTEL_SERVICE_NAME="<custom_service_name>"
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://<tfy-control-plane-base-url>/api/otel/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_HEADERS="Authorization=Bearer <TOKEN>,TFY-Tracing-Project=tracing-project:truefoundry/<PROJECT_NAME>/<custom_service_name>"

Example Trace Overview

[Screenshot: AI Gateway - OpenTelemetry Tracing]

Each row on the left represents a request to the endpoint, with the selected trace showing a detailed breakdown of the request and its spans.

Highlighted Span – chatCompletions (LLM):
The highlighted span is of type genai, capturing the full lifecycle of a large language model (LLM) inference request.

LLM Request Data:

  • Model: openai-main/gpt-4o
  • Max tokens: 200
  • Top-p: 1
  • Temperature: 0.1

Prompt and Completion:
The system prompt, user question, and assistant’s response are all visible, providing full transparency into the LLM interaction.

Span Metadata:
Includes span name, service name, trace and span IDs, and OTEL scope.
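In OTEL terms, the request data shown above typically surfaces as attributes on the chatCompletions span. A sketch of how those attributes might look, using the OpenTelemetry GenAI semantic convention names (the exact keys emitted by the gateway may differ):

```text
gen_ai.request.model       = "openai-main/gpt-4o"
gen_ai.request.max_tokens  = 200
gen_ai.request.top_p       = 1
gen_ai.request.temperature = 0.1
```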


AI Gateway Spans

The following sections describe the various spans available in the AI Gateway and their attributes.


Metrics

OpenTelemetry metrics allow you to export gateway metrics to any OTLP-compatible backend such as Prometheus, Grafana, Datadog, and more.

How to Enable OTEL Metrics

Set the following environment variables for the tfy-llm-gateway service:

  • ENABLE_OTEL_METRICS: Set to "true" to enable OTEL metrics.
  • OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: The OTEL metrics exporter endpoint (e.g., your Prometheus OTLP endpoint or collector).
  • OTEL_EXPORTER_OTLP_METRICS_HEADERS: Any required headers for authentication/configuration.
  • LLM_GATEWAY_METADATA_LOGGING_KEYS: (Optional) Metadata keys to extract from the x-tfy-metadata header for advanced filtering.

Example:

ENABLE_OTEL_METRICS: 'true'
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: https://<prometheus-host>/api/v1/otlp/v1/metrics
OTEL_EXPORTER_OTLP_METRICS_HEADERS: <OTEL_EXPORTER_OTLP_HEADERS>
LLM_GATEWAY_METADATA_LOGGING_KEYS: '["customer"]' # Optional, for advanced filtering on x-tfy-metadata header

Add Dashboard Variable for Metadata

To filter metrics by customer, add a variable to your Grafana dashboard:

{
  "definition": "label_values(llm_gateway_metadata_customer)",
  "label": "llm_gateway_metadata_customer",
  "multi": true,
  "name": "llm_gateway_metadata_customer",
  "query": "label_values(llm_gateway_metadata_customer)",
  "refresh": 2,
  "type": "query"
}

Apply Metadata Filter in All Graph Queries

Update all Prometheus queries to include the customer metadata label:

Replace:

model_name=~"$model_name", username=~"$username", tenant_name=~"$tenant_name"

With:

model_name=~"$model_name", username=~"$username", tenant_name=~"$tenant_name", llm_gateway_metadata_customer=~"$llm_gateway_metadata_customer"
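For instance, a per-model request-rate panel with the filter applied might look like the query below. Note that llm_gateway_requests_total is an illustrative metric name; substitute whichever gateway metric your panel actually graphs:

```promql
sum by (model_name) (
  rate(llm_gateway_requests_total{
    model_name=~"$model_name",
    username=~"$username",
    tenant_name=~"$tenant_name",
    llm_gateway_metadata_customer=~"$llm_gateway_metadata_customer"
  }[5m])
)
```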

Compatible Backends

  • OpenTelemetry Collector
  • Jaeger
  • Datadog
  • New Relic
  • Any OTLP-compatible backend
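If you route telemetry through the OpenTelemetry Collector, a minimal collector configuration that receives OTLP from the gateway and exposes the metrics for Prometheus scraping could look like the following sketch (adjust endpoints and pipelines to your deployment):

```yaml
receivers:
  otlp:
    protocols:
      http:    # OTLP/HTTP, listens on :4318 by default
      grpc:    # OTLP/gRPC, listens on :4317 by default

exporters:
  prometheus:
    endpoint: "0.0.0.0:9464"   # scrape target for Prometheus

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```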

Example: Enabling Both OTEL Tracing and Metrics

Set the following environment variables in your deployment to enable both tracing and metrics via OpenTelemetry:

# Enable OTEL Tracing
ENABLE_OTEL_TRACING="true"
OTEL_SERVICE_NAME="<custom_service_name>"
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="<your-otel-collector-or-backend-url>/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_HEADERS="<any-required-headers>"

# Enable OTEL Metrics
ENABLE_OTEL_METRICS="true"
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT="<your-otel-collector-or-backend-url>/v1/metrics"
OTEL_EXPORTER_OTLP_METRICS_HEADERS="<any-required-headers>"

# Enable custom metadata logging (for advanced filtering in dashboards)
LLM_GATEWAY_METADATA_LOGGING_KEYS='["customer"]'
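With LLM_GATEWAY_METADATA_LOGGING_KEYS set as above, requests that carry a matching key in the x-tfy-metadata header are labeled in the exported metrics. A minimal sketch of building such a request on the client side (the gateway URL, token, and request path are placeholders, not the gateway's confirmed API shape; the header value is a JSON object whose "customer" key matches the configured logging key):

```python
import json

# Metadata to attach; the "customer" key matches
# LLM_GATEWAY_METADATA_LOGGING_KEYS='["customer"]' above.
metadata = {"customer": "acme-corp"}

headers = {
    "Authorization": "Bearer <TOKEN>",       # placeholder token
    "Content-Type": "application/json",
    "x-tfy-metadata": json.dumps(metadata),  # header value is serialized JSON
}

body = {
    "model": "openai-main/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Send with your HTTP client of choice, e.g.:
# requests.post("<gateway-base-url>/<chat-completions-path>",
#               headers=headers, json=body)
print(headers["x-tfy-metadata"])
```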