The LLM Gateway delivers detailed request logs through tracing. Retrieve these logs using the Query Spans API.
To better understand tracing concepts like traces, spans, and how they work together, see our tracing overview.

Quickstart

Set Up the TrueFoundry SDK

To start querying request logs, install and configure the TrueFoundry SDK and CLI.
Follow the CLI Setup Guide for installation instructions and authentication steps.
Once setup is complete, you can use the SDK to query tracing data programmatically.

Fetch using TrueFoundry SDK

Each request to the LLM Gateway generates a trace: a timeline of everything that happened, from the incoming request to guardrails to the model call and any external APIs. The example below fetches the latest LLM Gateway request logs.
You can get the tracing_project_fqn from the Fetch via API button on the Request Logs page.
from truefoundry import client

# Fetch LLM Gateway request logs
spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-01-21T00:00:00.000Z",
    application_names=["tfy-llm-gateway"],
)

for span in spans:
    print(span.span_name, span.span_attributes.get('tfy.span_type'))
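
Once spans come back, a quick tally by tfy.span_type shows what kinds of spans the query matched. The sketch below is only an illustration: count_by_span_type is a hypothetical helper, and the SimpleNamespace objects are stand-ins for SDK spans, assuming nothing beyond the span_name and span_attributes fields used above.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_span_type(spans):
    """Tally spans by their tfy.span_type attribute (hypothetical helper)."""
    return Counter(
        span.span_attributes.get("tfy.span_type", "unknown") for span in spans
    )


# Stand-in objects shaped like the spans printed above (illustrative data only).
sample_spans = [
    SimpleNamespace(span_name="ChatCompletion", span_attributes={"tfy.span_type": "ChatCompletion"}),
    SimpleNamespace(span_name="Model", span_attributes={"tfy.span_type": "Model"}),
    SimpleNamespace(span_name="Guardrail", span_attributes={"tfy.span_type": "Guardrail"}),
    SimpleNamespace(span_name="POST", span_attributes={}),  # no span type set
]

for span_type, count in count_by_span_type(sample_spans).most_common():
    print(f"{span_type}: {count}")
```

In practice the same function could be applied to the spans iterator returned by query_spans, provided each item exposes span_attributes as a dictionary.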

Deep Dive: Inspect a Single Trace

Now that you’ve run a basic query, inspect one request end-to-end. The example below fetches all spans for a specific trace_id, so you can see the full hierarchy (root request, guardrail processing, model span, and outbound HTTP calls).
from truefoundry import client
from truefoundry_sdk import SortDirection

# Fetch all spans for a specific trace ID (this trace includes guardrail processing)
trace_id = "019a047ee43577009b6bb5b6ab9477d2"
spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-21T00:00:00.000Z",
    end_time="2025-10-21T23:59:59.999Z",
    trace_ids=[trace_id],
    application_names=["tfy-llm-gateway"],
    limit=200,
    sort_direction=SortDirection.DESC
)

for span in spans:
    print(span)
The trace returned by this query contains 5 spans that form a hierarchy, showing a request flow with PII redaction guardrail processing.

[Figure: Chat completion request with guardrail processing span hierarchy]

Span Hierarchy Breakdown

Each of the 5 spans in this trace represents a different phase of request processing, from the initial chat completion request through guardrail processing to the final model inference and network calls.
The ChatCompletion span with ID bddb6503c0eeb940 serves as the root span with no parent span ID (empty parent_span_id). This span represents the complete chat completion request lifecycle from the client’s perspective, capturing the total time from when the request enters the gateway until the response is sent back to the client. With a duration of 7.09 seconds, it provides the overall performance measurement for the entire request flow. The span includes the tfy.triggered_guardrail_fqns attribute showing which guardrails were triggered during processing.
The Guardrail span with ID d46e8d5202edc22c has a parent span “ChatCompletion Span (Root Span)” with ID bddb6503c0eeb940, representing the PII redaction guardrail processing. This span shows the guardrail configuration used.
The Guardrail Network Call span with ID fb07005b3c28a98b has a parent span “Guardrail Span” with ID d46e8d5202edc22c, representing the actual HTTP communication with the external guardrail service (AWS Bedrock). This span captures the network latency and external guardrail service processing time. With a duration of 0.48 seconds, it shows the time spent on the actual guardrail API call.
The Model span with ID de09be32ba8e0c37 has a parent span “ChatCompletion Span (Root Span)” with ID bddb6503c0eeb940, making it a sibling to the Guardrail span. This span contains all the detailed model metrics and represents the LLM model inference processing within the gateway. Notice how the input content has been redacted from “I am sateesh. Hi” to “I am {NAME}. Hi”, which shows that PII redaction was applied before the model call.
The Model Network Call span with ID 95794dcfbaad832a has a parent span “Model Span” with ID de09be32ba8e0c37, representing the actual HTTP communication with the external provider (OpenAI). This span captures pure network latency and external provider processing time. With a duration of 6.60 seconds, it shows the time spent on the actual API call to the external service.
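
The parent/child relationships described above can be reconstructed from the span IDs alone. The following is a minimal sketch using plain dictionaries that carry the IDs quoted in this section; the key names span_id, parent_span_id, and name, and the render_tree helper, are illustrative assumptions rather than the SDK's data model.

```python
from collections import defaultdict

# Span IDs and parent links as described above; names are shortened labels.
spans = [
    {"span_id": "bddb6503c0eeb940", "parent_span_id": "", "name": "ChatCompletion"},
    {"span_id": "d46e8d5202edc22c", "parent_span_id": "bddb6503c0eeb940", "name": "Guardrail"},
    {"span_id": "fb07005b3c28a98b", "parent_span_id": "d46e8d5202edc22c", "name": "Guardrail Network Call"},
    {"span_id": "de09be32ba8e0c37", "parent_span_id": "bddb6503c0eeb940", "name": "Model"},
    {"span_id": "95794dcfbaad832a", "parent_span_id": "de09be32ba8e0c37", "name": "Model Network Call"},
]


def render_tree(spans):
    """Return indented lines, one per span, with children under their parent."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_span_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children[parent_id]:
            lines.append("  " * depth + f'{span["name"]} ({span["span_id"]})')
            walk(span["span_id"], depth + 1)

    walk("", 0)  # root spans have an empty parent_span_id
    return lines


print("\n".join(render_tree(spans)))
```

The durations quoted above also allow a rough sanity check: if the two network calls ran one after the other, 7.09 - 0.48 - 6.60 leaves about 0.01 seconds of the root span outside the external calls.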

Filter Request Logs

While fetching all Gateway request logs is useful for general monitoring, you’ll often want to filter logs based on specific criteria such as user identity, model names, etc. You can achieve this using the filters parameter in the query_spans method. The API supports the following common filter types:
  • Span fields filtering: Filter logs by span fields such as spanName, traceId, spanId, etc. See API Reference to understand the supported options for spanFieldName and operator.
  • Span attributes filtering: Filter logs by span attributes, e.g., using tfy.model.name for model name. See the Attributes section to understand the supported options for spanAttributeKey.
  • Gateway request metadata filtering: Filter logs based on Custom Metadata keys and values that you passed to Gateway requests.
The following example demonstrates how to retrieve request logs for OpenAI models submitted by a specific user, where custom metadata matches certain values:
from truefoundry import client

# Fetch LLM Gateway request logs with filters
spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-01-21T00:00:00.000Z",
    application_names=["tfy-llm-gateway"],
    filters=[
        {"spanFieldName": "createdBySubjectSlug", "operator": "EQUAL", "value": "user@example.com"},
        {"spanAttributeKey": "tfy.model.name", "operator": "STRING_CONTAINS", "value": "openai"},
        {"gatewayRequestMetadataKey": "foo", "operator": "IN", "value": ["bar1", "bar2"]},
    ]
)

for span in spans:
    print(span.span_name, span.span_attributes.get('tfy.span_type'))
The tfy-llm-gateway application name is crucial for filtering spans specifically from the LLM Gateway. This ensures you only get request logs related to your LLM operations, excluding other application traces in your tracing project.
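
Because the three filter types differ only in the name of their first key, small builder functions can make filter lists less error-prone. This is a sketch, not part of the SDK: the three helper names are hypothetical, and the dictionary shapes and operators are copied from the example above.

```python
def span_field_filter(field_name, operator, value):
    """Filter on a span field such as spanName or createdBySubjectSlug."""
    return {"spanFieldName": field_name, "operator": operator, "value": value}


def span_attribute_filter(attribute_key, operator, value):
    """Filter on a span attribute such as tfy.model.name."""
    return {"spanAttributeKey": attribute_key, "operator": operator, "value": value}


def metadata_filter(metadata_key, operator, value):
    """Filter on a custom metadata key passed with the Gateway request."""
    return {"gatewayRequestMetadataKey": metadata_key, "operator": operator, "value": value}


# Rebuilds the filter list from the example above.
filters = [
    span_field_filter("createdBySubjectSlug", "EQUAL", "user@example.com"),
    span_attribute_filter("tfy.model.name", "STRING_CONTAINS", "openai"),
    metadata_filter("foo", "IN", ["bar1", "bar2"]),
]
print(filters)
```

The resulting list can be passed as the filters argument of query_spans in the same way as the hand-written dictionaries.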

Common Use Cases

Define the time range for your query. Use ISO 8601 format for timestamps.
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
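
Rather than typing timestamps by hand, the start and end of a day can be generated with the standard library. The sketch below produces strings in the same shape as the examples (UTC, millisecond precision, trailing Z); to_iso_millis and day_bounds_utc are hypothetical helper names, and treating the day as a UTC day is an assumption.

```python
from datetime import date, datetime, time, timezone


def to_iso_millis(dt):
    """Format an aware datetime as ISO 8601 in UTC with milliseconds and a Z suffix."""
    utc = dt.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def day_bounds_utc(day):
    """Return (start_time, end_time) strings covering one full UTC day."""
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(day, time.max, tzinfo=timezone.utc)
    return to_iso_millis(start), to_iso_millis(end)


start_time, end_time = day_bounds_utc(date(2025, 10, 8))
print(start_time)  # 2025-10-08T00:00:00.000Z
print(end_time)    # 2025-10-08T23:59:59.999Z
```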
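Fetch only root spans. A root span is the top-level span in a trace hierarchy and has no parent span, so the query below passes parent_span_ids=[""] to match spans with an empty parent ID. Define the time range for your query. Use ISO 8601 format for timestamps.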
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    parent_span_ids=[""],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
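
Since a root span covers the whole request lifecycle, sorting root spans by duration is one way to surface the slowest requests. The following is a minimal sketch under stated assumptions: slowest_spans is a hypothetical helper, the SimpleNamespace objects stand in for SDK spans, and duration is assumed to be a plain number (its unit is not specified here).

```python
import heapq
from types import SimpleNamespace


def slowest_spans(spans, n=3):
    """Return the n spans with the largest duration, slowest first."""
    return heapq.nlargest(n, spans, key=lambda span: span.duration)


# Illustrative stand-ins for root spans; values are made up apart from 7.09.
sample_root_spans = [
    SimpleNamespace(span_name="ChatCompletion", duration=7.09),
    SimpleNamespace(span_name="ChatCompletion", duration=1.2),
    SimpleNamespace(span_name="Embedding", duration=0.3),
    SimpleNamespace(span_name="ChatCompletion", duration=3.4),
]

for span in slowest_spans(sample_root_spans, n=2):
    print(span.span_name, span.duration)
```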
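Fetch spans for requests made by virtual accounts by setting created_by_subject_types=["virtualaccount"]. Define the time range for your query. Use ISO 8601 format for timestamps.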
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    created_by_subject_types=["virtualaccount"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
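Fetch spans for requests made by a specific virtual account by passing its slug in created_by_subject_slugs. Define the time range for your query. Use ISO 8601 format for timestamps.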
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    created_by_subject_slugs=["exampleaccount"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
Fetch spans for requests made by users by setting created_by_subject_types=["user"]. Define the time range for your query. Use ISO 8601 format for timestamps.
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    created_by_subject_types=["user"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
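Fetch spans for requests made by a specific user by passing the user's email in created_by_subject_slugs. Define the time range for your query. Use ISO 8601 format for timestamps.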
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    created_by_subject_slugs=["example@email.com"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
Fetch spans of a specific traceId
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    trace_ids=[
        "0199c25e124a70989b0455584fbbf7b7"
    ],
    application_names=["tfy-llm-gateway"],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.duration, span.span_attributes.get("tfy.span_type"))
Filter spans based on custom metadata keys and values that were passed to Gateway requests using the X-TFY-METADATA header.
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    filters=[
        {"gatewayRequestMetadataKey": "application", "operator": "EQUAL", "value": "booking-bot"},
        {"gatewayRequestMetadataKey": "environment", "operator": "IN", "value": ["staging", "production"]},
    ],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.span_attributes.get("tfy.span_type"))
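
For these metadata filters to match, the metadata has to be attached when the request is sent to the Gateway. The sketch below shows one way the X-TFY-METADATA header might be assembled. It assumes the header value is a JSON-encoded object with string keys and string values; that encoding is an assumption and should be checked against the Custom Metadata documentation. build_metadata_header is a hypothetical helper.

```python
import json


def build_metadata_header(metadata):
    """Serialize metadata into a header dict, coercing keys and values to strings.

    Assumption: the gateway expects a JSON object of string pairs in this header.
    """
    encoded = json.dumps({str(key): str(value) for key, value in metadata.items()})
    return {"X-TFY-METADATA": encoded}


headers = build_metadata_header({"application": "booking-bot", "environment": "staging"})
print(headers["X-TFY-METADATA"])
```

The returned dictionary could then be merged into the extra headers of whichever HTTP or OpenAI-compatible client sends requests to the Gateway.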
Filter spans that have MCP in the span name.
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    filters=[
        {"spanFieldName": "spanName", "operator": "STRING_CONTAINS", "value": "MCP"},
    ],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.span_attributes.get("tfy.span_type"))
Filter spans by model name using the tfy.model.name span attribute filter.
from truefoundry import client
from truefoundry_sdk import SortDirection

spans = client.traces.query_spans(
    tracing_project_fqn="{tenant-name}:tracing-project:tfy-default",  # e.g. "truefoundry:tracing-project:tfy-default"
    start_time="2025-10-08T00:00:00.000Z",
    end_time="2025-10-08T23:59:59.999Z",
    application_names=["tfy-llm-gateway"],
    filters=[
        {"spanAttributeKey": "tfy.model.name", "operator": "EQUAL", "value": "openai-main/gpt-4"},
    ],
    limit=200,
    sort_direction=SortDirection.DESC
)

# Process all spans across all pages
for span in spans:
    print(span.span_name, span.span_attributes.get("tfy.model.name"))

Understanding Span Attributes

Each span you query from LLM Gateway captures key request and model details. Recognizing these attributes helps you analyze and debug usage effectively.

Core Span Attributes

| Attribute | Description |
| --- | --- |
| tfy.span_type | Type of span (possible values are listed below the table) |
| tfy.tracing_project_fqn | Fully qualified name of the tracing project |
| tfy.input | Complete input data sent to the model, mcp_server, guardrail, etc. |
| tfy.output | Complete output response from the model, mcp_server, guardrail, etc. |
| tfy.input_short_hand | Abbreviated version of the input for display purposes |
| tfy.error_message | Error message if the request failed |
| tfy.prompt_version_fqn | FQN of the prompt version used (if applicable) |
| tfy.prompt_variables | Variables used in prompt templating |
| tfy.triggered_guardrail_fqns | List of guardrails that were triggered during the request |

Possible values of tfy.span_type:
  • "ChatCompletion": Complete chat request lifecycle
  • "Completion": Text completion requests without chat context
  • "MCP": Model Context Protocol server interactions and tool calls
  • "Rerank": Document reranking operations for search relevance
  • "Embedding": Vector embedding generation operations
  • "Model": Actual LLM model inference processing
  • "AgentResponse": Multi-tool agent orchestration workflows
  • "Guardrail": Safety, compliance, and content validation checks

Request Context Attributes

| Attribute | Description |
| --- | --- |
| tfy.request.model_name | Name of the model that was requested |
| tfy.request.created_by_subject | Subject (user/service account) that made the request |
| tfy.request.created_by_subject_teams | Teams associated with the requesting subject |
| tfy.request.metadata | Additional metadata associated with the request (e.g., {'foo': 'bar'}) |
| tfy.request.conversation_id | Unique identifier for the conversation (if part of a chat) |

Model Attributes

| Attribute | Description |
| --- | --- |
| tfy.model.id | Unique identifier of the model |
| tfy.model.name | Display name of the model |
| tfy.model.fqn | Fully qualified name of the model |
| tfy.model.request_url | URL endpoint used for the model request |
| tfy.model.streaming | Whether the request used streaming mode |
| tfy.model.request_type | Type of request (e.g., "ChatCompletion", "Completion", "Embedding", "Rerank", "AgentResponse", "MCPGateway", "CreateModelResponse") |

Model Performance Metrics

| Attribute | Description |
| --- | --- |
| tfy.model.metric.time_to_first_token_in_ms | Time taken to receive the first token (streaming) |
| tfy.model.metric.latency_in_ms | Total request latency in milliseconds |
| tfy.model.metric.input_tokens | Number of tokens in the model input |
| tfy.model.metric.output_tokens | Number of tokens in the model output |
| tfy.model.metric.cost_in_usd | Cost of the request in USD |
| tfy.model.metric.inter_token_latency_in_ms | Average latency between tokens (streaming) |
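
These metrics lend themselves to per-model usage summaries. The sketch below aggregates token counts and cost from a list of span attribute dictionaries. usage_by_model is a hypothetical helper and the sample values are made up; it counts only spans whose tfy.span_type is "Model", on the assumption that the metrics live on Model spans as in the trace walkthrough earlier.

```python
from collections import defaultdict


def usage_by_model(attribute_dicts):
    """Sum request count, tokens, and cost per tfy.model.name across Model spans."""
    totals = defaultdict(
        lambda: {"requests": 0, "input_tokens": 0, "output_tokens": 0, "cost_in_usd": 0.0}
    )
    for attrs in attribute_dicts:
        if attrs.get("tfy.span_type") != "Model":
            continue  # skip root, guardrail, and other span types
        row = totals[attrs.get("tfy.model.name", "unknown")]
        row["requests"] += 1
        row["input_tokens"] += attrs.get("tfy.model.metric.input_tokens") or 0
        row["output_tokens"] += attrs.get("tfy.model.metric.output_tokens") or 0
        row["cost_in_usd"] += attrs.get("tfy.model.metric.cost_in_usd") or 0.0
    return dict(totals)


# Illustrative attribute dictionaries (values are invented for the example).
sample_attributes = [
    {"tfy.span_type": "Model", "tfy.model.name": "openai-main/gpt-4",
     "tfy.model.metric.input_tokens": 10, "tfy.model.metric.output_tokens": 5,
     "tfy.model.metric.cost_in_usd": 0.25},
    {"tfy.span_type": "Model", "tfy.model.name": "openai-main/gpt-4",
     "tfy.model.metric.input_tokens": 20, "tfy.model.metric.output_tokens": 15,
     "tfy.model.metric.cost_in_usd": 0.5},
    {"tfy.span_type": "ChatCompletion"},
]

for model, row in usage_by_model(sample_attributes).items():
    print(model, row)
```

With real data, the input would be something like [span.span_attributes for span in spans].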

Load Balancing Attributes

| Attribute | Description |
| --- | --- |
| applied_loadbalance_rule_ids | IDs of load balancing rules that were applied (e.g., ['gpt-4-dev-load']) |

Budget Control Attributes

| Attribute | Description |
| --- | --- |
| applied_budget_rule_ids | IDs of budget rules that were applied to this request (e.g., ['virtualaccount1-monthly-budget']) |

Rate Limiting Attributes

| Attribute | Description |
| --- | --- |
| applied_ratelimit_rule_ids | IDs of all rate limiting rules that were applied (e.g., ['virtualaccount1-daily-ratelimit']) |

MCP (Model Context Protocol) Server Attributes

| Attribute | Description |
| --- | --- |
| tfy.mcp_server.id | Unique identifier of the MCP server |
| tfy.mcp_server.name | Display name of the MCP server |
| tfy.mcp_server.url | URL endpoint of the MCP server |
| tfy.mcp_server.fqn | Fully qualified name of the MCP server |
| tfy.mcp_server.server_name | Internal name of the MCP server |
| tfy.mcp_server.method | MCP method that was called |
| tfy.mcp_server.primitive_name | Name of the MCP primitive used |
| tfy.mcp_server.error_code | Error code if the MCP call failed |
| tfy.mcp_server.is_tool_call_execution_error | Whether the error was from tool call execution |

MCP Server Metrics

| Attribute | Description |
| --- | --- |
| tfy.mcp_server.metric.latency_in_ms | Latency of the MCP server call in milliseconds |
| tfy.mcp_server.metric.number_of_tools | Number of tools available in the MCP server |

Guardrail Attributes

| Attribute | Description |
| --- | --- |
| tfy.guardrail.id | Unique identifier of the guardrail |
| tfy.guardrail.name | Display name of the guardrail |
| tfy.guardrail.fqn | Fully qualified name of the guardrail |
| tfy.guardrail.result | Result of the guardrail check (e.g., 'pass', 'mutate', 'flag') |

Guardrail Applied Entity Attributes

| Attribute | Description |
| --- | --- |
| tfy.guardrail.applied_on_entity.type | Type of entity the guardrail was applied to |
| tfy.guardrail.applied_on_entity.id | ID of the entity |
| tfy.guardrail.applied_on_entity.name | Name of the entity |
| tfy.guardrail.applied_on_entity.fqn | FQN of the entity |
| tfy.guardrail.applied_on_entity.scope | Scope of the entity |

Guardrail Metrics

| Attribute | Description |
| --- | --- |
| tfy.guardrail.metric.latency_in_ms | Time taken for the guardrail check in milliseconds |
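
Combining the guardrail result and latency attributes gives a compact view of how each guardrail behaves. The following sketch works on plain attribute dictionaries. summarize_guardrails is a hypothetical helper, and the guardrail FQN and values in the sample data are invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize_guardrails(attribute_dicts):
    """Per guardrail FQN: count each result value and average the latency."""
    results = defaultdict(Counter)
    latencies = defaultdict(list)
    for attrs in attribute_dicts:
        fqn = attrs.get("tfy.guardrail.fqn")
        if fqn is None:
            continue  # not a guardrail span
        results[fqn][attrs.get("tfy.guardrail.result", "unknown")] += 1
        latency = attrs.get("tfy.guardrail.metric.latency_in_ms")
        if latency is not None:
            latencies[fqn].append(float(latency))
    return {
        fqn: {
            "results": dict(counts),
            "mean_latency_ms": mean(latencies[fqn]) if latencies[fqn] else None,
        }
        for fqn, counts in results.items()
    }


# Invented sample data; the FQN below is a placeholder, not a real guardrail.
sample_guardrail_attributes = [
    {"tfy.guardrail.fqn": "example:guardrail:pii-redaction",
     "tfy.guardrail.result": "mutate", "tfy.guardrail.metric.latency_in_ms": 400},
    {"tfy.guardrail.fqn": "example:guardrail:pii-redaction",
     "tfy.guardrail.result": "pass", "tfy.guardrail.metric.latency_in_ms": 500},
    {"tfy.span_type": "Model"},
]

print(summarize_guardrails(sample_guardrail_attributes))
```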

HTTP Response Attributes

| Attribute | Description |
| --- | --- |
| http.response.status_code | HTTP status code of the response |
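
The status code, together with tfy.error_message from the core attributes, can be used to pull failed requests out of a result set. This is a minimal sketch over plain attribute dictionaries: failed_requests is a hypothetical helper, the sample data is invented, and treating any status of 400 or above as a failure is an assumption.

```python
from collections import Counter


def failed_requests(attribute_dicts, threshold=400):
    """Return (status_code, error_message) pairs for spans at or above threshold."""
    failures = []
    for attrs in attribute_dicts:
        status = attrs.get("http.response.status_code")
        if status is None:
            continue  # span carries no HTTP status
        if int(status) >= threshold:
            failures.append((int(status), attrs.get("tfy.error_message", "")))
    return failures


# Invented sample data for illustration.
sample_status_attributes = [
    {"http.response.status_code": 200},
    {"http.response.status_code": 429, "tfy.error_message": "rate limit exceeded"},
    {"http.response.status_code": 500, "tfy.error_message": "upstream error"},
    {"tfy.span_type": "Guardrail"},
]

failures = failed_requests(sample_status_attributes)
print(Counter(status for status, _ in failures))
for status, message in failures:
    print(status, message)
```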

References