To better understand tracing concepts like traces, spans, and how they work together, see our tracing overview.

Example: A chat completion request creates a span hierarchy: Chat Completion Span (parent) → Model Span (child, stores input/output tokens, metrics, costs) → Network Call Span (child, actual provider call).

Chat completion request span hierarchy
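The hierarchy above can be rebuilt from a flat list of spans by following parent links. The following is a minimal sketch, not the SDK's own data model: the `span_id`, `parent_span_id`, and `name` keys are hypothetical, and it assumes each arrow in the example denotes a parent-child link.

```python
from collections import defaultdict

# Hypothetical flat span records; the key names are assumptions for illustration.
spans = [
    {"span_id": "a", "parent_span_id": None, "name": "Chat Completion"},
    {"span_id": "b", "parent_span_id": "a", "name": "Model"},
    {"span_id": "c", "parent_span_id": "b", "name": "Network Call"},
]


def render_tree(spans):
    """Return one indented line per span, children nested under their parent."""
    # Index spans by the ID of their parent so children can be looked up quickly.
    children = defaultdict(list)
    for span in spans:
        children[span["parent_span_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span["name"])
            walk(span["span_id"], depth + 1)

    # Root spans have no parent, so the walk starts from None.
    walk(None, 0)
    return lines


print("\n".join(render_tree(spans)))
```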
Set up the TrueFoundry CLI
Begin by installing the TrueFoundry SDK. Follow the CLI Setup guide for instructions.
Common Use Cases
Fetch All spans for a time interval
Define the time range for your query. Use ISO 8601 format for timestamps.
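As a rough illustration of building such a time range, the sketch below uses only the Python standard library to produce ISO 8601 timestamps. The helper name `iso_time_range` is hypothetical, and how the resulting strings are passed to the query is not shown here.

```python
from datetime import datetime, timedelta, timezone


def iso_time_range(end, hours):
    """Return (start, end) as ISO 8601 strings covering the last `hours` hours."""
    start = end - timedelta(hours=hours)
    return start.isoformat(), end.isoformat()


# A fixed, timezone-aware end time keeps the example reproducible.
end = datetime(2025, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
start_time, end_time = iso_time_range(end, hours=24)

print(start_time)  # 2025-01-01T12:00:00+00:00
print(end_time)    # 2025-01-02T12:00:00+00:00
```

Using timezone-aware datetimes avoids ambiguity about which offset a timestamp refers to.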
Fetch All Root spans for a time interval
A root span is the top-level span in a trace hierarchy that has no parent span.
Define the time range for your query. Use ISO 8601 format for timestamps.
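To make the definition of a root span concrete, here is a small client-side sketch that keeps only spans without a parent. The `span_id` and `parent_span_id` keys are assumptions about how a span record might look, not a documented schema.

```python
def root_spans(spans):
    """Keep only spans that have no parent (missing, None, or empty parent ID)."""
    return [span for span in spans if not span.get("parent_span_id")]


# Hypothetical span records for illustration.
sample_spans = [
    {"span_id": "a", "parent_span_id": None},
    {"span_id": "b", "parent_span_id": "a"},
    {"span_id": "c", "parent_span_id": ""},
    {"span_id": "d"},
]

print([span["span_id"] for span in root_spans(sample_spans)])
```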

Creating tracing projects with collaborators
Fetch All spans for virtual accounts
Define the time range for your query. Use ISO 8601 format for timestamps.
Fetch All spans for a virtual account `exampleaccount`
Define the time range for your query. Use ISO 8601 format for timestamps.
Fetch All spans for users
Define the time range for your query. Use ISO 8601 format for timestamps.
Fetch All spans for a user with email example@email.com
Define the time range for your query. Use ISO 8601 format for timestamps.
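If spans have already been fetched, the same kind of filtering can be approximated on the client side using the `tfy.request.created_by_subject` attribute. This is only a sketch: the `span_attributes` container and the `subjectType` / `subjectSlug` keys inside the subject are assumptions made for illustration, so check them against the spans you actually receive.

```python
def spans_for_subject(spans, subject_type, subject_slug):
    """Return spans whose requesting subject matches the given type and slug."""
    matches = []
    for span in spans:
        attrs = span.get("span_attributes", {})
        # Assumed shape: {"subjectType": "...", "subjectSlug": "..."}
        subject = attrs.get("tfy.request.created_by_subject") or {}
        if (
            subject.get("subjectType") == subject_type
            and subject.get("subjectSlug") == subject_slug
        ):
            matches.append(span)
    return matches


# Hypothetical span records for illustration.
sample_spans = [
    {"span_id": "1", "span_attributes": {"tfy.request.created_by_subject": {
        "subjectType": "user", "subjectSlug": "example@email.com"}}},
    {"span_id": "2", "span_attributes": {"tfy.request.created_by_subject": {
        "subjectType": "virtualaccount", "subjectSlug": "exampleaccount"}}},
    {"span_id": "3", "span_attributes": {}},
]

user_spans = spans_for_subject(sample_spans, "user", "example@email.com")
print([span["span_id"] for span in user_spans])
```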
Filter by Specific Trace ID
Fetch spans of a specific traceId
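For spans that are already in memory, selecting one trace amounts to matching on the trace identifier. The sketch below assumes each span record carries a hypothetical `trace_id` key; the real field name may differ.

```python
def spans_for_trace(spans, trace_id):
    """Return all spans that belong to the given trace."""
    return [span for span in spans if span.get("trace_id") == trace_id]


# Hypothetical span records for illustration.
sample_spans = [
    {"span_id": "a", "trace_id": "trace-1"},
    {"span_id": "b", "trace_id": "trace-2"},
    {"span_id": "c", "trace_id": "trace-1"},
]

print([span["span_id"] for span in spans_for_trace(sample_spans, "trace-1")])
```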
Understanding Span Attributes
Each span you query from LLM Gateway captures key request and model details. Recognizing these attributes helps you analyze and debug usage effectively.
Model Span Attributes Example
This example shows the span attributes for a model request span (`tfy.span_type: "Model"`). Model spans capture the actual LLM inference call with complete input/output data, performance metrics (latency, token counts, cost), model configuration details, and error handling information. This span type is essential for monitoring model performance, tracking costs, and debugging inference issues.
Agent Response Span Attributes Example
This example shows the span attributes for an agent response span (`tfy.span_type: "AgentResponse"`). Agent response spans capture the orchestration of multiple tools and MCP servers, including complete input/output data, network details, tool execution results, and how the agent combines outputs from different tools into a coherent response. This span type is crucial for understanding agent behavior, tool usage patterns, and debugging multi-step agent workflows.
Core Span Attributes
Attribute | Description |
---|---|
tfy.span_type | Type of span, with possible values: "ChatCompletion", "Completion", "MCP", "Rerank", "Embedding", "Model", "AgentResponse", "Guardrail" |
tfy.tracing_project_fqn | Fully qualified name of the tracing project |
tfy.input | Complete input data sent to the model, MCP server, guardrail, etc. |
tfy.output | Complete output response from the model, MCP server, guardrail, etc. |
tfy.input_short_hand | Abbreviated version of the input for display purposes |
tfy.error_message | Error message if the request failed |
tfy.prompt_version_fqn | FQN of the prompt version used (if applicable) |
tfy.prompt_variables | Variables used in prompt templating |
tfy.triggered_guardrail_fqns | List of guardrails that were triggered during the request |
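One way to use `tfy.span_type` is to count how many spans of each type a query returned. The sketch below assumes each span record exposes its attributes under a hypothetical `span_attributes` key; the set of span types is taken from the table above.

```python
from collections import Counter

# Span types listed in the Core Span Attributes table.
KNOWN_SPAN_TYPES = {
    "ChatCompletion", "Completion", "MCP", "Rerank",
    "Embedding", "Model", "AgentResponse", "Guardrail",
}


def count_span_types(spans):
    """Count spans per tfy.span_type, grouping unexpected values as 'Unknown'."""
    counts = Counter()
    for span in spans:
        span_type = span.get("span_attributes", {}).get("tfy.span_type")
        counts[span_type if span_type in KNOWN_SPAN_TYPES else "Unknown"] += 1
    return counts


# Hypothetical span records for illustration.
sample_spans = [
    {"span_attributes": {"tfy.span_type": "ChatCompletion"}},
    {"span_attributes": {"tfy.span_type": "Model"}},
    {"span_attributes": {"tfy.span_type": "Model"}},
    {"span_attributes": {}},
]

print(dict(count_span_types(sample_spans)))
```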
Request Context Attributes
Attribute | Description |
---|---|
tfy.request.model_name | Name of the model that was requested |
tfy.request.created_by_subject | Subject (user/service account) that made the request |
tfy.request.created_by_subject_teams | Teams associated with the requesting subject |
tfy.request.metadata | Additional metadata associated with the request |
tfy.request.conversation_id | Unique identifier for the conversation (if part of a chat) |
Model Attributes
Attribute | Description |
---|---|
tfy.model.id | Unique identifier of the model |
tfy.model.name | Display name of the model |
tfy.model.fqn | Fully qualified name of the model |
tfy.model.request_url | URL endpoint used for the model request |
tfy.model.streaming | Whether the request used streaming mode |
tfy.model.request_type | Type of request (e.g., “chat”, “completion”) |
Model Performance Metrics
Attribute | Description |
---|---|
tfy.model.metric.time_to_first_token_in_ms | Time taken to receive the first token (streaming) |
tfy.model.metric.latency_in_ms | Total request latency in milliseconds |
tfy.model.metric.input_tokens | Number of tokens in the model input |
tfy.model.metric.output_tokens | Number of tokens in the model output |
tfy.model.metric.cost_in_usd | Cost of the request in USD |
tfy.model.metric.inter_token_latency_in_ms | Average latency between tokens (streaming) |
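These metrics lend themselves to simple aggregation, for example totalling tokens and cost across Model spans. The following sketch uses the attribute names from the table above; the `span_attributes` container key and the sample values are assumptions for illustration.

```python
def summarize_model_usage(spans):
    """Total the request count, tokens, and cost across spans of type 'Model'."""
    totals = {"requests": 0, "input_tokens": 0, "output_tokens": 0, "cost_in_usd": 0.0}
    for span in spans:
        attrs = span.get("span_attributes", {})
        if attrs.get("tfy.span_type") != "Model":
            continue
        totals["requests"] += 1
        # Missing metrics are treated as zero rather than raising an error.
        totals["input_tokens"] += attrs.get("tfy.model.metric.input_tokens", 0)
        totals["output_tokens"] += attrs.get("tfy.model.metric.output_tokens", 0)
        totals["cost_in_usd"] += attrs.get("tfy.model.metric.cost_in_usd", 0.0)
    return totals


# Hypothetical span records for illustration.
sample_spans = [
    {"span_attributes": {
        "tfy.span_type": "Model",
        "tfy.model.metric.input_tokens": 100,
        "tfy.model.metric.output_tokens": 40,
        "tfy.model.metric.cost_in_usd": 0.25,
    }},
    {"span_attributes": {
        "tfy.span_type": "Model",
        "tfy.model.metric.input_tokens": 50,
        "tfy.model.metric.output_tokens": 10,
        "tfy.model.metric.cost_in_usd": 0.5,
    }},
    {"span_attributes": {"tfy.span_type": "ChatCompletion"}},
]

usage = summarize_model_usage(sample_spans)
print(usage)
```

Restricting the sum to Model spans is a deliberate choice in this sketch: it avoids double counting if parent spans were to repeat the same metrics.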
Load Balancing Attributes
Attribute | Description |
---|---|
applied_loadbalance_rule_ids | IDs of load balancing rules that were applied |
Budget Control Attributes
Attribute | Description |
---|---|
applied_budget_rule_ids | IDs of budget rules that were applied to this request |
Rate Limiting Attributes
Attribute | Description |
---|---|
applied_ratelimit_rule_ids | IDs of all rate limiting rules that were applied |
MCP (Model Context Protocol) Server Attributes
Attribute | Description |
---|---|
tfy.mcp_server.id | Unique identifier of the MCP server |
tfy.mcp_server.name | Display name of the MCP server |
tfy.mcp_server.url | URL endpoint of the MCP server |
tfy.mcp_server.fqn | Fully qualified name of the MCP server |
tfy.mcp_server.server_name | Internal name of the MCP server |
tfy.mcp_server.method | MCP method that was called |
tfy.mcp_server.primitive_name | Name of the MCP primitive used |
tfy.mcp_server.error_code | Error code if the MCP call failed |
tfy.mcp_server.is_tool_call_execution_error | Whether the error was from tool call execution |
MCP Server Metrics
Attribute | Description |
---|---|
tfy.mcp_server.metric.latency_in_ms | Latency of the MCP server call in milliseconds |
tfy.mcp_server.metric.number_of_tools | Number of tools available in the MCP server |
Guardrail Attributes
Attribute | Description |
---|---|
tfy.guardrail.id | Unique identifier of the guardrail |
tfy.guardrail.name | Display name of the guardrail |
tfy.guardrail.fqn | Fully qualified name of the guardrail |
tfy.guardrail.result | Result of the guardrail check (e.g., “passed”, “failed”, “blocked”) |
Guardrail Applied Entity Attributes
Attribute | Description |
---|---|
tfy.guardrail.applied_on_entity.type | Type of entity the guardrail was applied to |
tfy.guardrail.applied_on_entity.id | ID of the entity |
tfy.guardrail.applied_on_entity.name | Name of the entity |
tfy.guardrail.applied_on_entity.fqn | FQN of the entity |
tfy.guardrail.applied_on_entity.scope | Scope of the entity |
Guardrail Metrics
Attribute | Description |
---|---|
tfy.guardrail.metric.latency_in_ms | Time taken for the guardrail check in milliseconds |
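The guardrail attributes can be combined to list checks that did not pass, along with how long each check took. This is a sketch under assumptions: spans are plain dictionaries with a hypothetical `span_attributes` key, and "passed" is treated as the only passing result, based on the example values given above.

```python
def non_passing_guardrails(spans):
    """Return (fqn, result, latency_ms) for guardrail spans whose result is not 'passed'."""
    flagged = []
    for span in spans:
        attrs = span.get("span_attributes", {})
        if attrs.get("tfy.span_type") != "Guardrail":
            continue
        result = attrs.get("tfy.guardrail.result")
        if result != "passed":
            flagged.append((
                attrs.get("tfy.guardrail.fqn"),
                result,
                attrs.get("tfy.guardrail.metric.latency_in_ms"),
            ))
    return flagged


# Hypothetical span records for illustration; the FQN values are made up.
sample_spans = [
    {"span_attributes": {
        "tfy.span_type": "Guardrail",
        "tfy.guardrail.fqn": "example:guardrail-a",
        "tfy.guardrail.result": "passed",
        "tfy.guardrail.metric.latency_in_ms": 12,
    }},
    {"span_attributes": {
        "tfy.span_type": "Guardrail",
        "tfy.guardrail.fqn": "example:guardrail-b",
        "tfy.guardrail.result": "blocked",
        "tfy.guardrail.metric.latency_in_ms": 30,
    }},
    {"span_attributes": {"tfy.span_type": "Model"}},
]

flagged = non_passing_guardrails(sample_spans)
print(flagged)
```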
HTTP Response Attributes
Attribute | Description |
---|---|
http.response.status_code | HTTP status code of the response |
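As an illustration of using `http.response.status_code` together with `tfy.error_message`, the sketch below collects spans whose status code indicates a client or server error (400 and above). The `span_attributes` key and the sample records are assumptions, not a documented schema.

```python
def failed_requests(spans):
    """Return the status code and error message for spans with HTTP status >= 400."""
    failures = []
    for span in spans:
        attrs = span.get("span_attributes", {})
        status = attrs.get("http.response.status_code")
        # Skip spans that carry no integer status code at all.
        if isinstance(status, int) and status >= 400:
            failures.append({
                "status_code": status,
                "error_message": attrs.get("tfy.error_message"),
            })
    return failures


# Hypothetical span records for illustration.
sample_spans = [
    {"span_attributes": {"http.response.status_code": 200}},
    {"span_attributes": {
        "http.response.status_code": 429,
        "tfy.error_message": "rate limit exceeded",
    }},
    {"span_attributes": {}},
]

failures = failed_requests(sample_spans)
print(failures)
```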