"true"
to enable OTEL tracing.

AI Gateway - OpenTelemetry Tracing
genai (LLM), capturing the lifecycle of a large language model (LLM) inference request.
LLM Request Data:
openai-main/gpt-4o
200
1
0.1
Chat Completions API
chatCompletions
Description: Streaming spans are created when a request involves streaming data, such as chat completions. These spans capture the details of the streaming process, including the model used and the parameters that affect streaming behavior.
Attributes:
gen_ai.request.model: The model used for the chat completion request.
gen_ai.request.max_tokens: The maximum number of tokens allowed in the chat completion request.
gen_ai.request.temperature: The temperature setting used in the chat completion request.
gen_ai.operation.name: The operation being performed, such as 'chat'.
gen_ai.system: The system or platform being used, e.g., 'openai'.
gen_ai.request.top_p: The top-p sampling parameter used in the request.
gen_ai.system.message: Events related to system messages in the request.
gen_ai.user.message: Events related to user messages in the request.
gen_ai.assistant.message: Events related to assistant messages in the request.
gen_ai.tool.message: Events related to tool messages in the request.
gen_ai.unknown.message: Events related to unknown message roles in the request.
gen_ai.prompt.{index}.content: The content of the message at a specific index in the request.
gen_ai.prompt.{index}.role: The role of the message at a specific index in the request.
gen_ai.completion.{index}.content: The content of the completion message at a specific index.
gen_ai.completion.{index}.role: The role of the completion message at a specific index.
gen_ai.completion.{index}.finish_reason: The reason why the completion at a specific index finished.
gen_ai.completion.{index}.tool_calls.{toolIndex}.name: The name of the tool call at a specific index in the completion.
gen_ai.completion.{index}.tool_calls.{toolIndex}.id: The ID of the tool call at a specific index in the completion.
gen_ai.completion.{index}.tool_calls.{toolIndex}.arguments: The arguments of the tool call at a specific index in the completion.
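The indexed attribute names above can be easier to read with a concrete example. The following is a minimal sketch, not the gateway's implementation: it assumes an OpenAI-style request and response shape (`messages`, `choices`, `tool_calls` with a nested `function` object), and the helper names `prompt_attributes` and `completion_attributes` are hypothetical. How the gateway actually serializes values is not stated here, so everything is simply converted to strings.

```python
from typing import Any


def prompt_attributes(messages: list[dict[str, Any]]) -> dict[str, str]:
    """Flatten request messages into gen_ai.prompt.{index}.* keys (hypothetical helper)."""
    attrs: dict[str, str] = {}
    for index, message in enumerate(messages):
        attrs[f"gen_ai.prompt.{index}.role"] = str(message.get("role", ""))
        attrs[f"gen_ai.prompt.{index}.content"] = str(message.get("content") or "")
    return attrs


def completion_attributes(choices: list[dict[str, Any]]) -> dict[str, str]:
    """Flatten response choices into gen_ai.completion.{index}.* keys (hypothetical helper)."""
    attrs: dict[str, str] = {}
    for index, choice in enumerate(choices):
        message = choice.get("message", {})
        prefix = f"gen_ai.completion.{index}"
        attrs[f"{prefix}.role"] = str(message.get("role", ""))
        # Assumption: a missing or null content is recorded as an empty string.
        attrs[f"{prefix}.content"] = str(message.get("content") or "")
        attrs[f"{prefix}.finish_reason"] = str(choice.get("finish_reason", ""))
        for tool_index, call in enumerate(message.get("tool_calls") or []):
            function = call.get("function", {})
            tool_prefix = f"{prefix}.tool_calls.{tool_index}"
            attrs[f"{tool_prefix}.id"] = str(call.get("id", ""))
            attrs[f"{tool_prefix}.name"] = str(function.get("name", ""))
            attrs[f"{tool_prefix}.arguments"] = str(function.get("arguments", ""))
    return attrs
```

Each message or tool call contributes one key per field, so a trace viewer that only supports flat key/value attributes can still show the full conversation.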
Agent Responses API
agentResponsesHandler
Description: This span is created when handling agent responses. It captures details about the request method and URL.
Attributes:
handler.name: The name of the handler.
request.method: The HTTP method of the request.
request.url: The URL of the request.
gen_ai.prompt.{index}.content: The content of the message at a specific index in the request.
gen_ai.prompt.{index}.role: The role of the message at a specific index in the request.
gen_ai.completion.{index}.content: The content of the completion message at a specific index.
gen_ai.completion.{index}.role: The role of the completion message at a specific index.
gen_ai.completion.{index}.finish_reason: The reason why the completion at a specific index finished.
gen_ai.completion.{index}.tool_calls.{toolIndex}.name: The name of the tool call at a specific index in the completion.
gen_ai.completion.{index}.tool_calls.{toolIndex}.id: The ID of the tool call at a specific index in the completion.
gen_ai.completion.{index}.tool_calls.{toolIndex}.arguments: The arguments of the tool call at a specific index in the completion.
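When reading these spans back from a tracing backend, the indexed `gen_ai.prompt.{index}.*` keys usually need to be regrouped into ordered messages. The sketch below shows one way to do that. It assumes the span attributes are available as a plain dictionary, and the function name `prompts_from_attributes` is hypothetical.

```python
import re
from collections import defaultdict

# Matches keys such as "gen_ai.prompt.3.role" and captures the index and field.
_PROMPT_KEY = re.compile(r"^gen_ai\.prompt\.(\d+)\.(role|content)$")


def prompts_from_attributes(attributes: dict[str, str]) -> list[dict[str, str]]:
    """Regroup flat gen_ai.prompt.{index}.* attributes into an ordered message list."""
    by_index: dict[int, dict[str, str]] = defaultdict(dict)
    for key, value in attributes.items():
        match = _PROMPT_KEY.match(key)
        if match:
            by_index[int(match.group(1))][match.group(2)] = value
    # Sort numerically so that index 10 comes after index 2.
    return [by_index[i] for i in sorted(by_index)]
```

Parsing the index as an integer matters: sorting the raw keys as strings would place `gen_ai.prompt.10` before `gen_ai.prompt.2`.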
MCP Server Spans
MCP Server Initialization
mcp_server_fqn: The FQN of the MCP server being initialized.

Connect to MCP Server
mcp_server_url: The URL of the MCP server being connected to.

List Tools
tools: The list of tools retrieved from the MCP server.

Tool Call: <toolName>
toolName: The name of the tool being called.
args: The arguments passed to the tool call.
integrationId: The integration ID associated with the tool call.
integrationFqn: The fully qualified name of the integration.
result: The result of the tool call.
status: The status of the tool call.
mcp_server_url: The URL of the MCP server used for the tool call.
tools: The list of tools used in the call.
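As an illustration of how the tool call spans might be consumed, the sketch below counts tool calls by tool name and status. It assumes exported spans are available as dictionaries with `name` and `attributes` keys, which is an assumption about the export format rather than something the gateway guarantees. The possible values of `status` are not listed above, so they are treated as opaque strings, and the function name `tool_call_status_counts` is hypothetical.

```python
from collections import Counter
from typing import Any

# Span names follow the "Tool Call: <toolName>" pattern described above.
TOOL_CALL_PREFIX = "Tool Call: "


def tool_call_status_counts(spans: list[dict[str, Any]]) -> Counter:
    """Count tool call spans per (tool name, status) pair."""
    counts: Counter = Counter()
    for span in spans:
        name = span.get("name", "")
        if not name.startswith(TOOL_CALL_PREFIX):
            continue  # Not a tool call span (e.g. "List Tools").
        attrs = span.get("attributes", {})
        # Prefer the toolName attribute; fall back to the span name suffix.
        tool = attrs.get("toolName") or name[len(TOOL_CALL_PREFIX):]
        counts[(tool, attrs.get("status", "unknown"))] += 1
    return counts
```

A table built from these counts gives a quick view of which tools are called most often and which ones fail.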
Fallback
fallbackRequest
Description: Fallback spans are created when a request to a primary model fails and a fallback model is invoked. These spans capture the transition from the primary model to the fallback model.
Attributes:
fallback.http.url: The URL to which the fallback request is made.
fallback.http.method: The HTTP method used for the fallback request.
fallback.requested_model: The original model that was requested before the fallback.
fallback.resolved_model: The model that is used as a fallback.
fallback.config_id: The configuration ID associated with the fallback mechanism.
fallback.max_tokens: The maximum number of tokens allowed in the fallback request.
fallback.temperature: The temperature setting used in the fallback request.

Rate Limiting
RateLimiterMiddleware
Description: These spans represent the execution of the rate limiting middleware. The span captures information about the user, the model being accessed, and the rate limiting rules applied.
Attributes:
rate_limiter.model: The model being accessed by the request.
rate_limiter.metadata: Additional metadata associated with the request.
rate_limiter.user.subject_type: The type of user making the request.
rate_limiter.user.subject_slug: A unique identifier for the user.
rate_limiter.user.tenant_name: The tenant or organization to which the user belongs.
rate_limiter.rules: The specific rate limiting rules applied to the request.
rate_limiter.rule.id: The ID of a specific rate limiting rule that was checked.
rate_limiter.status: The status of the rate limit check.
rate_limiter.remaining: The number of requests remaining before the rate limit is exceeded.

Load Balancing
loadBalanceMiddleware
Description: Load balancing spans are created when a request is processed through the load balancing middleware. These spans capture the details of the load balancing process.
Attributes:
load_balance.http.url: The URL of the request being load balanced.
load_balance.http.method: The HTTP method of the request being load balanced.
user.tenantName: The tenant name of the user making the request.
load_balance.requested_model: The model that was initially requested for load balancing.
load_balance.resolved_model: The target model selected by the load balancing process.
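
The fallback, rate limiting, and load balancing attributes described in the preceding sections can be combined into a simple routing summary. The sketch below is illustrative only. It assumes exported spans are available as dictionaries with an `attributes` key and that `rate_limiter.remaining` is exported as a number. The function name `summarize_routing` and the `low_remaining` threshold are hypothetical. Spans are detected by the presence of their attributes rather than by span name.

```python
from collections import Counter
from typing import Any


def summarize_routing(spans: list[dict[str, Any]], low_remaining: int = 5) -> dict[str, Any]:
    """Summarize fallbacks, load-balanced routes, and rules close to their limit."""
    fallbacks: Counter = Counter()
    routes: Counter = Counter()
    near_limit_rules: list[Any] = []
    for span in spans:
        attrs = span.get("attributes", {})
        if "fallback.requested_model" in attrs:
            pair = (attrs["fallback.requested_model"], attrs.get("fallback.resolved_model"))
            fallbacks[pair] += 1
        if "load_balance.requested_model" in attrs:
            pair = (attrs["load_balance.requested_model"], attrs.get("load_balance.resolved_model"))
            routes[pair] += 1
        remaining = attrs.get("rate_limiter.remaining")
        # Assumption: the attribute is numeric; other types are ignored.
        if isinstance(remaining, (int, float)) and remaining <= low_remaining:
            near_limit_rules.append(attrs.get("rate_limiter.rule.id"))
    return {"fallbacks": fallbacks, "routes": routes, "near_limit_rules": near_limit_rules}
```

Each counter is keyed by a (requested model, resolved model) pair, so the summary shows how often a requested model was served by a different target.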