Model Responses

POST /api/llm/responses

Example request:
curl --request POST \
  --url https://{controlPlaneURL}/api/llm/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "input": "<any>",
  "background": true,
  "include": [
    "<string>"
  ],
  "instructions": "<string>",
  "max_output_tokens": 123,
  "metadata": {},
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<string>"
  },
  "service_tier": "auto",
  "store": true,
  "stream": true,
  "temperature": 123,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "tool_choice": "none",
  "tools": [
    "<any>"
  ],
  "top_p": 123,
  "truncation": "auto",
  "user": "<string>"
}'
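The same request can be issued from Python with only the standard library. This is an illustrative sketch, not an official client: `build_responses_payload` and `create_response` are hypothetical helper names, and the model identifier, URL, and token are placeholders.

```python
import json
import urllib.request


def build_responses_payload(model, input_data, **optional):
    """Build the JSON body for POST /api/llm/responses.

    Only `model` is required; optional parameters left as None are
    omitted so the server-side defaults apply.
    """
    payload = {"model": model, "input": input_data}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def create_response(control_plane_url, token, payload):
    # Endpoint path and headers taken from the curl example above.
    req = urllib.request.Request(
        f"https://{control_plane_url}/api/llm/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_responses_payload(
    "my-model",                 # placeholder model identifier
    "Summarize this text.",
    temperature=0.7,
    max_output_tokens=256,
    stream=None,                # None values are dropped from the body
)
```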
Example response:

{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "status": "<string>",
  "error": "<any>",
  "incomplete_details": "<any>",
  "instructions": "<any>",
  "max_output_tokens": 123,
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "type": "<string>",
      "status": "<string>",
      "content": [
        {
          "type": "<string>",
          "annotations": [
            "<any>"
          ],
          "text": "<string>"
        }
      ],
      "role": "<string>"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<any>",
    "summary": "<any>"
  },
  "service_tier": "<string>",
  "store": true,
  "temperature": 123,
  "text": {
    "format": {
      "type": "<string>"
    }
  },
  "tool_choice": "<string>",
  "tools": [
    "<any>"
  ],
  "top_p": 123,
  "truncation": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "total_tokens": 123
  },
  "user": "<any>",
  "metadata": {},
  "provider": "<string>"
}
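In a successful response, the generated text lives under `output[*].content[*].text` and token accounting under `usage`. A minimal extraction sketch, using the field names from the example above (`extract_text` is a hypothetical helper and the sample dict is fabricated for illustration):

```python
def extract_text(response: dict) -> str:
    """Concatenate the text parts of all output items.

    Field names follow the example response above; this is a sketch,
    not an exhaustive handler for every possible output item type.
    """
    parts = []
    for item in response.get("output", []):
        for content in item.get("content", []):
            if "text" in content:
                parts.append(content["text"])
    return "".join(parts)


sample = {
    "output": [
        {
            "id": "msg_1",
            "type": "message",
            "status": "completed",
            "content": [
                {"type": "output_text", "annotations": [], "text": "Hello"}
            ],
            "role": "assistant",
        }
    ],
    "usage": {"input_tokens": 10, "output_tokens": 2, "total_tokens": 12},
}

print(extract_text(sample))                 # prints: Hello
print(sample["usage"]["total_tokens"])      # prints: 12
```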

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-tfy-metadata
string

Optional metadata for the request

Body

application/json

Parameters for generating model responses.

model
string
required

Identifier of the model used to generate the response

input
any

Text, image, or file inputs to the model

background
boolean | null

Whether to run the model response in the background

include
string[] | null

Additional output data to include

instructions
string | null

System message inserted as the first item in the model's context

max_output_tokens
number | null

Upper bound on the number of tokens the model may generate

metadata
object | null

Key-value pairs for additional information

parallel_tool_calls
boolean | null

Allow parallel tool calls

previous_response_id
string | null

ID of the previous response, used to continue a multi-turn conversation
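To continue a conversation, pass the `id` of the last response as `previous_response_id` in the next request. A sketch under that assumption (`follow_up_payload` is a hypothetical helper; the `previous` dict is fabricated for illustration):

```python
def follow_up_payload(model, new_input, last_response):
    """Build the body of a follow-up request that continues the
    conversation started by `last_response` (a parsed response dict)."""
    return {
        "model": model,
        "input": new_input,
        "previous_response_id": last_response["id"],
    }


previous = {"id": "resp_123", "status": "completed"}  # fabricated example
body = follow_up_payload("my-model", "And in one sentence?", previous)
```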

reasoning
object | null

Configuration for reasoning models

service_tier
enum<string> | null

Latency tier for processing

Available options:
auto,
default,
flex

store
boolean | null

Whether to store the response

stream
boolean | null

Enable streaming of the response
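Many streaming LLM APIs deliver chunks as server-sent events with `data:` payload lines and a `[DONE]` sentinel; that wire format is an assumption here, so verify it against this endpoint's actual stream output. A parsing sketch under that assumption:

```python
import json


def parse_sse_lines(lines):
    """Yield parsed JSON events from SSE-style `data: {...}` lines.

    Assumes the common streaming convention of `data:` prefixes and a
    `[DONE]` sentinel; the endpoint's actual format may differ.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


# Fabricated sample stream lines for illustration.
events = list(parse_sse_lines([
    'data: {"delta": "Hel"}',
    'data: {"delta": "lo"}',
    "data: [DONE]",
]))
```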

temperature
number | null

Sampling temperature between 0 and 2

text
object

Text response configuration

tool_choice

Tool selection behavior

Available options:
none

tools
(any | null)[]

Available tools for the model

top_p
number | null

Nucleus sampling parameter

truncation
enum<string> | null

Truncation strategy

Available options:
auto,
disabled

user
string

End-user identifier

Response

Model Response generated successfully.

id
string
required

Response ID.

object
string
required

Object type.

created_at
number
required

Creation timestamp.

status
string
required

Response status.

max_output_tokens
number | null
required

Maximum output tokens allowed.

model
string
required

Model used for the response.

output
object[]
required

Output items generated by the model.
parallel_tool_calls
boolean
required

Indicates if parallel tool calls were used.

previous_response_id
string | null
required

ID of the previous response, if any.

reasoning
object | null
required

Reasoning details.

service_tier
string
required

Service tier.

store
boolean
required

Indicates if the response is stored.

temperature
number
required

Temperature setting for the model.

text
object
required

Text response configuration.
tool_choice
string
required

Tool choice used.

tools
(any | null)[]
required

Tools used in the response.

top_p
number
required

Top-p sampling parameter.

truncation
string
required

Truncation setting.

usage
object
required

Token usage details for the request and response.
metadata
object | null
required

Additional metadata.

provider
string
required

Provider of the response.

error
any

Error details, if any.

incomplete_details
any

Details about incomplete responses, if any.

instructions
any

Instructions provided for the response.

user
any

User details, if any.