POST /chat/completions
Chat Completions
curl --request POST \
  --url https://{controlPlaneURL}/api/llm/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "messages": [
    {
      "role": "system",
      "content": "<string>",
      "name": "<string>"
    }
  ],
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "none",
  "temperature": 123,
  "top_p": 123,
  "top_k": 123,
  "n": 123,
  "stream": true,
  "logprobs": true,
  "stop": "<string>",
  "max_tokens": 123,
  "presence_penalty": 123,
  "frequency_penalty": 123,
  "logit_bias": {},
  "user": "<string>"
}'
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>"
      },
      "logprobs": "<any>",
      "finish_reason": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "audio_tokens": 123
    },
    "completion_tokens_details": {
      "reasoning_tokens": 123,
      "audio_tokens": 123,
      "accepted_prediction_tokens": 123,
      "rejected_prediction_tokens": 123
    }
  },
  "service_tier": "<string>",
  "system_fingerprint": "<string>"
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-tfy-metadata
string

Optional metadata for the request.
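
As a minimal sketch of how the two headers above could be assembled in client code, the helper below builds the header dictionary. The function name `build_headers` is hypothetical, and because this reference does not describe the format of the `x-tfy-metadata` value, the sketch passes it through as an opaque string.

```python
from typing import Dict, Optional


def build_headers(token: str, metadata: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers for a chat completions request (hypothetical helper)."""
    headers = {
        # Required: bearer authentication of the form "Bearer <token>".
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # x-tfy-metadata is optional, so it is sent only when a value is supplied.
    # Assumption: the value is forwarded verbatim; its format is not documented here.
    if metadata is not None:
        headers["x-tfy-metadata"] = metadata
    return headers
```

Omitting the optional header entirely, rather than sending an empty value, keeps the request identical to the curl sample when no metadata is needed.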

Body

application/json

Parameters for generating chat completions.

model
string
required

Identifier for the language model to be used for generation.

messages
(System Message · object | User Message · object | Assistant Message · object | Function Message · object | Tool Message · object | Developer Message · object)[]
required

Conversation history as an array of messages for contextual generation.
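
The two required fields, `model` and `messages`, are enough for a minimal request. The sketch below builds (but does not send) such a request with Python's standard library. The helper name `build_request` is hypothetical, the `"user"` role string is an assumption inferred from the "User Message" type listed above, and the URL path is taken from the curl sample.

```python
import json
import urllib.request
from typing import Dict, List


def build_request(control_plane_url: str, token: str, model: str,
                  messages: List[Dict[str, str]]) -> urllib.request.Request:
    """Build a POST request carrying only the required body fields."""
    body = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{control_plane_url}/api/llm/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Conversation history: a system instruction followed by one user turn.
example_messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize nucleus sampling in one sentence."},
]
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch stays runnable without network access.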

tools
object[] | null

Array of tool definitions available for the model to use during generation.
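
A tool definition follows the shape shown in the sample request: a `type` plus a `function` object with `name`, `description`, and `parameters`. The sketch below assumes the OpenAI-style conventions that `type` is `"function"` and that `parameters` holds a JSON Schema object; neither is confirmed by this reference. The `get_weather` tool and the `make_tool` helper are hypothetical.

```python
from typing import Any, Dict


def make_tool(name: str, description: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a function description in the tool shape from the sample request."""
    return {
        "type": "function",  # assumption: OpenAI-style tool type
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # assumption: a JSON Schema object
        },
    }


# Hypothetical tool used only for illustration.
weather_tool = make_tool(
    name="get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
```

The resulting dictionary can be placed in the `tools` array of the request body.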

tool_choice

Controls tool usage behavior. Use "none" to disable tool calls, "auto" to let the model decide, "required" to force a tool call, or name a specific tool to force the model to call it.

Available options:
none

temperature
number | null

Sampling temperature between 0 and 2. Higher values make output more random, lower values more deterministic.
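
To illustrate why higher temperatures produce more random output, the sketch below applies the common textbook formulation: logits are divided by the temperature before a softmax. This is a generic illustration, not a description of how any particular model served by this endpoint samples. The function name is hypothetical, and the sketch rejects a temperature of 0 because the division is undefined; how the server treats 0 is not stated in this reference.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.5)  # mass concentrates on the top token
flat = softmax_with_temperature(logits, 2.0)   # mass spreads across tokens
```

With the lower temperature the most likely token takes a larger share of the probability mass, which is what makes sampling more deterministic.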

top_p
number | null

Nucleus sampling parameter. Limits cumulative probability of tokens considered for sampling.

top_k
number | null

Top-k sampling parameter. Limits sampling to the k most likely tokens.
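
The difference between `top_p` and `top_k` is easiest to see on a toy distribution. The sketch below is a generic illustration of the two filtering ideas, not the server's implementation: top-k keeps a fixed number of the most likely tokens, while top-p keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold. Function names and the toy probabilities are hypothetical.

```python
from typing import Dict


def top_k_filter(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep only the k most likely tokens."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probs: Dict[str, float], p: float) -> Dict[str, float]:
    """Keep the smallest prefix of ranked tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    return kept


toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
```

In practice the surviving probabilities are renormalized before sampling; that step is omitted here to keep the filtering itself visible.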

n
number | null

Number of response alternatives to generate.

stream
boolean | null

Enable streaming of partial response chunks as they are generated.

logprobs

Whether to include log probabilities of the generated tokens. Use a boolean for Chat Completions; the legacy Completions API takes an integer instead (the number of top-token logprobs to return).

stop

Sequence(s) at which to stop generation. Can be a single string or array of strings.
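
Because `stop` accepts either a single string or an array of strings, client code often normalizes it to a list first. The sketch below does that and also shows, purely as a client-side illustration, what stopping at the earliest matching sequence means. Both helper names are hypothetical, and whether the server includes the stop sequence in its output is not stated in this reference.

```python
from typing import List, Optional, Union


def normalize_stop(stop: Optional[Union[str, List[str]]]) -> List[str]:
    """Return stop sequences as a list, whatever form the caller supplied."""
    if stop is None:
        return []
    if isinstance(stop, str):
        return [stop]
    return list(stop)


def truncate_at_stop(text: str, stop: Optional[Union[str, List[str]]]) -> str:
    """Cut text at the earliest occurrence of any stop sequence (sequence excluded)."""
    cut = len(text)
    for sequence in normalize_stop(stop):
        position = text.find(sequence)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]
```

For example, `truncate_at_stop("Hello\nWorld END more", ["\n", "END"])` cuts at the newline because it occurs before `END`.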

max_tokens
number | null

Maximum number of tokens to generate in the response.

presence_penalty
number | null

Penalty factor for new tokens based on their presence in existing text. Range: -2.0 to 2.0.

frequency_penalty
number | null

Penalty factor for new tokens based on their frequency in existing text. Range: -2.0 to 2.0.
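
The documented ranges for `temperature`, `presence_penalty`, and `frequency_penalty` can be checked on the client before sending a request. The sketch below is one way to do so; the helper name and error wording are hypothetical, and whether the server itself rejects out-of-range values is not stated in this reference. Note that the placeholder value 123 used in the sample request would fail these checks.

```python
from typing import Any, Dict, List

# Ranges as documented in this reference (inclusive bounds assumed).
DOCUMENTED_RANGES = {
    "temperature": (0.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def validate_sampling_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable range violations (empty if none)."""
    errors = []
    for name, (low, high) in DOCUMENTED_RANGES.items():
        value = params.get(name)
        if value is None:
            continue  # the fields are nullable, so missing or null is allowed
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside [{low}, {high}]")
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every violation at once.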

logit_bias
object | null

Modify the likelihood of specified tokens appearing in the completion.

user
string | null

Unique identifier for the end-user making the request, for monitoring and detecting abuse.

Response

Chat completions generated successfully.

id
string
required

Unique identifier of the response.

object
string
required

Type of the object, e.g., 'chat.completion'.

created
number
required

Timestamp of when the response was created.

model
string
required

The model used to generate the response.

choices
object[]
required

Array of choices returned by the model.

usage
object
required

Details about token usage.

service_tier
string
required

Service tier used for the request.

system_fingerprint
string | null
required

System fingerprint, if available.
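
As a minimal sketch of reading the response fields described above, the helper below parses a response body and pulls out the first choice's text, its finish reason, and the total token count. The helper name is hypothetical, the nested field names follow the sample response at the top of this page, and the example values are invented for illustration.

```python
import json
from typing import Any, Dict


def summarize_response(raw: str) -> Dict[str, Any]:
    """Extract commonly used fields from a chat completions response body."""
    data = json.loads(raw)
    first_choice = data["choices"][0]
    return {
        "content": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


# Illustrative body shaped like the sample response; the values are made up.
example_body = json.dumps({
    "id": "example-id",
    "object": "chat.completion",
    "created": 0,
    "model": "example-model",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello!"},
        "logprobs": None,
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
    "service_tier": "default",
    "system_fingerprint": None,
})
```

When `n` is greater than 1, `choices` holds several alternatives, so production code would iterate over the array instead of reading only the first entry.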