Truefoundry Docs

Chat Completions

curl --request POST \
  --url https://{controlPlaneURL}/api/llm/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "messages": [
    {
      "role": "system",
      "content": "<string>",
      "name": "<string>"
    }
  ],
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "none",
  "temperature": 0.7,
  "top_p": 1,
  "top_k": 40,
  "n": 1,
  "stream": true,
  "logprobs": true,
  "stop": "<string>",
  "max_tokens": 256,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logit_bias": {},
  "user": "<string>"
}'

{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>"
      },
      "logprobs": "<any>",
      "finish_reason": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "prompt_tokens_details": {
      "cached_tokens": 123,
      "audio_tokens": 123
    },
    "completion_tokens_details": {
      "reasoning_tokens": 123,
      "audio_tokens": 123,
      "accepted_prediction_tokens": 123,
      "rejected_prediction_tokens": 123
    }
  },
  "service_tier": "<string>",
  "system_fingerprint": "<string>"
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-tfy-metadata

string

Optional metadata for the request
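
The Authorization and x-tfy-metadata headers described above can be assembled as in the following minimal Python sketch. It only builds the request object and does not send it. The host, token, model name, and metadata keys are placeholders, and encoding the metadata as a JSON string is an assumption, not something this page specifies.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical placeholders: substitute your own control plane host and token.
CONTROL_PLANE_URL = "example.truefoundry.invalid"
TOKEN = "my-token"


def build_request(payload: dict, metadata: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    headers = {
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    }
    if metadata is not None:
        # Assumption: the metadata header carries a JSON-encoded string.
        headers["x-tfy-metadata"] = json.dumps(metadata)
    return urllib.request.Request(
        url=f"https://{CONTROL_PLANE_URL}/api/llm/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_request(
    {"model": "my-model", "messages": [{"role": "user", "content": "Hi"}]},
    metadata={"team": "docs"},
)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.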

Body

application/json

Parameters for generating chat completions.

model

string

required

Identifier for the language model to be used for generation.

messages

required

Conversation history as an array of messages for contextual generation.

System Message
User Message
Assistant Message
Function Message
Tool Message
Developer Message
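
As an illustration of how a conversation history might be built up turn by turn, here is a short Python sketch. The `message` helper is hypothetical, and the role strings "user" and "assistant" are assumed from the message types listed above; only "system" appears literally in the request sample.

```python
from typing import Optional


def message(role: str, content: str, name: Optional[str] = None) -> dict:
    """Return one entry for the `messages` array (hypothetical helper)."""
    msg = {"role": role, "content": content}
    if name is not None:
        # `name` is optional, so it is only included when provided.
        msg["name"] = name
    return msg


# Earlier turns come first, so the model sees the conversation in order.
history = [
    message("system", "You are a concise assistant."),
    message("user", "What is nucleus sampling?"),
]

# After a reply arrives, append it and the next user turn before calling again.
history.append(message("assistant", "It samples from the smallest set of likely tokens."))
history.append(message("user", "How does that differ from top-k?", name="alice"))
```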

tools

object[] | null

Array of tool definitions available for the model to use during generation.
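
A tool definition following the shape in the request sample could look like the Python sketch below. The tool name, its JSON Schema parameters, and the model name are invented for illustration, and the value "function" for `type` is an assumption based on common OpenAI-style APIs rather than something this page states.

```python
import json

# Hypothetical tool: names and schema are illustrative only.
get_weather_tool = {
    "type": "function",  # assumption: this page only documents the field as a string
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        # `parameters` is shown as an object in the sample; JSON Schema is assumed here.
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Is it raining in Paris?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

body = json.dumps(payload)
```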

tool_choice

Controls tool usage. Use "none" to disable tool calls, "auto" to let the model decide, "required" to force a tool call, or specify a particular tool to force its use.

Available options:

none, auto, required

temperature

number | null

Sampling temperature between 0 and 2. Higher values make output more random, lower values more deterministic.

top_p

number | null

Nucleus sampling parameter. Limits cumulative probability of tokens considered for sampling.

top_k

number | null

Top-k sampling parameter. Restricts sampling to the k most likely tokens.

n

number | null

Number of response alternatives to generate.

stream

boolean | null

Enable streaming of partial response chunks as they are generated.
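
This page does not describe the wire format of streamed chunks. The Python sketch below assumes the common OpenAI-style convention: server-sent event lines prefixed with `data:`, a final `data: [DONE]` sentinel, and text carried in `choices[].delta.content`. Treat all three as assumptions to verify against real responses; the sample lines are made up.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text fragments from an assumed OpenAI-style SSE stream."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Made-up sample lines in the assumed format.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = collect_stream_text(sample)
```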

logprobs

Whether to include token log probabilities. Use a boolean for Chat Completions; legacy Completions takes an integer instead (the number of top-token logprobs to return).

stop

Sequence(s) at which to stop generation. Can be a single string or array of strings.

max_tokens

number | null

Maximum number of tokens to generate in the response.

presence_penalty

number | null

Penalty factor for new tokens based on their presence in existing text. Range: -2.0 to 2.0

frequency_penalty

number | null

Penalty factor for new tokens based on their frequency in existing text. Range: -2.0 to 2.0
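
The ranges stated for temperature, presence_penalty, and frequency_penalty can be checked on the client before a request is sent. The following Python sketch is one way to do that; the `validate_sampling` helper is hypothetical, and whether the server itself rejects out-of-range values is not stated on this page.

```python
# Documented ranges: temperature 0 to 2, both penalties -2.0 to 2.0.
RANGES = {
    "temperature": (0.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def validate_sampling(params: dict) -> list:
    """Return a list of human-readable range violations (empty if none)."""
    errors = []
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is None:
            continue  # these fields are nullable, so missing or null is fine
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside [{low}, {high}]")
    return errors


ok = validate_sampling({"temperature": 0.7, "presence_penalty": None})
bad = validate_sampling({"temperature": 123, "frequency_penalty": -3})
```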

logit_bias

object | null

Modify the likelihood of specified tokens appearing in the completion.

user

string | null

Unique identifier for the end-user making the request, for monitoring and detecting abuse.

Response

Chat completions generated successfully.

id

string

required

ID of the response.

object

string

required

Type of the object, e.g., 'chat.completion'.

created

number

required

Timestamp of when the response was created.

model

string

required

The model used to generate the response.

choices

object[]

required

Array of choices returned by the model.

usage

object

required

Details about token usage.

service_tier

string

required

Service tier used for the request.

system_fingerprint

string | null

required

System fingerprint, if available.
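
To show how the response fields above fit together, this Python sketch pulls the first choice and the token counts out of a decoded response. The `summarize` helper is hypothetical and the sample values are invented; only the field names come from the response schema on this page.

```python
import json


def summarize(response: dict) -> dict:
    """Extract the first choice's text and basic token usage from a response."""
    first = response["choices"][0]
    usage = response["usage"]
    # The details object may be missing or null, so fall back to an empty dict.
    prompt_details = usage.get("prompt_tokens_details") or {}
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": usage["total_tokens"],
        "cached_tokens": prompt_details.get("cached_tokens", 0),
    }


# Invented sample body using the documented field names.
raw = json.dumps({
    "id": "resp-1",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "my-model",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello!"},
        "logprobs": None,
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
    "service_tier": "default",
    "system_fingerprint": None,
})

summary = summarize(json.loads(raw))
```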
