Authorizations
Bearer authentication header of the form Bearer <token>
, where <token>
is your auth token.
Headers
Optional metadata for the request
Body
Parameters for generating chat completions.
Identifier for the language model to be used for generation.
Conversation history as an array of messages for contextual generation.
Array of tool definitions available for the model to use during generation.
Controls tool usage behavior. Use "none" to disable, "auto" for model choice, "required" to force tool use, or specify tool for forced usage.
none
Sampling temperature between 0 and 2. Higher values make output more random, lower values more deterministic.
Nucleus sampling parameter. Limits cumulative probability of tokens considered for sampling.
Top-k sampling parameter
Number of response alternatives to generate.
Enable streaming of partial response chunks as they are generated.
Include log probabilities of tokens: set to boolean for Chat Completions; integer for legacy Completions (number of top-token logprobs).
Sequence(s) at which to stop generation. Can be a single string or array of strings.
Maximum number of tokens to generate in the response.
Penalty factor for new tokens based on their presence in existing text. Range: -2.0 to 2.0
Penalty factor for new tokens based on their frequency in existing text. Range: -2.0 to 2.0
Modify the likelihood of specified tokens appearing in the completion.
Unique identifier for the end-user making the request, for monitoring and detecting abuse.
Response
Chat completions generated successfully.
Id of the response.
Type of the object, e.g., 'chat.completion'.
Timestamp of when the response was created.
The model used to generate the response.
Array of choices returned by the model.
Details about token usage.
Service tier used for the request.
System fingerprint, if available.