Model Responses
Generate model responses using the specified model.
POST /responses
Example request:

curl --request POST \
  --url https://{controlPlaneURL}/api/llm/api/inference/openai/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<string>",
    "messages": [
      {
        "role": "system",
        "content": "<string>"
      }
    ],
    "tools": [
      {
        "type": "<string>",
        "function": {
          "name": "<string>",
          "description": "<string>",
          "parameters": {}
        }
      }
    ],
    "tool_choice": "none",
    "temperature": 123,
    "top_p": 123,
    "n": 123,
    "stream": true,
    "logprobs": 123,
    "stop": "<string>",
    "max_tokens": 123,
    "presence_penalty": 123,
    "frequency_penalty": 123,
    "logit_bias": {},
    "user": "<string>"
  }'
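Filled in with concrete values, a tools entry from the request schema above might look like the following sketch. The function name, description, and JSON-Schema parameters are illustrative only, not part of the API; the structure (type, function.name, function.description, function.parameters) comes from the schema above.

```python
# Sketch of a concrete "tools" entry for the request body above.
# The function name and parameter schema are illustrative only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# "tool_choice": "none" disables tool calls entirely; to let the model
# decide when to call a tool, pass "auto" instead.
request_body = {
    "model": "<string>",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",
}
```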
Example response:

{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "status": "<string>",
  "error": "<any>",
  "incomplete_details": "<any>",
  "instructions": "<any>",
  "max_output_tokens": 123,
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "type": "<string>",
      "status": "<string>",
      "content": [
        {
          "type": "<string>",
          "annotations": [
            "<any>"
          ],
          "text": "<string>"
        }
      ],
      "role": "<string>"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<any>",
    "summary": "<any>"
  },
  "service_tier": "<string>",
  "store": true,
  "temperature": 123,
  "text": {
    "format": {
      "type": "<string>"
    }
  },
  "tool_choice": "<string>",
  "tools": [
    "<any>"
  ],
  "top_p": 123,
  "truncation": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "total_tokens": 123
  },
  "user": "<any>",
  "metadata": {},
  "provider": "<string>"
}
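The usage block in the response is what you would read for token accounting. Assuming the fields follow the usual OpenAI-style convention (not stated explicitly on this page), total_tokens is the sum of input and output tokens, and the details fields (cached_tokens, reasoning_tokens) are subsets of those counts rather than additional tokens:

```python
# Hypothetical usage block; the values are made up, and the
# sum relationship is an assumed convention, not documented here.
usage = {
    "input_tokens": 120,
    "input_tokens_details": {"cached_tokens": 20},
    "output_tokens": 80,
    "output_tokens_details": {"reasoning_tokens": 30},
    "total_tokens": 200,
}

# total = input + output; the *_details fields break down those
# totals (e.g. 20 of the 120 input tokens were served from cache).
consumed = usage["input_tokens"] + usage["output_tokens"]
```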
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
Parameters for generating model responses. The body is of type object.
Response
200 - application/json
Model Response generated successfully. The response is of type object.