TrueFoundry AI Gateway provides a wide range of features to help you manage access to LLMs and use them in your applications. Here are some of the key features:

Unified API Interface

Single API interface to access multiple LLM providers with unified endpoint
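As a sketch of what a unified interface means in practice, the snippet below builds the same OpenAI-style chat-completions request body regardless of which provider serves the model. The gateway URL and model IDs are placeholders, not actual TrueFoundry endpoints; substitute the values from your own gateway configuration.

```python
# Sketch: one request shape for every provider behind the gateway.
# GATEWAY_URL and the model names below are hypothetical placeholders.
import json
from urllib import request

GATEWAY_URL = "https://your-gateway.example.com/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build the same JSON body regardless of the underlying provider."""
    return json.dumps({
        "model": model,  # e.g. "openai/gpt-4o" or "anthropic/claude-3-5-sonnet"
        "messages": [{"role": "user", "content": prompt}],
    }).encode()

def chat(model: str, prompt: str, api_key: str) -> dict:
    """POST the request to the gateway (requires a reachable gateway)."""
    req = request.Request(
        GATEWAY_URL,
        data=build_chat_request(model, prompt),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# The identical call shape works for any provider the gateway exposes:
payload = json.loads(build_chat_request("openai/gpt-4o", "Hello"))
```

Switching providers is then a one-line change to the `model` string; the request and response shapes stay the same.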

Rate Limiting

Control model usage with flexible rate limiting policies per user, model, or application

Budget Limiting

Control spending and enforce cost limits for users, teams, and models

Multimodal Inputs

Support for text, image, and audio inputs across compatible models

Fallback

Automatic failover to backup models when primary models are unavailable

Load Balancing

Distribute requests across multiple model instances for optimal performance

API Key Management

Generate and manage API keys for users/applications

Guardrails

Content filtering and safety checks to keep model inputs and outputs safe and compliant

Observability & Metrics

Comprehensive monitoring, logging, and analytics for all API requests

Access Control

Fine-grained access control and permissions management

Custom Metadata Routing

Route requests based on custom metadata and business logic

Latency-Based Load Balancing

Intelligent routing based on real-time latency metrics
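The idea behind latency-based routing can be illustrated with a small sketch: track an exponentially weighted moving average (EWMA) of each target's observed latency and route to the fastest one. The target names here are placeholders, and this is a concept illustration, not the gateway's actual routing implementation.

```python
# Concept sketch of latency-aware routing via an EWMA per target.
# Target names are illustrative, not gateway configuration.
class LatencyRouter:
    def __init__(self, targets, alpha=0.3):
        self.alpha = alpha  # weight given to the newest observation
        self.ewma = {t: 0.0 for t in targets}

    def record(self, target, latency_ms):
        """Fold a new latency sample into the target's moving average."""
        prev = self.ewma[target]
        self.ewma[target] = (
            latency_ms if prev == 0
            else self.alpha * latency_ms + (1 - self.alpha) * prev
        )

    def pick(self):
        """Route the next request to the currently fastest target."""
        return min(self.ewma, key=self.ewma.get)

router = LatencyRouter(["gpt-4o-east", "gpt-4o-west"])
router.record("gpt-4o-east", 120)
router.record("gpt-4o-west", 80)
best = router.pick()
```

The EWMA keeps the router responsive to recent latency shifts while smoothing out single-request noise.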

Prompt Management

Centralized prompt versioning and management system

Tracing

Distributed tracing for debugging and performance optimization

Responses API

Advanced response handling and transformation capabilities

Batch Predictions

Process multiple requests efficiently with batch processing

PII Detection

Automatically detect and filter personally identifiable information
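To illustrate the concept only (the gateway's own detection is configured server-side and is more sophisticated), a minimal regex-based redactor might look like this. The patterns and labels are illustrative.

```python
# Concept-only sketch of PII redaction with regexes; this is not the
# gateway's implementation, just an illustration of detect-and-filter.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a bracketed type label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = redact("Reach me at jane@example.com")
```

Running requests through such a filter before they reach a provider keeps sensitive identifiers out of model prompts and logs.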

Real-time API

Low-latency streaming responses for real-time applications
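A streamed chat response typically arrives as OpenAI-style server-sent events, one `data: {...}` line per token delta. The sketch below shows how a client might assemble the full text from such a stream; the event payloads here are fabricated for illustration, assuming the gateway follows the OpenAI streaming format.

```python
# Sketch: assembling text from OpenAI-style SSE stream events.
# The event lines below are fabricated examples, not real gateway output.
import json

def extract_deltas(sse_lines):
    """Yield incremental text content from 'data: {...}' event lines."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data: ") or line == "data: [DONE]":
            continue  # skip non-data lines and the end-of-stream marker
        chunk = json.loads(line[len("data: "):])
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            yield delta

events = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(extract_deltas(events))
```

Because each delta can be rendered as soon as it arrives, users see output immediately instead of waiting for the full completion.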

Export Logs & Traces

Export request logs and traces for external analysis and compliance

Self-Hosted Models

Deploy and manage your own custom models alongside public providers

MCP Servers