Deploying NVIDIA NIM Models
Deploy optimized TensorRT-LLM Engines using NIM Containers
Supported Model Types
Currently, we list NIM models of the following types:
- Large Language Models (LLMs)
- Vision Language Models (VLMs)
- Embedding Models
- Reranking Models
Adding nvcr.io Docker Registry
- Generate an API Key from https://org.ngc.nvidia.com/setup/api-keys. Make sure to give it access to NGC Catalog.
- Add a Custom Docker Registry to the Platform with the following details (an optional credential check is sketched after this list):
  - Registry URL: nvcr.io
  - Username: $oauthtoken
  - Password: The API Key from the previous step
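If you want to sanity-check the credentials before adding them to the Platform, you can try a `docker login` against nvcr.io locally. This is a minimal sketch, assuming Docker is installed and the API Key is exported as the `NGC_API_KEY` environment variable on your machine:

```python
import os
import subprocess

# Read the NGC API Key from the environment (assumes you exported it beforehand).
api_key = os.environ["NGC_API_KEY"]

# Log in to nvcr.io using the literal username "$oauthtoken" and the API Key as the password.
result = subprocess.run(
    ["docker", "login", "nvcr.io", "--username", "$oauthtoken", "--password-stdin"],
    input=api_key,
    text=True,
    capture_output=True,
)

print(result.stdout or result.stderr)
```

A "Login Succeeded" message indicates the same Registry URL, Username, and Password will work when configured on the Platform.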
Adding NGC API Key to Secrets
- Add the same API Key as a Secret on the Platform. We are calling the secret NGC_API_KEY.
Deploying a NIM Model
- From the New Deployment page, select NVIDIA NIM.
- Select the workspace you want to deploy to
- Select the NVCR Model Registry Integration we created in the previous step
- Select the NGC API Key Secret we created in the previous step
- Select the model you want to deploy
- Click Next. You will be presented with optimized profiles (for latency or throughput) across different precision and GPU options for which TRT-LLM engines are prebuilt and available. You can select any of the profiles and Continue to Deployment.
(Optional) Caching NIM Model to External Volume
Recommended for Large Models and Production Environments
To avoid re-downloading the model on every restart, you can create a Volume and mount it at /opt/nim/.cache.
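Once the Volume is mounted, you can confirm that engine and model files are actually being reused across restarts by inspecting the cache from a shell inside the running container. This is a minimal sketch; the exact directory layout under /opt/nim/.cache depends on the model and NIM version:

```python
import os

# Walk the NIM cache directory and report its total size.
# Run this from inside the NIM container after the first successful startup.
cache_dir = "/opt/nim/.cache"

total_bytes = 0
for root, _dirs, files in os.walk(cache_dir):
    for name in files:
        total_bytes += os.path.getsize(os.path.join(root, name))

print(f"{cache_dir}: {total_bytes / 1e9:.2f} GB cached")
```

If the cache is non-empty after a restart, subsequent startups should skip the download step.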
Running Inferences
You can now run inferences via the OpenAPI tab. You can also add the model to the LLM Gateway using the button at the top.
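NIM LLM containers expose an OpenAI-compatible API, so you can also call the deployment programmatically. The snippet below is a minimal sketch; the endpoint URL, model name, and any auth header are placeholders that you should replace with the values shown on your deployment's OpenAPI tab:

```python
import requests

# Placeholder values: copy the real endpoint and model name from the OpenAPI tab.
ENDPOINT = "https://<your-deployment-endpoint>/v1/chat/completions"
MODEL = "<model-name>"

response = requests.post(
    ENDPOINT,
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
        "max_tokens": 128,
    },
    # Add an Authorization header here if your deployment requires one.
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```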