Deploy optimized TensorRT-LLM Engines using NIM Containers
NIM containers are published on nvcr.io, the Docker registry behind the NGC Catalog. To pull them, log in to nvcr.io with the username $oauthtoken and your NGC API key (the NGC_API_KEY value) as the password.
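A minimal login sketch, assuming the key is exported in the NGC_API_KEY environment variable mentioned above:

```bash
# Export your NGC API key (placeholder value shown)
export NGC_API_KEY="<your-ngc-api-key>"

# Log in to nvcr.io; the username is the literal string "$oauthtoken",
# so single quotes prevent the shell from expanding it
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```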
On the New Deployment page, select NVIDIA NIM.
Each NIM exposes a set of model profiles (optimized for latency or throughput) for different precision and GPU options for which TRT-LLM engines are prebuilt and available. You can select any of these profiles and click Continue to Deployment.
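If you are working with the container directly rather than through the deployment UI, the NIM for LLMs image includes a list-model-profiles utility that prints the profiles compatible with the detected GPUs; a sketch, with an illustrative image name:

```bash
# Illustrative image name; substitute the NIM you are deploying
IMG=nvcr.io/nim/meta/llama3-8b-instruct:latest

# List the prebuilt TRT-LLM engine profiles (latency/throughput, precision, GPU)
# that this NIM can run on the local hardware
docker run --rm --gpus all -e NGC_API_KEY "$IMG" list-model-profiles
```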
The container downloads the selected engine into /opt/nim/.cache; mount persistent storage at this path so the prebuilt TRT-LLM engine does not have to be downloaded again on every restart.
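A minimal run sketch, assuming a host directory backs the cache and using the optional NIM_MODEL_PROFILE variable to pin the profile chosen above; the image name and profile ID are placeholders:

```bash
# Host directory that will back the engine cache inside the container
export LOCAL_NIM_CACHE="$HOME/.cache/nim"
mkdir -p "$LOCAL_NIM_CACHE"

# NIM_MODEL_PROFILE is optional; it pins one of the profiles listed earlier.
# The image name is illustrative; substitute the NIM you selected.
docker run -d --gpus all \
  -e NGC_API_KEY \
  -e NIM_MODEL_PROFILE="<profile-id>" \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:latest
```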