What you'll learn
- Deploying LLM as a Service from Model Catalogue
This is a guide to deploying an LLM (Large Language Model) from TrueFoundry's Model Catalogue.
TrueFoundry's Model Catalogue can be found under the Models tab. The model catalogue will look like this:
Here, you can choose a model based on the task you want to perform. Click the Deploy button to configure the deployment of the model of your choice.
Where would you like to deploy?: Enter the name of the workspace in which you want to deploy the model.
Preconfigured Deployment Options: Here, you will find deployment options that use GPUs as well as CPU-only options. The name of each configuration indicates which GPU (if any) it uses. Choose one according to your requirements.
Click on Next Step when done configuring the workspace and deployment options.
Endpoints, environment variables, and resources (CPU, memory, storage, GPUs) are pre-populated based on the workspace and deployment options you selected.
Click on Submit to deploy the model. You can view the model's Logs, Events, and Metrics to track the deployment's progress.
The deployed model will look like this:
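Once the deployment is live, you can send requests to its endpoint. As a rough sketch, the snippet below calls a deployed model over HTTP. The endpoint URL, the OpenAI-compatible request shape, the model name, and the `TFY_API_KEY` environment variable are all assumptions for illustration; substitute the actual endpoint and authentication details shown on your deployment's page.

```python
import os

# Hypothetical endpoint -- replace with the endpoint shown on your
# deployment's details page after it goes live.
ENDPOINT = "https://your-model.your-workspace.example.com/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "deployed-model"):
    """Assemble an OpenAI-compatible chat-completion request.

    Whether your deployed model serves this exact schema depends on the
    model server chosen in the catalogue; check your deployment's docs.
    """
    headers = {
        "Content-Type": "application/json",
        # Pass an API key if the endpoint requires authentication
        # (assumed to be read from the TFY_API_KEY environment variable).
        "Authorization": f"Bearer {os.environ.get('TFY_API_KEY', '')}",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return ENDPOINT, headers, body


if __name__ == "__main__":
    import requests

    url, headers, body = build_chat_request("Hello! What can you do?")
    response = requests.post(url, headers=headers, json=body, timeout=60)
    print(response.json())
```

If your deployment uses a different serving framework, only the request body and path change; the pattern of endpoint + auth header stays the same.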