Model Deployment

📘 Note

When using the TrueFoundry Python SDK, type is not a required field in any of the imported classes.

ModelDeployment

Description

Describes the configuration for the model deployment

Schema

{
  "type": "string",
  "name": "string",
  "model_source": {},
  "env": null,
  "endpoint": {
    "host": "string",
    "path": "string"
  },
  "grpc": false,
  "resources": {
    "cpu_request": 0.2,
    "cpu_limit": 0.5,
    "memory_request": 200,
    "memory_limit": 500,
    "ephemeral_storage_request": 1000,
    "ephemeral_storage_limit": 2000,
    "gpu_count": 0,
    "shared_memory_size": 64,
    "node": {}
  },
  "replicas": 1,
  "mounts": [
    {}
  ],
  "labels": {
    "property1": "string",
    "property2": "string"
  }
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | true | +value=model-deployment |
| name | string | true | Name of the model deployment. This uniquely identifies the deployment in the workspace. Can only contain alphanumeric characters and '-', and can be at most 20 characters long. |
| model_source | object | true | Specify the model to be deployed |
| env | object¦null | false | Configure environment variables to be injected into the service |
| endpoint | Endpoint | false | Describes an endpoint configuration |
| grpc | boolean | true | Use gRPC |
| resources | Resources | false | Describes the resource constraints for the application so that it can be deployed accordingly on the cluster |
| replicas | integer | true | Number of replicas of the service you want to run |
| mounts | [object] | false | Configure data to be mounted to the model pod(s) |
| labels | object | false | Add labels to the service metadata |
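Putting the fields above together, a complete ModelDeployment spec might look like the following sketch. The model version FQN, hostname, and label values are illustrative placeholders, not real resources:

```json
{
  "type": "model-deployment",
  "name": "sentiment-model",
  "model_source": {
    "type": "tfy-model-registry",
    "model_version_fqn": "model:example-org/example-ws/sentiment-model:1"
  },
  "endpoint": {
    "host": "ai.example.com",
    "path": "/v1/api/ml/"
  },
  "grpc": false,
  "resources": {
    "cpu_request": 0.2,
    "cpu_limit": 0.5,
    "memory_request": 200,
    "memory_limit": 500
  },
  "replicas": 1,
  "labels": {
    "team": "ml-platform"
  }
}
```

Note that the name above satisfies the constraint from the table: only alphanumeric characters and '-', and at most 20 characters.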

Endpoint

Description

Describes an endpoint configuration

Schema

{
  "host": "string",
  "path": "string"
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| host | string | true | Host, e.g. ai.example.com, app.truefoundry.com |
| path | string | false | Path, e.g. /v1/api/ml/, /v2/docs/ |
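For instance, an endpoint that exposes the model at ai.example.com/v1/api/ml/ (combining the example values from the table above; the hostname is a placeholder) would be:

```json
{
  "host": "ai.example.com",
  "path": "/v1/api/ml/"
}
```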

Model Source

The modules below help define the model source for the ModelServer

TruefoundryModelRegistry

Description

Truefoundry model registry (deploy a model from TFY model registry)

Schema

{
  "type": "string",
  "model_version_fqn": "string"
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | true | +value=tfy-model-registry |
| model_version_fqn | string | true | The FQN of the model version that you want to deploy |
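A model_source block referencing a registry model version might look like the sketch below. The FQN shown is a made-up placeholder; use the actual FQN displayed for your model version in the TrueFoundry model registry:

```json
{
  "type": "tfy-model-registry",
  "model_version_fqn": "model:example-org/example-ws/my-model:1"
}
```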

HuggingfaceModelHub

Description

Huggingface model hub (deploy a model from HF model hub)

Schema

{
  "type": "string",
  "model_id": "string",
  "pipeline": "string",
  "model_library": "transformers"
}

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | true | +value=hf-model-hub |
| model_id | string | true | The name of the model that you want to deploy, e.g. t5-small, philschmid/bart-large-cnn-samsum |
| pipeline | string | false | Pipeline to use for inference, e.g. summarization, text-generation, text-classification. If not set, we will try to infer the task name for the given model. You can read more about HF pipelines at https://huggingface.co/docs/transformers/pipeline_tutorial and find a list of tasks at https://huggingface.co/docs/transformers/v4.24.0/en/main_classes/pipelines#transformers.pipeline.task. Note: with model_library=sentence-transformers, only the feature-extraction pipeline is supported. |
| model_library | string | true | Which library to use to load the model. Must be either transformers or sentence-transformers. |

Enumerated Values

| Property | Value |
| --- | --- |
| model_library | transformers |
| model_library | sentence-transformers |
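Putting it together, a model_source that deploys t5-small from the Hugging Face model hub with the summarization pipeline (both values taken from the examples above) could look like:

```json
{
  "type": "hf-model-hub",
  "model_id": "t5-small",
  "pipeline": "summarization",
  "model_library": "transformers"
}
```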