Model Deployment
Note
When using the ServiceFoundry Python SDK, type is not a required field in any of the imported classes.
ModelDeployment
Description
Describes the configuration for the model deployment
Schema
{
"type": "string",
"name": "string",
"model_source": {},
"env": null,
"endpoint": {
"host": "string",
"path": "string"
},
"grpc": false,
"resources": {
"cpu_request": 0.2,
"cpu_limit": 0.5,
"memory_request": 200,
"memory_limit": 500,
"ephemeral_storage_request": 1000,
"ephemeral_storage_limit": 2000,
"gpu_count": 0,
"shared_memory_size": 64,
"node": {}
},
"replicas": 1,
"mounts": [
{}
],
"labels": {
"property1": "string",
"property2": "string"
}
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | +value=model-deployment |
name | string | true | Name of the model deployment. This uniquely identifies this deployment in the workspace. The name can only contain alphanumeric characters and '-' and can be at most 20 characters long |
model_source | object | true | Specify the model to be deployed |
env | object¦null | false | Configure environment variables to be injected into the service |
endpoint | Endpoint | false | Describes an endpoint configuration |
grpc | boolean | true | Whether to serve the model over gRPC |
resources | Resources | false | Describes the resource constraints for the application so that it can be deployed accordingly on the cluster |
replicas | integer | true | Number of replicas of the service to run |
mounts | [object] | false | Configure data to be mounted to model pod(s) |
labels | object | false | Add labels to service metadata |
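As a sketch of how the schema above fits together, the full spec can be assembled as a plain Python dict. All field values here (name, FQN, host, labels) are illustrative placeholders, not real resources:

```python
# A minimal model-deployment spec mirroring the schema above.
# Values such as the name, FQN, and host are hypothetical examples.
model_deployment = {
    "type": "model-deployment",
    "name": "my-model",  # alphanumeric and '-', at most 20 characters
    "model_source": {
        "type": "tfy-model-registry",
        "model_version_fqn": "model:my-org/my-ws/my-model:1",  # hypothetical FQN
    },
    "endpoint": {
        "host": "ai.example.com",
        "path": "/v1/api/ml/",
    },
    "grpc": False,
    "resources": {
        "cpu_request": 0.2,
        "cpu_limit": 0.5,
        "memory_request": 200,
        "memory_limit": 500,
    },
    "replicas": 1,
    "labels": {"team": "ml-platform"},  # hypothetical label
}

# Sanity-check the required fields listed in the table above.
required = {"type", "name", "model_source", "grpc", "replicas"}
assert required <= model_deployment.keys()
assert len(model_deployment["name"]) <= 20
```

The same structure maps onto the SDK classes (ModelDeployment, Endpoint, Resources), where, per the note at the top, the type field can be omitted.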
Endpoint
Description
Describes an endpoint configuration
Schema
{
"host": "string",
"path": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
host | string | true | Host e.g. ai.example.com, app.truefoundry.com |
path | string | false | Path e.g. /v1/api/ml/, /v2/docs/ |
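For illustration, an endpoint spec and the public URL it implies; the host and path values are hypothetical, and the https scheme is an assumption:

```python
# Endpoint spec mirroring the schema above; values are illustrative.
endpoint = {
    "host": "app.truefoundry.com",  # required
    "path": "/v1/api/ml/",          # optional
}

# The deployed model would be reachable at a URL like this
# (assuming the endpoint is served over https):
url = f"https://{endpoint['host']}{endpoint['path']}"
```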
Model Source
The modules below help define the model source for the ModelServer
TruefoundryModelRegistry
Description
Truefoundry model registry (deploy a model from TFY model registry)
Schema
{
"type": "string",
"model_version_fqn": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | +value=tfy-model-registry |
model_version_fqn | string | true | The FQN of the model version that you want to deploy. |
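As a sketch, this model source can be written as a dict following the schema above; the FQN shown is a hypothetical placeholder, not a real model version:

```python
# Model source referencing a version in the TrueFoundry model registry.
# The FQN below is a hypothetical placeholder.
registry_source = {
    "type": "tfy-model-registry",
    "model_version_fqn": "model:my-org/my-workspace/my-model:1",
}

# Both fields are required per the table above.
assert {"type", "model_version_fqn"} <= registry_source.keys()
```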
HuggingfaceModelHub
Description
Huggingface model hub (deploy a model from HF model hub)
Schema
{
"type": "string",
"model_id": "string",
"pipeline": "string",
"model_library": "transformers"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | +value=hf-model-hub |
model_id | string | true | The name of the model that you want to deploy. Examples: t5-small, philschmid/bart-large-cnn-samsum |
pipeline | string | false | Pipeline to use for inference. If not set, we will try to infer the task name for the given model. Examples: summarization, text-generation, text-classification, etc. You can read more about HF pipelines here: https://huggingface.co/docs/transformers/pipeline_tutorial and find a list of tasks here: https://huggingface.co/docs/transformers/v4.24.0/en/main_classes/pipelines#transformers.pipeline.task Note: with model_library=sentence-transformers, only the feature-extraction pipeline is supported |
model_library | string | true | Which library to use to load the model. Should be either transformers or sentence-transformers |
Enumerated Values
Property | Value |
---|---|
model_library | transformers |
model_library | sentence-transformers |
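A Hugging Face model hub source, following the schema above; the model_id and pipeline values are illustrative examples taken from the property table:

```python
# Hugging Face model hub source mirroring the schema above.
# model_id and pipeline are illustrative examples.
hf_source = {
    "type": "hf-model-hub",
    "model_id": "t5-small",
    "pipeline": "summarization",
    "model_library": "transformers",
}

# model_library must be one of the enumerated values above.
assert hf_source["model_library"] in {"transformers", "sentence-transformers"}
```

Note that with model_library set to sentence-transformers, only the feature-extraction pipeline is supported.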