Model Deployment
Note
While using the ServiceFoundry Python SDK, `type` is not a required field in any of the imported classes.
ModelDeployment
Description
Describes the configuration for the model deployment
Schema
{
  "type": "string",
  "name": "string",
  "model_source": {},
  "resources": {
    "cpu_request": 0.2,
    "cpu_limit": 0.5,
    "memory_request": 200,
    "memory_limit": 500,
    "ephemeral_storage_request": 1000,
    "ephemeral_storage_limit": 2000,
    "instance_family": [
      "string"
    ]
  },
  "grpc": false
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | none |
name | string | true | Name of the model deployment. This uniquely identifies this deployment in the workspace. The name can only contain alphanumeric characters and '-' and can be at most 20 characters long. |
model_source | object | true | Specifies the model to be deployed. |
resources | Resources | false | Describes the resource constraints for the application so that it can be deployed accordingly on the cluster. |
grpc | boolean | true | Whether to serve the model over gRPC (see the sketch after the Python examples below). |
Python Examples
from servicefoundry import (
    ModelDeployment, TruefoundryModelRegistry, Resources
)

model = ModelDeployment(
    name="my-service",
    model_source=TruefoundryModelRegistry(
        model_version_fqn="..."
    ),
    resources=Resources(
        cpu_request=1,
        memory_request=1000,  # in Megabytes
        ephemeral_storage_request=1000,  # in Megabytes
        cpu_limit=4,
        memory_limit=4000,
        ephemeral_storage_limit=10000,
        instance_family=["c6i", "t3", "m4"],
    ),
)

deployment = model.deploy(workspace_fqn='...')
from servicefoundry import (
    ModelDeployment, HuggingfaceModelHub, Resources
)

model = ModelDeployment(
    name="my-service",
    model_source=HuggingfaceModelHub(
        model_id="...",
        pipeline="..."
    ),
    resources=Resources(
        cpu_request=1,
        memory_request=1000,  # in Megabytes
        ephemeral_storage_request=1000,  # in Megabytes
        cpu_limit=4,
        memory_limit=4000,
        ephemeral_storage_limit=10000,
        instance_family=["c6i", "t3", "m4"],
    ),
)

deployment = model.deploy(workspace_fqn='...')
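The examples above leave grpc at its schema default of false. As a minimal sketch of the grpc flag from the schema above (the deployment name here is hypothetical and the FQN placeholders are left elided), gRPC serving can be enabled like this:

from servicefoundry import ModelDeployment, TruefoundryModelRegistry

# Minimal sketch: the `grpc` flag from the ModelDeployment schema enables
# gRPC serving (it defaults to false). The name below is hypothetical and
# the FQN placeholders are left elided.
model = ModelDeployment(
    name="my-grpc-service",
    model_source=TruefoundryModelRegistry(
        model_version_fqn="..."
    ),
    grpc=True,
)

deployment = model.deploy(workspace_fqn='...')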
Model Source
The modules below help define the model source for the ModelServer.
TruefoundryModelRegistry
Description
Truefoundry model registry (deploy a model from TFY model registry)
Schema
{
  "type": "string",
  "model_version_fqn": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | none |
model_version_fqn | string | true | The FQN of the model version that you want to deploy. |
Python Examples
from servicefoundry import (
    ModelDeployment, TruefoundryModelRegistry, Resources
)

model = ModelDeployment(
    ...
    model_source=TruefoundryModelRegistry(
        model_version_fqn="..."
    ),
    ...
)
HuggingfaceModelHub
Description
Huggingface model hub (deploy a model from HF model hub)
Schema
{
  "type": "string",
  "model_id": "string",
  "pipeline": "string"
}
Properties
Name | Type | Required | Description |
---|---|---|---|
type | string | true | none |
model_id | string | true | The name of the model that you want to deploy. Examples: t5-small, philschmid/bart-large-cnn-samsum. |
pipeline | string | false | Pipeline to use for inference. If not set, we will try to infer the task name for the given model. Examples: summarization, text-generation, text-classification, etc. You can read more about HF pipelines here: https://huggingface.co/docs/transformers/pipeline_tutorial and find a list of tasks here: https://huggingface.co/docs/transformers/v4.24.0/en/main_classes/pipelines#transformers.pipeline.task |
Python Examples
from servicefoundry import (
    ModelDeployment, HuggingfaceModelHub, Resources
)

model = ModelDeployment(
    ...
    model_source=HuggingfaceModelHub(
        model_id="...",
        pipeline="..."
    ),
    ...
)
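Putting the example values from the properties table together, here is an illustrative sketch that deploys t5-small with the summarization pipeline (the deployment name and resource values below are hypothetical):

from servicefoundry import (
    ModelDeployment, HuggingfaceModelHub, Resources
)

# Illustrative sketch: `t5-small` and `summarization` are the example
# values from the properties table above; the name and resource values
# are hypothetical.
model = ModelDeployment(
    name="t5-small-summarizer",
    model_source=HuggingfaceModelHub(
        model_id="t5-small",
        pipeline="summarization",
    ),
    resources=Resources(
        cpu_request=1,
        memory_request=2000,  # in Megabytes
    ),
)

deployment = model.deploy(workspace_fqn='...')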