NVIDIA MIG and NVIDIA Time-Slicing
Feature | Time-Slicing | MIG (Multi-Instance GPU) |
---|---|---|
GPU Support | Works on most GPUs. | Supported only on MIG-capable data center GPUs, such as the NVIDIA A100, A30, and H100. |
Isolation | No real isolation. User is responsible for memory management. Potential for crashes if one workload exceeds its allocated memory. | Strong Isolation. Compute and memory are isolated between instances. Guaranteed resource allocation. |
Resource Allocation | Divides GPU into fractional parts (e.g., 0.3, 0.5, 0.2). Workloads can use these fractional parts. | Divides GPU into pre-defined, discrete instance types (as per NVIDIA’s configurations). Workloads are assigned entire instances. |
VRAM Management | User-managed. VRAM allocation is not enforced by the hardware. | Hardware-enforced. Each instance has dedicated VRAM. |
Compute Sharing | Compute is shared via context-switching. Workloads can potentially use the entire GPU when others are idle. | Compute is partitioned and isolated to each instance. No sharing of compute resources beyond the instance’s allocation. |
Flexibility | More flexible in terms of resource allocation fractions (e.g., can request 0.3, 0.5, etc.). | Limited to NVIDIA’s pre-defined instance types. Less flexible in terms of fine-grained resource requests. |
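
Because time-slicing leaves VRAM limits to the user, it can help to work out each workload's memory budget before deploying. The sketch below assumes a budget proportional to the requested fraction, which this page does not state. The 40 GB total and the function name `vram_budgets_gb` are illustrative only.

```python
from fractions import Fraction


def vram_budgets_gb(total_vram_gb, requested_fractions):
    """Return a per-workload VRAM budget in GB for time-sliced fractions.

    Assumption (not from the source): each workload's budget is
    proportional to its requested GPU fraction.
    Raises ValueError if the fractions add up to more than one GPU.
    """
    # Parse decimal strings exactly so "0.3" + "0.5" + "0.2" sums to exactly 1.
    parsed = [Fraction(f) for f in requested_fractions]
    if sum(parsed) > 1:
        raise ValueError("requested fractions exceed one GPU")
    return [float(f * total_vram_gb) for f in parsed]


# Fractions from the table's example on a hypothetical 40 GB GPU.
print(vram_budgets_gb(40, ["0.3", "0.5", "0.2"]))  # [12.0, 20.0, 8.0]
```

Nothing in time-slicing enforces these numbers. Each workload still has to cap its own memory use, for example through its framework's memory settings.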
Create a Nodepool with MIG enabled
GPU | GPU Compute Fraction / Instance | Number of instances per GPU | GPU Memory / Instance | Configuration Name | GPU Instance Profile (for Azure) |
---|---|---|---|---|---|
A100 (40GB) | 1/7 | 7 | 5GB | 1g.5gb | MIG1g |
A100 (40GB) | 2/7 | 3 | 10GB | 2g.10gb | MIG2g |
A100 (40GB) | 3/7 | 2 | 20GB | 3g.20gb | MIG3g |
A100 (80GB) | 1/7 | 7 | 10GB | 1g.10gb | MIG1g |
A100 (80GB) | 2/7 | 3 | 20GB | 2g.20gb | MIG2g |
A100 (80GB) | 3/7 | 2 | 40GB | 3g.40gb | MIG3g |
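
As a rough aid for reading the table, the sketch below encodes its rows and picks the smallest listed instance that satisfies a memory requirement. The helper `smallest_profile` and the `MigProfile` type are hypothetical names introduced for illustration. Only the rows shown in the table are included.

```python
from typing import NamedTuple, Optional


class MigProfile(NamedTuple):
    gpu: str                # GPU model, as written in the table
    compute_sevenths: int   # numerator of the compute fraction (n/7)
    instances_per_gpu: int
    memory_gb: int          # GPU memory per instance
    name: str               # MIG configuration name
    azure_profile: str      # GPU Instance Profile (for Azure)


# Rows copied from the table above.
PROFILES = [
    MigProfile("A100 (40GB)", 1, 7, 5, "1g.5gb", "MIG1g"),
    MigProfile("A100 (40GB)", 2, 3, 10, "2g.10gb", "MIG2g"),
    MigProfile("A100 (40GB)", 3, 2, 20, "3g.20gb", "MIG3g"),
    MigProfile("A100 (80GB)", 1, 7, 10, "1g.10gb", "MIG1g"),
    MigProfile("A100 (80GB)", 2, 3, 20, "2g.20gb", "MIG2g"),
    MigProfile("A100 (80GB)", 3, 2, 40, "3g.40gb", "MIG3g"),
]


def smallest_profile(gpu: str, needed_memory_gb: float) -> Optional[MigProfile]:
    """Return the listed profile with the least memory that still fits, or None."""
    candidates = [
        p for p in PROFILES
        if p.gpu == gpu and p.memory_gb >= needed_memory_gb
    ]
    return min(candidates, key=lambda p: p.memory_gb, default=None)


choice = smallest_profile("A100 (80GB)", 16)
print(choice.name, choice.azure_profile)  # 2g.20gb MIG2g
```

Picking the smallest instance that fits leaves more instances per GPU for other workloads, since MIG assigns whole instances rather than fractions.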
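Create the nodepool using the `--gpu-instance-profile` argument of Azure CLI, set to a GPU Instance Profile from the table above.
Deploy your workload on the MIG nodepool
Configure the GPU request in the `Resources` section in deployment.
Figure: Using MIG GPU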
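
The MIG nodepool creation step could be scripted roughly as follows. This is a minimal sketch that only assembles and prints an `az aks nodepool add` command. The resource group, cluster, and nodepool names are hypothetical placeholders, and the VM size `Standard_ND96asr_v4` is an assumed A100 SKU that may not match your subscription.

```python
import shlex


def mig_nodepool_command(resource_group, cluster, nodepool, vm_size,
                         gpu_instance_profile, node_count=1):
    """Build the Azure CLI argument list for a MIG-enabled AKS nodepool."""
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", nodepool,
        "--node-vm-size", vm_size,
        "--node-count", str(node_count),
        # MIG1g, MIG2g, or MIG3g, matching the GPU Instance Profile column.
        "--gpu-instance-profile", gpu_instance_profile,
    ]


# Hypothetical names; replace with your own before running the command.
cmd = mig_nodepool_command("my-rg", "my-aks", "mignp",
                           "Standard_ND96asr_v4", "MIG1g")
print(shlex.join(cmd))
```

Printing the command first makes it easy to review. To execute it, pass the same list to `subprocess.run(cmd, check=True)` on a machine where Azure CLI is installed and logged in.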
Create a Nodepool with Time-Slicing enabled
Create the nodepool with `device-plugin.config` pointing to the correct time-slicing config with Azure CLI.
Deploy your workload on the Time-Slicing nodepool
Configure the GPU request in the `Resources` section in deployment.
Figure: Using Time-Slicing GPU
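
The time-slicing nodepool step might look like the sketch below, which assembles an `az aks nodepool add` command with a node label. It assumes the full label key is `nvidia.com/device-plugin.config`, as used by the NVIDIA GPU Operator. The config name `time-sliced-4`, the VM size, and all resource names are hypothetical placeholders.

```python
import shlex

# Assumed full label key; the page itself only shows "device-plugin.config".
CONFIG_LABEL_KEY = "nvidia.com/device-plugin.config"


def timeslicing_nodepool_command(resource_group, cluster, nodepool,
                                 vm_size, config_name):
    """Build the Azure CLI argument list for a time-sliced AKS nodepool."""
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster,
        "--name", nodepool,
        "--node-vm-size", vm_size,
        # The label value must name a time-slicing config that already
        # exists in the cluster's device plugin configuration.
        "--labels", f"{CONFIG_LABEL_KEY}={config_name}",
    ]


# Hypothetical names; replace with your own before running the command.
ts_cmd = timeslicing_nodepool_command("my-rg", "my-aks", "tsnp",
                                      "Standard_NC4as_T4_v3", "time-sliced-4")
print(shlex.join(ts_cmd))
```

If the label value does not match an existing config name, the device plugin has nothing to apply, so check the name against your cluster's time-slicing configuration first.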