Volumes are used to provide persistent storage to containers within pods so that pods can read and write data to a central disk shared across multiple pods. They are especially useful in the context of machine learning when you need to store and access data, models, and other artifacts required for training, serving, and inference tasks.
A few use cases where volumes turn out to be very useful:
The choice between a blob storage like S3 and a volume is important from a performance, reliability, and cost perspective.
Performance
In most cases, reading data from S3 will be slower than reading it directly from a volume. So if loading speed matters to you, a volume is the right choice. A good example is downloading and loading a model at inference time across multiple replicas of a service: a volume is the better choice since you don't incur the cost of downloading the model repeatedly, and each replica can load the model into memory from the volume much faster.
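One way to get this behavior is a load-through cache on the volume: the first replica downloads the model and writes it to the mount, and every subsequent replica (or restart) reads it straight from disk. A minimal sketch, assuming a hypothetical mount path `/mnt/model-volume` and a caller-supplied download function (in practice, a blob-storage fetch):

```python
import os

# Hypothetical mount path for the shared volume; replace with your actual mount.
VOLUME_MOUNT = os.environ.get("MODEL_VOLUME", "/mnt/model-volume")

def load_model_bytes(model_name, download_fn, volume=VOLUME_MOUNT):
    """Load model bytes from the volume if present; otherwise download once
    via download_fn and cache the result on the volume so later replicas
    (and restarts) skip the download entirely."""
    path = os.path.join(volume, model_name)
    if os.path.exists(path):
        # Cache hit: read directly from the mounted volume.
        with open(path, "rb") as f:
            return f.read()
    # Cache miss: fetch from blob storage, then persist to the volume.
    data = download_fn(model_name)
    os.makedirs(volume, exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return data
```

Since the volume is shared, only the first replica pays the download cost; the rest read the cached copy.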
Reliability
Blob storages like S3/GCS/Azure Blob Storage will in general be more reliable than volumes. So you should ideally always have the raw data backed up in one of the blob storages and use volumes only for intermediate data. You should also always save a copy of your models in S3.
Cost
Access to volumes like EFS is a bit cheaper than using S3, so if you are reading the same data quite frequently, it might be useful to store it in a volume. If you are reading or writing very infrequently, then S3 should be just fine.
Access Constraints
Data in volumes should ideally only be accessed by workloads within the same region and cluster. S3 is meant to be accessed globally and also across cloud environments. So volumes are not a great choice if you want to access the data from a different region or cloud provider.
When deploying a volume on TrueFoundry, there can be two modes of provisioning:
The volumes you create in TrueFoundry can be mounted to multiple replicas, so you can read from and write to the volume from all of them. However, you should be very careful not to write to the same path on the volume from multiple replicas, since that can cause data corruption.
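A common way to avoid write collisions is to give each replica its own subdirectory on the shared volume, keyed by the pod's hostname (which Kubernetes exposes via the `HOSTNAME` environment variable and makes unique per pod). A minimal sketch, with an assumed mount path of `/mnt/shared-volume`:

```python
import os
import socket

def replica_output_dir(volume="/mnt/shared-volume", base="outputs"):
    """Return a per-replica output directory on the shared volume so that
    no two replicas ever write to the same path. Kubernetes sets HOSTNAME
    to the pod name, which is unique per replica."""
    replica_id = os.environ.get("HOSTNAME", socket.gethostname())
    path = os.path.join(volume, base, replica_id)
    os.makedirs(path, exist_ok=True)
    return path
```

Each replica then writes only under its own directory, e.g. `outputs/<pod-name>/...`, while still being able to read everything on the volume.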