Launch Jupyter Notebook
Jupyter Notebooks are often the first tool used in an ML project. TrueFoundry lets you run Jupyter Notebooks on Kubernetes on the hardware you need and shuts them down automatically when they are not in use, which saves costs. You can read more about TrueFoundry Notebooks vs other notebook solutions here.
To launch a Jupyter Notebook, follow the steps outlined below. Details on the configuration options appear later on this page.
The notebook will go live in a few minutes, because the image must first be downloaded to the machine. Once the notebook is live, its status changes to Running and an endpoint appears. Clicking on the endpoint takes you to the Jupyter Notebook.
Stop and resume your notebook
When you're done working with a notebook instance, you can stop it to conserve resources and reduce costs. Clicking on the Stop button shuts down the notebook instance environment and releases associated resources.
Your notebook instance data (apart from the apt packages installed as a root user) persists, and you can easily restart it later by clicking the Resume button.
Configure your Notebook
In the notebook creation form, you can configure the following options:
Image
When creating a Jupyter Notebook, you can choose between two image options:
- Base Image: A lightweight image with no pre-installed packages, providing a clean slate for customization and package installation.
- Full Image: A pre-configured image with popular machine learning libraries pre-installed, like TensorFlow, Keras, PyTorch, scikit-learn, Pandas, etc.
To select the desired image, use the Image Type dropdown.
Auto Shutdown on Inactivity
By default, Notebook instances are configured to automatically stop after 30 minutes of inactivity. This helps prevent unnecessary resource consumption when Notebook instances are left idle. You can change the Stop After (minutes of inactivity) setting within the deployment form.
Install Apt packages in the notebook
Imagine you're working on a computer vision project and need the ffmpeg package to process your videos. Because ffmpeg is an apt package, you have to install it using sudo apt install ffmpeg. However, the default Notebook instance container doesn't have root access, which prevents you from installing this package.
In this case, you need root access in the Notebook instance to install your system-level dependencies. You can enable root access by ticking the Enable root access to the container checkbox.
This grants you sudo access within the Notebook instance container, so you can now run the sudo apt install ffmpeg command.
Note that apt packages installed as the root user do not persist across Notebook instance restarts. They are removed when the Notebook instance restarts and must be reinstalled.
So if you restart your Notebook and then run any ffmpeg command, it will show command not found. You will have to reinstall ffmpeg to use it.
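To see which system tools need reinstalling after a restart, you can check the PATH from a notebook cell. The following is a minimal sketch using only the Python standard library. The helper name missing_tools is hypothetical, and the tool list is just the ffmpeg example from this guide.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "ffmpeg" is the example package from this guide; add your own tools here.
    missing = missing_tools(["ffmpeg"])
    if missing:
        print("Reinstall needed for:", ", ".join(missing))
    else:
        print("All tools are available.")
```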
To avoid reinstalling apt packages after every restart, first use the temporary Enable root access option to work out which apt packages your project requires. Then define those apt packages when creating or editing the Notebook instance. This ensures that they remain available even after the Notebook instance restarts.
You can do so by following these steps:
Storage Size
Specify the amount of storage space you require for your notebook instance. This is persistent storage that will be used to store your notebook files, data, and any other artifacts generated during your work.
Endpoint
Choose the endpoint to which your notebook instance will be deployed. This endpoint will determine the URL at which your notebook will be accessible.
Set resources for your Notebook
Define the computational resources allocated to your notebook instance. You can adjust the CPU and memory allocation to meet the requirements of your data science tasks.
Running a Notebook instance with GPU
To use a GPU in your Notebook instance, follow these steps:
Upon successful deployment, your GPU Notebook instance will be provisioned with the specified GPU type. You can then use the GPU to accelerate your computations, such as training deep learning models or running other GPU-intensive workloads.
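Once the notebook is running, you may want to confirm that the GPU is visible from inside the container. The following is a minimal sketch, assuming the nvidia-smi utility is available in the image, which this guide does not state explicitly. The helper name list_gpus is hypothetical.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU lines reported by `nvidia-smi -L`, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        # Assumption: a GPU notebook exposes nvidia-smi on the PATH.
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(list_gpus() or "No GPU detected")
```

An empty result on a GPU notebook may simply mean that the assumption about nvidia-smi does not hold for your image.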
Login Credentials
Imagine a scenario where someone outside your team stumbles upon the Notebooks Endpoint. This could potentially grant them access to sensitive data or allow them to tamper with the code. Login credentials are essential in safeguarding the notebook instance from unauthorized access. Even if an unauthorized user acquires the endpoint URL, they'll be unable to access the Notebook without proper authorization.
Specify the credentials for accessing your notebook instance. This includes the username and password for logging into the notebook environment.
Access data from S3 or other clouds
In some instances, your Jupyter Notebooks may need to access data stored in S3 or other cloud storage platforms. To facilitate this access, you can employ one of two approaches:
Credential-Based Access through environment variables
This approach involves defining environment variables that contain the credentials for the cloud storage platform. For instance, to access S3, you would set the AWS access key ID and secret access key in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
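Before calling an AWS client library, it can help to verify that both variables are actually set inside the notebook. The following is a minimal sketch using only the standard library. The helper missing_aws_credentials is hypothetical and is not part of any TrueFoundry or AWS API.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_credentials(env=None):
    """Return the required AWS variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_aws_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("AWS credentials found in the environment.")
```

AWS SDKs such as boto3 read these two variables from the environment automatically, so no credentials need to appear in the notebook code itself.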
IAM Role-Based Access through Service Account
The second approach is to give your Notebook a role with the necessary permissions through Service Accounts.
Service Accounts provide a streamlined approach to managing access without the need for tokens or complex authentication methods. This approach involves creating a Principal IAM Role within your cloud platform, granting it the necessary permissions for your project's requirements. Here are detailed guides for creating Principal IAM Roles in your respective cloud platforms and integrating them as Service Accounts within the workspace:
- AWS: Authenticate to AWS services using IAM service account
- GCP: Authenticate to GCP using IAM service account
Once you've configured the Service Account within the workspace, toggle the Show Advanced Fields option at the bottom of the form. This reveals an expanded set of options, from which you can select the desired Service Account using the dropdown menu.
Access data in volume from the notebook
Mounting a volume to a notebook allows you to access data stored on the volume from within the notebook environment. This can be useful for a variety of tasks, such as loading data for analysis, training machine learning models, and deploying applications.
You can follow these steps for mounting a volume to the notebook:
Once the volume is mounted, you can access its data from within the Notebook.
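For example, you can list the files on the mounted volume from a notebook cell. The following is a minimal sketch. The mount path /home/jovyan/data is an assumption for illustration only, so replace it with the mount path you configured. The helper list_volume_files is hypothetical.

```python
from pathlib import Path


def list_volume_files(mount_path, limit=20):
    """Return up to `limit` file paths found under `mount_path`, sorted by path."""
    root = Path(mount_path)
    if not root.is_dir():
        raise FileNotFoundError(f"Volume is not mounted at {root}")
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return files[:limit]


if __name__ == "__main__":
    # Hypothetical mount path; use the one configured for your notebook.
    mount_path = "/home/jovyan/data"
    try:
        for path in list_volume_files(mount_path):
            print(path)
    except FileNotFoundError as err:
        print(err)
```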
Configuring Python Environments Inside a Notebook
TrueFoundry's deployed Notebooks start by default with a conda environment running Python 3.8.10.
If you are working on several projects simultaneously and want to maintain multiple Python environments for different Python versions or different sets of tasks, you can do so by following these steps:
Command to be executed:
# Create a new conda environment named "myenv" with Python 3.11
# You need to hard refresh after executing this command for kernel to show up in the listing page
conda create -y -n myenv python=3.11
Launching Jupyter Notebook with Custom Images
Instead of using the pre-built Jupyter Lab Images provided by TrueFoundry, you have the flexibility to create and deploy your custom images. This allows you to pre-install specific libraries, tools, and configurations within your notebook environment, tailoring it to your specific needs and project requirements.
To create a custom image, start by using the TrueFoundry Jupyter Notebook image as a base. Below are the Docker images TrueFoundry uses to support the Jupyter base and Jupyter full notebooks.
Image URI | Size | Jupyter Lab | CUDA 12.1 Toolkit | Common ML Libraries |
---|---|---|---|---|
public.ecr.aws/truefoundrycloud/jupyter:0.3.20-sudo | ~0.7 GB | ✅ | | |
public.ecr.aws/truefoundrycloud/jupyter:0.3.0-cu121-sudo | ~6 GB | ✅ | ✅ | |
public.ecr.aws/truefoundrycloud/jupyter-full:0.2.20-sudo | ~12.5 GB | ✅ | ✅ | ✅ |
Latest Images
Please visit the following links for latest versions:
You can use one of these images as a base for creating your custom image while keeping some invariants unchanged.
Let's take an example where we want to customize the image by including a few apt packages like ffmpeg and a few pip packages like gradio. We will create a new Dockerfile using one of these existing images as the base and push it to a registry for later use.
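FROM truefoundrycloud/jupyter:0.2.20
# Switch to root and refresh the package index before installing apt packages
USER root
RUN apt update && \
    DEBIAN_FRONTEND=noninteractive apt install -y --no-install-recommends ffmpeg
# Switch back to the notebook user (jovyan, as in the CUDA example below)
USER jovyan
# Install pip packages
RUN python3 -m pip install --use-pep517 --no-cache-dir gradio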
Do not overwrite the ENTRYPOINT or CMD instructions. These are built into the base images and are critical for the notebook to work correctly.
Other possible customizations can be found here.
Build and push the image to a registry so that it can be used in a TrueFoundry Notebook. Make sure the destination registry is already integrated with the TrueFoundry platform. Detailed instructions are available here.
Example: Installing CUDA 11.8 and cuDNN 8
FROM truefoundrycloud/jupyter:0.2.20-sudo
ENV TORCH_CUDA_ARCH_LIST="7.0 7.5 8.0 8.6 9.0+PTX"
ENV DEBIAN_FRONTEND=noninteractive
USER root
# Install CUDA 11.8
RUN apt update && \
    apt install -y --no-install-recommends git curl wget htop && \
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb -O /tmp/cuda-keyring_1.1-1_all.deb && \
    dpkg -i /tmp/cuda-keyring_1.1-1_all.deb && \
    apt update && \
    apt install -y --no-install-recommends cuda-toolkit-11-8 libcudnn8=8.9.7.29-1+cuda11.8 libcudnn8-dev=8.9.7.29-1+cuda11.8
USER jovyan
Deploying a Custom Notebook Image
Now you can deploy the customized image following the instructions provided below.