9. Jupyter Notebooks

Tip

Jupyter notebooks are very popular in science for interactive work. On this page you will learn:

  • how to use Jupyter notebooks on Spider

  • which of the available flavors to choose

Three methods of running Jupyter notebooks in JupyterLab are discussed here: with virtual environments, with Apptainer containers, and with the EESSI software module. Some of this is also covered in the compute section.

9.1. Where to run notebooks

Interactive notebooks should be run on the worker nodes mentioned in Prepare your workloads and not on the UI machines. Building the environments and containers can be done on the UI, but once you start to run your code, please connect to a machine interactively with:

srun --partition=short --time=12:00:00 --pty bash -i -l

This opens an interactive session on a machine in the short partition for 12 hours. That way, notebook users do not use up resources on the UI machines at the expense of other users.

Warning

If all resources (worker nodes in the selected partition) are in use, the srun command will wait until resources become available.
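To guard against accidentally starting a notebook on a UI machine, a small check like the following could be used. It is only a sketch: it assumes that worker node hostnames start with wn-, as in the tunnel examples later on this page (wn-la-06, wn-la-01), and is_worker_node is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: worker node hostnames start with "wn-" (e.g. wn-la-06).
# Adjust the pattern if the nodes you use are named differently.
is_worker_node() {
  # Use the given name, or fall back to this machine's short hostname.
  local name="${1:-${HOSTNAME%%.*}}"
  case "$name" in
    wn-*) return 0 ;;
    *)    return 1 ;;
  esac
}

if is_worker_node; then
  echo "On a worker node: OK to start Jupyter here."
else
  echo "Not on a worker node: start an interactive session with srun first."
fi
```

The function can also be called with an explicit name, for example is_worker_node wn-la-06.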

9.2. Virtual environment

We start with Python virtual environments, also called venvs. A venv is a self-contained Python environment that you can create and load, and that holds all the Python modules and packages you need. This ensures that no components leak into the system environment.

You can create a virtual environment (or venv) at a path by doing:

python3.9 -m venv test_venv/

This will create a folder called test_venv which contains the entire python environment. You can also use other python versions if you prefer. To load this environment run:

source test_venv/bin/activate

In some shells this shows a (test_venv) prefix on your command line. Inside the environment you can now install packages using pip:

pip install jupyterlab pandas docopt
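In shells that do not show the (test_venv) prefix, one way to confirm that the environment is active is to look at the VIRTUAL_ENV variable, which the activate script sets and deactivate unsets. The following is a minimal sketch; in_venv is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# in_venv is a hypothetical helper. It relies on the VIRTUAL_ENV variable,
# which the venv activate script sets and deactivate unsets.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
  echo "Active venv: ${VIRTUAL_ENV}"
else
  echo "No venv active; run: source test_venv/bin/activate"
fi
```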

To start a Jupyter session, run:

jupyter lab --ip="*" --no-browser

Here the --ip flag ensures that the session is reachable over the network, and the --no-browser flag ensures that no browser is opened in an X11 session that may be running through your ssh connection.

To properly forward the lab session to your local machine, open a second terminal on your local machine and run:

ssh -NL 8888:wn-la-06:8888 spider

Here the machine name (wn-la-06 in this example) has to match the node where the kernel is running, and the forwarded port (8888 in this example) has to match the port reported by the JupyterLab instance. Again, do not run notebooks on UI machines. Once the tunnel is open, open the link provided by Jupyter in your favorite browser. The link has the form http://localhost:8888/lab?token=abc123.
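Because the node name and port differ from session to session, it may help to generate the tunnel command rather than type it by hand. The sketch below only prints the command and does not run it. tunnel_cmd is a hypothetical helper name, and the default login host spider assumes an SSH alias like the one used in the example above.

```shell
#!/usr/bin/env bash
# Print (not run) the ssh tunnel command for a given worker node and port.
# tunnel_cmd is a hypothetical helper; "spider" assumes an SSH config alias.
tunnel_cmd() {
  local node="$1"             # worker node where Jupyter is running
  local port="${2:-8888}"     # port reported by Jupyter
  local login="${3:-spider}"  # SSH host or alias for Spider
  echo "ssh -NL ${port}:${node}:${port} ${login}"
}

tunnel_cmd wn-la-06 8888
# prints: ssh -NL 8888:wn-la-06:8888 spider
```

Copy the printed command into a terminal on your local machine to open the tunnel.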

Once you are done with the virtual environment and want to go back to the initial user environment, type:

deactivate

and the python environment is unloaded. To reload the environment again do:

source test_venv/bin/activate

Warning

Some Jupyter instances provide a link that contains hostname:8888. Replace hostname with localhost or 127.0.0.1 to properly fetch the notebook.
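The hostname replacement described in this warning can also be done with sed. This is an illustrative sketch; localize_url is a hypothetical helper name, and the URL in the example is made up.

```shell
#!/usr/bin/env bash
# localize_url is a hypothetical helper: it replaces the host part of an
# http(s) URL with "localhost" and leaves the port, path and token untouched.
localize_url() {
  printf '%s\n' "$1" | sed -E 's#^(https?://)[^:/]+#\1localhost#'
}

localize_url "http://wn-la-06:8888/lab?token=abc123"
# prints: http://localhost:8888/lab?token=abc123
```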

9.3. Apptainer container

9.3.1. Pre-built container

To run a notebook in an Apptainer container, we first have to fetch or build the container. A tutorial on containers can be found in Building and running a apptainer container, but note that this particular example focuses on using GPUs. A more general introduction is provided here.

First we start by fetching a container:

apptainer build jupyter.sif docker://jupyter/scipy-notebook:latest

This will pull one of the official Jupyter containers from Docker Hub and build an Apptainer container from it. This container encapsulates the entire environment and can be entered to start a notebook session. Supported Jupyter containers can be found here, and more Docker images in general can be found at Docker Hub.

After the build procedure is complete, you can start the jupyter instance on a worker node (not a UI) with

apptainer run jupyter.sif

which will automatically start the instance. Alternatively, you can start an interactive shell session in the container and start it manually:

apptainer shell jupyter.sif
jupyter lab

To receive the notebook locally in your browser, as mentioned above, a tunnel has to be opened in a new terminal, with:

ssh -NL 8888:wn-la-01:8888 spider

Here, again, the machine name and port number have to match the node where you are running the job and the port chosen by Jupyter, respectively. Now you can open the link provided by Jupyter, which has the form http://localhost:8888/lab?token=abc123.

9.3.2. Custom image

Apptainer images can be customised to suit your needs by adding extra steps to the build process. This is done with so-called ‘definition’ files: plain-text files with instructions for the apptainer build. For a full overview, see the Apptainer documentation. Here is a small example of a custom image that can be expanded. In this example docopt is installed during the build, and calling the apptainer run command opens the container and starts the notebook instance for you. Make a file called jup-custom.def and fill it with:

Bootstrap: docker
From: jupyter/scipy-notebook:latest

%post
  pip install docopt

%runscript
  jupyter lab --ip=0.0.0.0

%help
  This is a demo container to show how to run jupyter lab

You can build this with:

apptainer build jup-custom.sif jup-custom.def

and once it is finished building, you can enter the sif file with the apptainer shell command, or start jupyter directly with apptainer run. You still have to forward the connection as described above before you can open the notebook in a browser. To save your notebook, in the browser you can use Save As from the menu. For more information on running jupyter lab and notebooks, see the official jupyter documentation.

To get a full overview of what is possible during building in terms of installing packages, raising permissions, setting paths, mounting local folders and more, see the official apptainer documentation.
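If you regularly need images with different extra packages, the definition file from this section could be generated by a small script instead of edited by hand. The following is a sketch only: write_def is a hypothetical helper name, and it simply reproduces the jup-custom.def example with the pip package list filled in from its arguments.

```shell
#!/usr/bin/env bash
# write_def is a hypothetical helper: it writes an Apptainer definition file
# that installs the given pip packages on top of jupyter/scipy-notebook.
# Usage: write_def OUTPUT_FILE PACKAGE [PACKAGE...]
write_def() {
  local outfile="$1"
  shift
  # At least one package is expected; "$*" is expanded inside the heredoc.
  cat > "$outfile" <<EOF
Bootstrap: docker
From: jupyter/scipy-notebook:latest

%post
  pip install $*

%runscript
  jupyter lab --ip=0.0.0.0

%help
  This is a demo container to show how to run jupyter lab
EOF
}

# Recreate the example definition file from this section.
write_def jup-custom.def docopt
```

The resulting file can then be built with apptainer build as shown in this section.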

9.3.3. Notebook resources

A few resources on prebuilt images and documentation:

9.4. EESSI software module

The EESSI software repository is a common stack of scientific software installations for HPC systems. You can use the Jupyter Notebook software module from the EESSI repository combined with SSH port forwarding. After starting an interactive session on a worker node, set up the EESSI environment by running:

source /cvmfs/software.eessi.io/versions/2023.06/init/bash

Please check the EESSI website for a repository release newer than 2023.06.

Next, load the modules:

module load nodejs/18.17.1-GCCcore-12.3.0
module load JupyterNotebook/7.0.2-GCCcore-12.3.0

The nodejs module is needed to resolve an error message caused by an older nodejs version. The module load commands load JupyterNotebook and all its dependencies automatically. You can check this by running:

module list

Next, start JupyterLab by running:

jupyter lab --ip="*" --no-browser --port=8888

Note that if port 8888 is already in use by another JupyterNotebook program, you will be assigned the next port number, such as 8889, 8890, and so on. Let’s call this Spider-port-number. The terminal output contains a link you can use to open the Jupyter Notebook in the web browser. The link looks like this:

http://localhost:8888/lab?token=xxxxxxxxxxxxx

To properly forward the Jupyter session to your local machine, a second terminal needs to be opened on your laptop, running:

ssh -NL 8888:wn-la-06:8888 username@spider.surf.nl

Adjust wn-la-06:8888, if necessary, to the actual node name and Spider-port-number.

Once the connection is successful, open the link in your web browser. Note that if the link contains a port number other than 8888, you need to change it to 8888, as this is the local port you set in the port forwarding.
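When Jupyter ends up on a port other than 8888, both the tunnel command and the link need adjusting. The sketch below derives both from the link that Jupyter prints. url_port and local_url are hypothetical helper names, the node name and link are example values, and username is a placeholder for your own Spider login.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the case where Jupyter picked a port other than 8888.
url_port() {
  # Extract the port number from a URL such as http://localhost:8889/lab?token=...
  printf '%s\n' "$1" | sed -E 's#^https?://[^:/]+:([0-9]+).*#\1#'
}

local_url() {
  # Rewrite the URL to point at localhost and the local port (default 8888).
  local local_port="${2:-8888}"
  printf '%s\n' "$1" | sed -E "s#^(https?://)[^:/]+:[0-9]+#\1localhost:${local_port}#"
}

# Example values: replace with the link printed by Jupyter and your node name.
jupyter_url="http://localhost:8889/lab?token=xxxxxxxxxxxxx"
node="wn-la-06"

# Tunnel command to run on your laptop ("username" is a placeholder).
echo "ssh -NL 8888:${node}:$(url_port "$jupyter_url") username@spider.surf.nl"
# Link to open in the browser once the tunnel is up.
local_url "$jupyter_url"
```

With the example values, the first line printed forwards local port 8888 to port 8889 on wn-la-06, and the second is the same link with the port changed to 8888.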

To stop the Jupyter Notebook and port forwarding, simply close the web page and exit all the terminals.
