Disclaimer
This repository is inspired by the official JupyterHub work; for further clarification, feel free to explore that project.
JupyterHub deployer
This deployer provides a reference implementation of JupyterHub, a multi-user Jupyter Notebook environment, on a single host using Docker.
Possible use cases include:
- Creating a JupyterHub demo environment that you can spin up relatively quickly.
- Providing a multi-user Jupyter Notebook environment for small classes, teams, or departments.
Technical Overview
Key components of this reference deployment are:
- Host: Runs the JupyterHub components in a Docker container on the host.
- Authenticator: Uses Native Authenticator to authenticate users.
- Spawner: Uses DockerSpawner to spawn single-user Jupyter Notebook servers in separate Docker containers on the same host.
- Persistence of Hub data: Persists JupyterHub data in a Docker volume on the host.
- Persistence of user notebook directories: Persists user notebook directories in Docker volumes on the host.
Structure
Docker
This deployment uses Docker, via Docker Compose, for all the things. Docker Engine 1.12.0 or higher is required.
HTTPS and SSL/TLS certificate
This deployment configures JupyterHub to use HTTPS. You will therefore need to point the custom domain name you wish to use for JupyterHub (for example, myfavoritesite.com or jupiterplanet.org) at the service.
Authenticator
This deployment uses Native Authenticator to authenticate users.
Native Authenticator provides the following features:
- New users can sign up on the system;
- New users can be blocked from accessing the system until an admin authorizes them;
- Option to increase password security by rejecting common passwords or enforcing a minimum password length;
- Option to block users after a number of failed login attempts;
- Option of open signup, with no need for initial authorization;
- Option to add more information about users on signup.
By default, the user "admin" has administrator privileges. Once logged in, this user can manage new user registrations through JupyterHub's admin console at https://jupyterhub_ip_address/hub/authorize.
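The Native Authenticator features listed above correspond to settings in jupyterhub_config.py. The fragment below is a minimal sketch of how they might be set, not this repository's actual configuration: the numeric values are assumptions chosen for illustration, and the file only works when loaded by JupyterHub, which provides get_config().

```python
# jupyterhub_config.py (fragment): illustrative sketch, not this repository's file.
c = get_config()  # noqa: F821 -- provided by JupyterHub when it loads this file

# Use Native Authenticator for login and signup.
c.JupyterHub.authenticator_class = "nativeauthenticator.NativeAuthenticator"

# The "admin" user gets administrator privileges.
c.Authenticator.admin_users = {"admin"}

# Keep signups pending until an admin authorizes them; set to True for open signup.
c.NativeAuthenticator.open_signup = False

# Password hardening: reject common passwords and enforce a minimum length.
c.NativeAuthenticator.check_common_password = True
c.NativeAuthenticator.minimum_password_length = 10  # assumed value

# Block a user after repeated failed logins, for a fixed waiting period.
c.NativeAuthenticator.allowed_failed_logins = 3       # assumed value
c.NativeAuthenticator.seconds_before_next_try = 1200  # assumed value

# Ask for additional information (an email address) on signup.
c.NativeAuthenticator.ask_email_on_signup = True
```

Depending on the installed JupyterHub and Native Authenticator versions, further settings may be needed, so check the documentation of the versions in use.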
Spawner: Prepare the Jupyter Notebook Image
You can configure JupyterHub to spawn Notebook servers from any Docker image, as long as the image's ENTRYPOINT and/or CMD starts a single-user instance of Jupyter Notebook server that is compatible with JupyterHub.
Whether you build a custom Notebook image or pull an image from a public or private Docker registry, the image must reside on the host.
If the Notebook image does not exist on the host, Docker will attempt to pull it the first time a user starts their server. In such cases, JupyterHub may time out if the image being pulled is large, so it is better to pull the image to the host before running JupyterHub.
This deployment defaults to the ROOT notebook image, which is built on top of the scipy-notebook Docker stack.
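As an illustration of the spawner behavior described in this section, the fragment below sketches how DockerSpawner might be configured in jupyterhub_config.py. It is not taken from this repository: the timeout value, the volume name pattern, and the notebook directory are assumptions, and the image name is the default of the notebook field in config.ini.

```python
# jupyterhub_config.py (fragment): illustrative sketch, not this repository's file.
c = get_config()  # noqa: F821 -- provided by JupyterHub when it loads this file

# Spawn each single-user server in its own Docker container.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"

# Notebook image to spawn (default value of the `notebook` field in config.ini).
c.DockerSpawner.image = "atlasopendata/root_notebook"

# Pull the image only when it is not already present on the host.
c.DockerSpawner.pull_policy = "ifnotpresent"

# Allow more time for a server to start, e.g. when a large image
# still has to be pulled on first use (assumed value, in seconds).
c.Spawner.start_timeout = 300

# Persist each user's notebook directory in a per-user Docker volume.
notebook_dir = "/home/jovyan/work"  # assumed path inside the container
c.DockerSpawner.notebook_dir = notebook_dir
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": notebook_dir}
```

Even with a longer timeout, pulling the image to the host beforehand remains the more predictable option.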
Deploy JupyterHub on an OpenStack instance
DISCLAIMER: these operations work best on Linux; we recommend using such an OS to set up the infrastructure.
Prerequisites
The OpenStack CLI and the Terraform CLI are required. As a fallback, the setup script will attempt to install both, but it is preferable to have them installed beforehand.
Install Terraform CLI
To install Terraform, run the following commands (sudo privileges are required):
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get --yes install terraform
Install OpenStack CLI
To install the OpenStack CLI, run the following command (sudo privileges are required):
pip install python-openstackclient
If you don't have pip installed, please run sudo apt install python3-pip.
For further information, please read the official documentation.
Configuration and Deployment
To deploy the JupyterHub server on an OpenStack VM, follow these steps:
- Clone this repository into a directory of your choice on your PC by running one of the two commands below:
git clone https://gitlab.cern.ch/atlas-open-data-iac-qt-2021/aws_automated_jh_deployment.git
git clone ssh://git@gitlab.cern.ch:7999/atlas-open-data-iac-qt-2021/aws_automated_jh_deployment.git
- Fill in the config.ini file, which contains the following fields:
  - fileconfig: the path to the .sh file used to configure the credentials (openrc.sh). See here how to download this file.
  - username: the name of the user that will be created on the instance.
  - password: the password for the created instance. To be used together with username for direct SSH connections.
  - hostname: the name of the created instance.
  - sshkey: the name of the SSH key created for the instance.
  - image: the image of the VM created in OpenStack. For this deployment, two different images were tested:
    - Ubuntu 18.04
    - Ubuntu 20.04
    If you do not have an Ubuntu image, you can create one by following the guide.
  - flavor: the flavor of the VM created in OpenStack. This parameter determines the resources available to the instance, such as CPU, GPU, and memory. The list of available flavors can be found at https://docs.openstack.org/nova/rocky/admin/flavors.html.
  - dnsrecord: [optional] the domain to be used for the certificate. This field is not required; if you own a valid domain, setting it produces the related certificate and makes the Hub reachable at that domain.
  - notebook: the image of the notebook that will be used. By default, this parameter points to atlasopendata/root_notebook. You can use any custom image compatible with JupyterHub.
- Run the setup.sh script by executing:
source setup.sh
As a first operation, you will configure the OpenStack service: follow the prompts to enter your OpenStack password. Then, a summary of the whole configuration is displayed for you to confirm. Once you confirm, Terraform begins the deployment process and returns the URL of the Hub at the end of the procedure.
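Before running the setup script, it can be useful to check that every required field of config.ini has been filled in. The following is a minimal sketch of such a check, not part of this repository: the helper names are hypothetical, and it assumes that config.ini uses plain key = value lines, with or without a section header. The example values are placeholders.

```python
import configparser

# Field names taken from the config.ini description; dnsrecord is optional.
REQUIRED_FIELDS = ("fileconfig", "username", "password", "hostname",
                   "sshkey", "image", "flavor", "notebook")


def load_deploy_config(text):
    """Parse config.ini content into a flat {field: value} dict (hypothetical helper)."""
    # interpolation=None keeps '%' characters (e.g. in passwords) literal.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.MissingSectionHeaderError:
        # Assumption: the file may lack a [section] header; wrap it in a dummy one.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string("[deploy]\n" + text)
    fields = dict(parser.defaults())
    for section in parser.sections():
        fields.update(parser.items(section))
    # Drop surrounding quotes, in case values are written as key = "value".
    return {key: value.strip().strip("\"'") for key, value in fields.items()}


def missing_fields(fields):
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


# Placeholder content for illustration only; these are not real values.
example = """
username = alice
hostname = jupyterhub-demo
notebook = atlasopendata/root_notebook
"""
print("Missing fields:", missing_fields(load_deploy_config(example)))
```

To check a real file, pass its content instead of the example string, for instance load_deploy_config(open("config.ini").read()).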
Appendix: create an Image with OpenStack CLI
If you do not have the image of the OS you want to install, you can use the OpenStack CLI to create your own image. First download the image, for instance:
wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
The command above downloads an Ubuntu 18.04 server image as an example. All the Ubuntu server distributions can be inspected at this page.
To create an image, use openstack image create. The list below explains the optional arguments that you can use with the create and set commands to modify image properties.
For more information, refer to the OpenStack Image command reference.
Remember that either the setup.sh file or the openrc.sh file must be sourced before launching the image creation command.
openstack image create --disk-format qcow2 --container-format bare --shared --file bionic-server-cloudimg-amd64.img bionic-server
- --disk-format qcow2 sets the disk format of the image, in this case qcow2, the QEMU copy-on-write format.
- --container-format bare sets the container format of the image. The supported options are: ami, ari, aki, bare, docker, ova, ovf. The default format is: bare.
- --shared marks the image as shareable once created.
- --file bionic-server-cloudimg-amd64.img provides the path to the image file.
- bionic-server is the name under which the image will be saved on OpenStack.
Additional information can be found here.
This operation can take several minutes. Once finished, the list of images can be accessed by running:
openstack image list
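If the image creation has to be repeated for several images, the command shown in this appendix can be assembled from a script. The sketch below is one possible way to do so, not part of this repository: image_create_command is a hypothetical helper that only builds the argument list, using the same flags as the command above.

```python
import shlex


def image_create_command(name, image_file, disk_format="qcow2",
                         container_format="bare", shared=True):
    """Build the argument list for `openstack image create` (hypothetical helper)."""
    argv = ["openstack", "image", "create",
            "--disk-format", disk_format,
            "--container-format", container_format]
    if shared:
        argv.append("--shared")
    argv += ["--file", image_file, name]
    return argv


cmd = image_create_command("bionic-server", "bionic-server-cloudimg-amd64.img")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))

# To actually run it (after sourcing openrc.sh in the calling shell):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if an image name or file path contains spaces.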