Set Up Runners

Expose GPUs on your runner

Use Dotscience for GPU-accelerated workflows

You can use Dotscience with NVIDIA GPU-accelerated workflows. Non-NVIDIA GPUs are not currently supported.

To access your machine's NVIDIA GPUs from within the Dotscience Docker container that runs on the machine, you need to register an NVIDIA GPU-aware container runtime with Docker. If you are using NVIDIA DGX servers, this runtime is already installed. Otherwise, you may need to install it yourself.
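For reference, "registering" the runtime means adding it to Docker's daemon configuration. As a sketch, the nvidia-docker2 package (covered in the install section below) writes an entry like this to /etc/docker/daemon.json; if you already have a daemon.json with other settings, merge rather than overwrite:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}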

There are two versions of the NVIDIA container runtime:

  1. The first generation, released in 2016, is nvidia-docker, a wrapper command around docker.

  2. The second generation, and the current one, is nvidia-container-runtime, which plugs into unmodified docker as an alternative runtime. The difference in invocation is sketched below.
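A minimal illustration of the difference (the CUDA image tag is just an example):

$ nvidia-docker run --rm nvidia/cuda:9.0-base nvidia-smi               # first generation: wrapper command
$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi     # second generation: plain docker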

Which container runtime do I have?

If you don’t know which container runtime is available on your system, try the following command:

$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

If this shows you a table with your GPU specifications, you have nvidia-container-runtime. If it instead shows an error, try:

$ nvidia-docker run --rm nvidia/cuda:9.0-base nvidia-smi

If this shows you a table with your GPU specifications, but the first command did not, then you have nvidia-docker. However, if this command also shows an error, you do not have any NVIDIA GPU-aware container runtime and should install one. Instructions for installing the NVIDIA runtime are in the section below.
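If you prefer to script this check, here is a minimal sketch following the same logic (the image tag is just an example):

#!/bin/sh
# Try the second-generation runtime first, then fall back to the first-generation wrapper.
if docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-container-runtime detected"
elif nvidia-docker run --rm nvidia/cuda:9.0-base nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-docker (first generation) detected"
else
    echo "no NVIDIA container runtime found; see the install section below"
fi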

Which container runtime do I need? And how do I upgrade?

We recommend upgrading to nvidia-container-runtime if you can. Instructions for upgrading your container runtime are provided by NVIDIA here. However, Dotscience supports both versions of the runtime.
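As a rough sketch, the upgrade on an apt-based system amounts to replacing the old package with nvidia-docker2 (follow NVIDIA's documentation for the authoritative steps, including migrating any containers that still use nvidia-docker volumes):

$ sudo apt-get purge -y nvidia-docker
$ sudo apt-get update && sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker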

Installing the NVIDIA container-aware runtime on Ubuntu

Make sure you have the latest docker-ce installed

$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh get-docker.sh
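To confirm Docker itself works before adding the NVIDIA runtime, you can run a throwaway container:

$ sudo docker run --rm hello-world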

Install NVIDIA Docker

# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

These commands add NVIDIA's GPG key and register the nvidia-docker package repository with apt.
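You can sanity-check the result by printing the source list the commands created:

$ cat /etc/apt/sources.list.d/nvidia-docker.list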

Then update the apt package list and install the runtime packages

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2

Restart Docker

$ sudo systemctl restart docker
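After the restart, check that Docker has picked up the new runtime; the output should list nvidia among the available runtimes:

$ docker info | grep -i runtimes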

Make sure the appropriate NVIDIA drivers are installed. The exact procedure varies by GPU and driver version; a common approach is to use the packaged NVIDIA drivers. For more information, see Frequently Asked Questions · NVIDIA/nvidia-docker Wiki · GitHub. The CUDA toolkit provides the user libraries needed to expose the GPU to generic workloads; make sure the appropriate version is installed from CUDA Toolkit 10.2 Download | NVIDIA Developer.
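On Ubuntu, one common route to the packaged drivers is the ubuntu-drivers tool, which selects a recommended driver for your card. A sketch (the specific driver package, e.g. nvidia-driver-440, depends on your GPU and CUDA version):

$ sudo ubuntu-drivers autoinstall
# or pin a specific packaged driver:
$ sudo apt-get install -y nvidia-driver-440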

Check that the drivers are installed correctly with

$ sudo nvidia-container-cli -k -d /dev/tty list

I0106 18:07:13.149008 414975 nvc.c:281] initializing library context (version=1.0.5, build=13b836390888f7b7c7dca115d16d7e28ab15a836)   
I0106 18:07:13.149053 414975 nvc.c:255] using root / 
...
I0106 18:07:14.107453 414975 nvc.c:318] shutting down library context
I0106 18:07:14.107670 414977 driver.c:192] terminating driver service
I0106 18:07:14.386696 414975 driver.c:233] driver service terminated successfully

and

$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   34C    P0    36W / 250W |      0MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

If this shows a table from nvidia-smi (the NVIDIA System Management Interface), your installation has succeeded.

Add GPU runners to Dotscience

On the Dotscience Runners page, click Add your own runner, select the runner type GPU - NVIDIA or GPU NVIDIA (legacy) corresponding to your container runtime, set your desired runner storage size allocation, and click Continue.

Copy the runner startup instructions shown on the next page into a terminal on your runner to connect it to Dotscience.

Your new GPU-enabled runner should now appear in the list of available runners.