Integrating with GitLab

Dotscience allows you to hook into GitLab to give you more control over your model builds.

Dotscience allows you to integrate with GitLab to give you more control over your model builds than you get with our built-in model builder. By getting GitLab to do your model builds for you, you can control exactly what goes into the Docker image and easily add custom code to it. It also allows you to host your model images on a container registry of your choosing. This is useful, for example, for sklearn models that need custom model serving logic.

Fork example project

For this example we will be using a public gitlab.com account and the (free) gitlab.com public runners to build our model images. You can also follow along using an account on your own GitLab instance; just be sure to change the settings where needed.

Fork the example project at https://gitlab.com/dotscience/gitlab-model-builder-v2 to your own gitlab.com account.

Configure container registry

The example project is configured to push to the GitLab container registry that is automatically set up with each GitLab project.

If you want GitLab to use a different Docker registry, change these variables in the .gitlab-ci.yml file:

  • DOCKER_REGISTRY - e.g. quay.io
  • DOCKER_REGISTRY_USERNAME
  • DOCKER_REGISTRY_PASSWORD
  • DOCKER_IMAGE - e.g. quay.io/myusername/myrepo
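Before committing new values, a quick sanity check can catch a missing variable or an image name that points at a different registry. The sketch below is only an illustration: the function name check_registry_vars is hypothetical, and the rule that DOCKER_IMAGE begins with DOCKER_REGISTRY is an assumption drawn from the quay.io examples above, not something the example project is known to enforce.

```shell
# Hypothetical helper (not part of the example project): checks that the four
# registry variables listed above are set and look consistent with each other.
check_registry_vars() {
  for _name in DOCKER_REGISTRY DOCKER_REGISTRY_USERNAME \
               DOCKER_REGISTRY_PASSWORD DOCKER_IMAGE; do
    # Read the value of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing: $_name" >&2
      return 1
    fi
  done
  # Assumption: the image name starts with the registry host,
  # e.g. quay.io/myusername/myrepo for the registry quay.io.
  case "$DOCKER_IMAGE" in
    "$DOCKER_REGISTRY"/*) return 0 ;;
    *) echo "DOCKER_IMAGE should start with $DOCKER_REGISTRY/" >&2
       return 1 ;;
  esac
}
```

Set the four variables in your shell to the values you intend to use, then run check_registry_vars; it returns a non-zero status and prints the reason if something is off.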

GitLab authentication

We need to create a new Access Token for your GitLab profile with API scope. This will allow Dotscience to communicate with GitLab’s API.

You can set up Access Tokens in your GitLab User Settings which you can get to by clicking on your avatar in the top right corner. Whilst in the User Settings section, click on Access Tokens in the menu.

When creating the Access Token, be sure to enable the api scope. Your new token will be displayed at the top of the page - be sure to copy it somewhere safe as you’ll need this soon.

Creating the GitLab configuration

You’ll need to make a note of the following three things:

  1. The URL of the GitLab instance (e.g. https://gitlab.com).
  2. The name of the GitLab project (e.g. <YOUR_USERNAME>/gitlab-model-builder-v2).
  3. The Access Token you created above.

Note: If you’re using the CLI you can use the project’s ID instead of its name. This can be found in the project’s General settings page under the General project heading.
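If you would rather look up the project ID from a terminal, the GitLab REST API (v4) accepts the project name in place of the ID as long as each "/" is written as "%2F". The following is a minimal sketch of that lookup; the helper names encode_project_path and project_api_url and the GITLAB_TOKEN variable are hypothetical, and the curl call is left commented out because it needs network access and a real token.

```shell
# Replace "/" with "%2F" so a namespaced project path can be used in an API URL.
encode_project_path() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# Build the GitLab REST API (v4) URL for a single project.
# $1 = GitLab instance URL, $2 = project name (namespace/project)
project_api_url() {
  # ${1%/} drops one trailing slash from the instance URL, if present.
  printf '%s/api/v4/projects/%s' "${1%/}" "$(encode_project_path "$2")"
}

# Example call; the JSON response contains the numeric project "id":
# curl --fail --silent --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
#   "$(project_api_url https://gitlab.com "<YOUR_USERNAME>/gitlab-model-builder-v2")"
```

A successful response also confirms that the URL, project name and Access Token you noted down work together.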

Using the GUI

Create a free Dotscience account. Once logged in, you’ll find a link to the CI Configurations page in the navigation menu. This page allows you to easily create, view, modify and remove CI integrations.

Begin by clicking the Add new button in the top right of the page and selecting GitLab from the dropdown menu. You’ll now be presented with a form where you can enter the following:

  • Name - A custom configuration name used to identify the configuration when configuring builds.
  • Description - An optional description of the configuration.
  • Access Token - Your GitLab Access Token.
  • GitLab URL - The URL of the GitLab instance.
  • GitLab Project - The name of the project on GitLab.
  • Reference - A git ref, such as a branch name.

After clicking Continue, you will see your new integration in the configurations list, where you can modify or remove it.

You’ll notice that the initial dotscience-default configuration is still marked as Default. This means that all model builds currently use that configuration and not the one we’ve just created. To make the new configuration the default for Dotscience to use, click the tick ✓ icon to the right of the configuration you’ve just created.

Upon confirming this change, you will notice the ‘Default’ label is now next to your new configuration. All future model builds will now be run using this configuration.

Train and build a model

Now that GitLab is configured to perform model builds, let’s put it to use by training a model and then building it with a GitLab pipeline.

We will use an example Dotscience notebook to train a TensorFlow model and then our GitLab pipeline to build a TensorFlow Serving image and push it to a container registry.

It’s just as easy to use another machine learning framework, for example sklearn for models that need custom model serving logic.

There are two components involved in this:

  • Notebook - produces model files using any framework you want
  • GitLab - downloads the model files, then creates and pushes a container image

You are free to change both the notebook and the GitLab configuration to suit your needs.

Before we begin, make sure you have selected your GitLab configuration as the Default using the instructions above.

Fork sample project

Log in to your Dotscience account and scroll to the Public projects section at the bottom of the projects page.

You will notice a project called MNIST Example. Click this project and then click the Fork this project button.

Launch Jupyter

We will now launch this project on a Jupyter instance. Click the Settings button on the forked project and make sure it has a runner.

If your managed runner is still starting, wait a short time for it to be ready.

If there is no runner at all, you will need to add one (adding a managed runner is easy).

Once our runner is online - click the Open button for Jupyter.

Train model

Once Jupyter has loaded - you should see a notebook called mnist-model.ipynb in the file tree.

Open this notebook and Run all cells.

You should notice the Dotscience plugin on the right-hand side pick up the run.

Build model

Click the Runs button at the top left and you should see your run.

Click the Models button at the top and you should see your model.

Click the Build button for your model. This triggers a GitLab pipeline on your GitLab account.

Wait for the build to finish and then click the Logs button - this will open the GitLab pipeline.

Deploy model

Now we can deploy this model using a managed deployer. Click Deploy for the model we just built.

Choose a managed deployer (in our case eu-west) and choose Create new deployment.

Then enter a name without dashes (e.g. mnisttest) and click the Deploy button.

Your model server is now being deployed to a Kubernetes cluster. Copy the hostname from the deployments page using the copy to clipboard button.

Open our example application at https://deploy-demo.dotscience.com/.

This allows us to test our model running in production. Paste the hostname you copied above into the text field and click on some numbers: your Dotscience-trained, GitLab-built model is now serving predictions!

Take a look in the sample GitLab project to understand how the model files from Dotscience are turned into a container image.

Using the CLI

The following command can be used to create the GitLab configuration from the CLI:

ds ci create gitlab --name {NAME} --url {URL} --project {PROJECT} --ref {REFERENCE} --token {TOKEN}

  • {NAME} - A custom configuration name used to identify the configuration by the dotscience-python library.
  • {URL} - The URL of the GitLab instance.
  • {PROJECT} - The name of the GitLab project.
  • {TOKEN} - Your GitLab Access Token.
  • {REFERENCE} - A git ref, such as a branch name.

Using our sample project, you’d use something similar to:

ds ci create gitlab --name my-gitlab-configuration --url https://gitlab.com --project <YOUR_USERNAME>/gitlab-model-builder-v2 --token 9RX-3CvNmX7voy1cszt_ --ref master

To confirm that this went through successfully, you can now run ds ci ls and you should see something like:

ID                                     NAME                      TYPE     AGE
e3d46e96-01e9-40e7-a229-abe4811f513b   my-gitlab-configuration   gitlab   2 seconds
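If you create configurations often, the ds ci create gitlab command shown above can be wrapped in a small function that reads the token from an environment variable, so the token does not have to be typed into the command line each time. This is a sketch under assumptions: the function name create_gitlab_ci_config and the variables GITLAB_TOKEN, GITLAB_URL and DS_BIN are hypothetical, and only the flags documented above are used.

```shell
# Hypothetical wrapper around the documented "ds ci create gitlab" command.
# $1 = configuration name, $2 = GitLab project, $3 = git ref (e.g. master)
create_gitlab_ci_config() {
  # Refuse to run without a token rather than sending an empty one.
  if [ -z "${GITLAB_TOKEN:-}" ]; then
    echo "GITLAB_TOKEN is not set" >&2
    return 1
  fi
  # DS_BIN can be set to "echo" to print the arguments instead of running ds.
  "${DS_BIN:-ds}" ci create gitlab \
    --name "$1" \
    --url "${GITLAB_URL:-https://gitlab.com}" \
    --project "$2" \
    --ref "$3" \
    --token "$GITLAB_TOKEN"
}
```

With GITLAB_TOKEN set in your shell, calling create_gitlab_ci_config my-gitlab-configuration <YOUR_USERNAME>/gitlab-model-builder-v2 master runs the same command as the sample above.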