Tutorials

Start here to learn how to use Dotscience.

This section contains tutorials showing various ways that Dotscience can be used.

Quick Start

For a quick start, begin with the tutorials on notebook-based development and script-based development.

In this section

Collaboration tools

Jupyter notebooks are notoriously hard to use well with Git and GitHub. Dotscience lets you fork someone else's project, create new runs in notebooks, and propose them back along with their metrics. See a full, clear notebook diff and merge conflicting changes with ease.

Dotscience with Node-RED pipelines

Build automation flows with visual pipelines.

Dotscience for Data Engineering

We explore how data engineering scripts can be instrumented with functions from the Dotscience Python library to enable tracking and provenance.

Git Integration for Code

Dotscience Git integration allows you to synchronize GitHub repositories with Dotscience filesystems during the run.

Hyperparameter Optimization

We look at how you can use Dotscience to explore relationships between hyperparameters and metrics.

Integrating with GitLab

Dotscience allows you to hook into GitLab to give you more control over your model builds.

Monitoring models in production

We demonstrate how you can use Dotscience to provision a dashboard to monitor your models in production.

Notebook-based development

In this section we'll cover Jupyter notebook-based development with Dotscience, from setting up your model to deployment and monitoring.

Provenance

Learn how you can trace from a model to its training data, and from there back to the raw data.

Review experiments

Dotscience enables users to collaborate and offer feedback directly on individual runs.

S3 Integration for Datasets

This tutorial demonstrates Dotscience S3 integration with read-only datasets.

Deploying scikit-learn models

In this section we'll demonstrate how you can build and deploy scikit-learn models with Dotscience.

Using Dotscience with arbitrary programs

If you have arbitrary code you need to run, you can use the ds command-line tool to integrate it with Dotscience.

Using Dotscience in remote mode with Python scripts

If you want to develop Python scripts for model training, as opposed to Jupyter notebooks, the best way is to use the Dotscience Python library in 'remote' mode, known as Dotscience Anywhere.