Install Dotscience on AWS

Dotscience installation on Amazon Web Services

Prerequisites

Clone repo

Start by cloning our open-source Terraform repo:

git clone https://github.com/dotmesh-io/dotscience-tf
cd dotscience-tf

Choose AWS as the cloud provider:

cd aws

Set up Terraform

Initialize Terraform. This ensures the relevant Terraform plugins are installed:

terraform init -upgrade

Ensure you are authenticated to your AWS account:

aws configure

There are several authentication mechanisms; see the AWS documentation for the different ways to set up access to AWS.
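
Before going further, it can help to confirm that the command-line tools this guide relies on (git, terraform, aws and make) are on your PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name and is not part of the dotscience-tf repo.

```shell
# missing_tools TOOL...
# Print the name of every tool that is not found on PATH, one per line.
# Hypothetical helper for illustration; not part of dotscience-tf.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, missing_tools git terraform aws make prints nothing when all four are installed. Once the AWS CLI is configured, aws sts get-caller-identity shows which account and identity your credentials resolve to.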

Set variables

Open inputs.tfvars in your favorite text editor, and put the following in it:

admin_password              = "a-secure-password"
letsencrypt_mode            = "production"
license_key                 = "a valid license key for Dotscience"
key_name                    = "An existing AWS keypair name"
hub_ingress_cidrs           = ["0.0.0.0/0"]
ssh_access_cidrs            = ["0.0.0.0/0"]
remote_runner_ingress_cidrs = ["0.0.0.0/0"]
grafana_admin_password      = "a-secure-password"
aws_role_arn                = "arn:aws:iam::account-id:role/role-name"
create_eks                  = true # or false
create_deployer             = true # or false
create_monitoring           = true # or false

tls_config_mode             = "dns_route53" # or "http" (the default)
dotscience_domain           = "your-route53-domain.net"

admin_password - Password to access Dotscience Hub. The default admin username is admin. See defaults in variables.tf. Additional users can be set up by navigating to https:///sign-up

letsencrypt_mode - Setting this to production tells Dotscience Hub to set up TLS with the production servers from Let’s Encrypt. This is optimised for a generic one-click install of Dotscience on Terraform. TLS setup with DNS challenges on custom domains is also available.

license_key - Get your license key from our Licensing service.

key_name - The name of the AWS key pair used to SSH into the virtual machines; you can create a new one in the AWS EC2 console.

grafana_admin_password - Password to access the Grafana monitoring dashboard. The default admin username is admin. See defaults in variables.tf.

You can optionally disable the creation of the EKS cluster, and of the model deployment and monitoring infrastructure that runs on it, by setting create_eks = false, create_deployer = false and create_monitoring = false. See defaults in variables.tf.

aws_role_arn - The AWS role ARN. Terraform will attempt to assume this role using the supplied credentials. For other ways to authenticate with AWS, using environment variables or static credentials, please refer to the Terraform AWS provider docs.

*ingress_cidrs - By default, the ingress CIDRs for the various firewall rules are restrictive. Use the variables hub_ingress_cidrs, ssh_access_cidrs and remote_runner_ingress_cidrs to allow access from specific IP addresses (in CIDR format). Remote runners are runners that can be provisioned outside AWS (e.g. on your workstation, or on your own servers).
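
Because these variables take lists of CIDR blocks, a quick format check before running Terraform can catch typos such as a missing prefix length. The sketch below is a hypothetical helper (is_ipv4_cidr is not part of dotscience-tf) that only validates the IPv4 a.b.c.d/n shape; it does not check that the block is the one you intended.

```shell
# is_ipv4_cidr VALUE
# Return 0 if VALUE looks like an IPv4 CIDR block (a.b.c.d/n), otherwise non-zero.
# Hypothetical helper for illustration; not part of dotscience-tf.
is_ipv4_cidr() {
  # One octet in the range 0-255.
  octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'
  # Prefix length in the range 0-32.
  prefix='([0-9]|[12][0-9]|3[0-2])'
  printf '%s\n' "$1" | grep -Eq "^(${octet}\\.){3}${octet}/${prefix}\$"
}
```

For example, is_ipv4_cidr "203.0.113.4/32" succeeds, while a bare address such as "203.0.113.4" is rejected. A single workstation would then be allowed with a setting such as ssh_access_cidrs = ["203.0.113.4/32"], where the address is a placeholder from the documentation range.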

Terraforming!

Now deploy the Dotscience stack:

make apply

It should print out the hostname you can access your stack on!

Wait a few minutes for the Hub to set itself up. (To see the startup logs, SSH into the Hub instance and run tail -f /var/log/cloud-init-output.log.)
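
Rather than checking by hand, you could poll until the Hub answers. A minimal sketch of a generic retry helper follows; wait_for is a hypothetical name, and the URL in the usage note is a placeholder for the hostname printed by make apply.

```shell
# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND up to ATTEMPTS times, sleeping DELAY_SECONDS between tries.
# Return 0 as soon as COMMAND succeeds, or 1 if every attempt fails.
# Hypothetical helper for illustration; not part of dotscience-tf.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep if another attempt is still to come.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, wait_for 30 10 curl -fsS -o /dev/null https://your-hub-hostname/ would try every 10 seconds for up to 30 attempts, with your-hub-hostname replaced by the real hostname.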

Now log in and do some data science!

Runners - where training happens

Runners are where model training happens, such as within Jupyter notebooks or via ds run.

Managed runners are VMs that are auto-provisioned when users create them through Dotscience. By default they are of type t3.medium, and they are destroyed automatically when idle to save money.

You can also attach non-managed runners, such as on-prem physical hardware, which can include GPUs. Simply go to the menu (top-right) in the app and click Runners, Add New Runner, and you’ll be given a docker run command to execute on your runner (e.g. DGX server).

Deployers - where inference and monitoring happen

Deployers are where models run.

You can also attach non-managed deployers, such as an on-prem Kubernetes cluster. Simply go to the menu (top-right) in the app and click Deployers, Add New Deployer, and you’ll be given a kubectl apply command to execute on your Kubernetes cluster.

DNS configuration for Dotscience Hub and Dotscience Deployer

DNS configuration for Dotscience Hub

Default mode

By default, tls_config_mode="http". This tells Dotscience to use the Let’s Encrypt HTTP-01 challenge to generate an SSL cert for the domain specified in dotscience_domain.

For ease of use, dotscience_domain defaults to .your.dotscience.net. This setting provisions the Hub on a URL such as https://a-b-c-d.your.dotscience.net, and uses Let’s Encrypt for SSL certificates. This works out of the box, and the wildcard DNS server is maintained by the Dotscience team.

However, in real production environments you would want to configure this to point to a custom domain, e.g. http://dotscience.corp.com. We support this mode, and also other ways to set up TLS: with a DNS challenge (see below), an HTTP challenge or custom certs.

Using AWS Route53 for certificate validation

Setting tls_config_mode="dns_route53" tells Dotscience to use AWS Route53 as the DNS provider to validate SSL certificates. This option requires that you control the domain you intend to use for Dotscience and that it is managed as an AWS Route53 Hosted Zone.

This mode also requires setting dotscience_domain = "your-route53-domain.com" to a domain that is controlled by AWS Route53. This allows Dotscience to register a subdomain on this hosted zone and then perform certificate validation on that subdomain. By default, this gives a URL like https://ds.your-route53-domain.com. The subdomain name can be changed with the Terraform variable environment, which defaults to "ds".
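
To make the naming concrete, the sketch below composes the Hub URL from the two variables. It assumes the pattern is exactly https://<environment>.<dotscience_domain>, as the example URL in the previous paragraph suggests; hub_url is a hypothetical helper, not something the Terraform stack provides.

```shell
# hub_url ENVIRONMENT DOMAIN
# Print the Hub URL implied by the environment and dotscience_domain variables.
# Assumes the https://<environment>.<dotscience_domain> pattern; hypothetical helper.
hub_url() {
  # Fall back to "ds", the documented default for environment, if it is empty.
  printf 'https://%s.%s\n' "${1:-ds}" "$2"
}
```

For example, hub_url ds your-route53-domain.com prints https://ds.your-route53-domain.com, and passing a different first argument shows how changing environment would change the subdomain under this assumption.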

Dotscience Deployer and DNS configuration for deployed models

Dotscience also supports using your own domain for model deployments. This uses the dotscience_domain specified earlier. Models deployed with the settings below will be available at http://deployment-id.model-ds.your-domain.net:

tls_config_mode   = "dns_route53"
dotscience_domain = "your-domain.net"

Roadmap

We plan to improve the Terraform stack in the following ways:

  • Add the ability to migrate data from older Dotscience installations that were not set up with Terraform.

Need help?

Jump on our Slack or contact us.