Install Dotscience on AWS

Dotscience installation on Amazon Web Services

Prerequisites

Clone repo

Start by cloning our open-source Terraform repo:

git clone https://github.com/dotmesh-io/dotscience-tf
cd dotscience-tf

Choose AWS as the cloud provider:

cd aws

Set up Terraform

Initialize Terraform. This ensures that the plugins Terraform needs are installed:

terraform init

Ensure you are authenticated to your AWS account:

aws configure

There are several authentication mechanisms; see the AWS documentation for the different ways to set up access to AWS.
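Before going further, it can help to confirm which AWS identity the CLI is actually using. The sketch below wraps aws sts get-caller-identity, a standard AWS CLI command; the helper name check_aws_auth is hypothetical and not part of dotscience-tf.

```shell
# Sketch: confirm which AWS identity the CLI is currently using.
# check_aws_auth is a hypothetical helper name, not part of dotscience-tf.
check_aws_auth() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found on PATH" >&2
    return 1
  fi
  # Prints the account, user ID and ARN of the active credentials;
  # exits non-zero if the credentials are missing or invalid.
  aws sts get-caller-identity
}
```

Running check_aws_auth after aws configure should show the account you expect Terraform to deploy into.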

Set variables

Open inputs.tfvars in your favorite text editor, and put the following in it:

admin_password         = "a-secure-password"
letsencrypt_mode       = "production"
license_key            = "a valid license key for Dotscience"
hub_ingress_cidr       = "0.0.0.0/0"
key_name               = "An existing AWS keypair name"
ssh_access_cidr        = "0.0.0.0/0"
grafana_admin_password = "a-secure-password"
aws_role_arn           = "arn:aws:iam::account-id:role/role-name"

admin_password - Password to access Dotscience Hub. The default admin username is admin. See defaults in variables.tf. Additional users can be set up by navigating to https:///sign-up

letsencrypt_mode - Setting this to production tells Dotscience Hub to set up TLS using the Let’s Encrypt production servers, with a pre-allocated wildcard domain under *.your.dotscience.net. This is optimised for a generic one-click install of Dotscience on Terraform. TLS setup using DNS challenges with custom domains is also available; please contact us for setup instructions.

license_key - Get your license key from our Licensing service.

key_name - The name of the AWS key pair used to SSH in to the virtual machines. If you do not have one, you can create a new key pair in the AWS console.

grafana_admin_password - Password to access the Grafana monitoring dashboard. The default admin username is admin. See defaults in variables.tf.

You can optionally disable the creation of the EKS cluster, and of the model deployment and monitoring infrastructure (both of which use the underlying EKS cluster), by setting create_eks = false, create_deployer = false and create_monitoring = false. See defaults in variables.tf.

aws_role_arn - The AWS role ARN. Terraform will attempt to assume this role using the supplied credentials. For other ways to authenticate with AWS, such as environment variables and static credentials, please refer to the Terraform AWS provider docs.
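The example values above set hub_ingress_cidr and ssh_access_cidr to 0.0.0.0/0, which matches every IPv4 address. If you would rather narrow these before deploying, a quick check along the following lines can list the settings that are still wide open. This is a minimal sketch; find_open_cidrs is a hypothetical helper, not part of dotscience-tf.

```shell
# Sketch: list settings in a tfvars file that are open to every IP address.
# find_open_cidrs is a hypothetical helper, not part of dotscience-tf.
find_open_cidrs() {
  # $1: path to the tfvars file (defaults to inputs.tfvars)
  # Prints the name of each variable whose value is "0.0.0.0/0".
  awk -F= '/"0\.0\.0\.0\/0"/ { gsub(/[ \t]/, "", $1); print $1 }' "${1:-inputs.tfvars}"
}
```

Calling find_open_cidrs with no argument reads inputs.tfvars in the current directory and prints one variable name per line.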

Terraforming!

Now deploy the Dotscience stack:

terraform apply -var-file inputs.tfvars

It should print out the hostname you can access your stack on!
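If you prefer to review the changes before anything is created, one option is to save a plan first and apply exactly that plan. The sketch below uses the standard terraform plan, terraform apply and terraform output commands; the function name deploy_with_plan and the plan file name dotscience.tfplan are arbitrary choices for illustration.

```shell
# Sketch: preview the deployment, apply the saved plan, then show the outputs.
# deploy_with_plan and dotscience.tfplan are illustrative names only.
deploy_with_plan() {
  # Write the proposed changes to a plan file so they can be reviewed.
  terraform plan -var-file inputs.tfvars -out=dotscience.tfplan &&
  # Apply exactly what was planned.
  terraform apply dotscience.tfplan &&
  # Print the stack outputs again (this should include the hostname).
  terraform output
}
```

terraform output can also be run on its own later to look up the hostname again without re-applying.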

Wait a few minutes for the hub to set itself up. To see the startup logs, SSH into the Hub instance and run tail -f /var/log/cloud-init-output.log.
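Rather than refreshing the browser while you wait, you could poll the hub until it answers. The following is a rough sketch that assumes the hub responds over HTTPS at the root of the hostname printed by Terraform; wait_for_hub and the WAIT_SECONDS variable are hypothetical names, not part of Dotscience.

```shell
# Sketch: poll the hub until it responds over HTTPS.
# wait_for_hub and WAIT_SECONDS are hypothetical names, not part of Dotscience.
wait_for_hub() {
  url="https://$1"        # $1: hostname printed by terraform apply
  attempts="${2:-30}"     # $2: maximum number of tries (default 30)
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet, -o /dev/null: discard the body
    if curl -fsS -o /dev/null --max-time 10 "$url" 2>/dev/null; then
      echo "Hub is responding at $url"
      return 0
    fi
    sleep "${WAIT_SECONDS:-10}"
    i=$((i + 1))
  done
  echo "Hub did not respond after $attempts attempts" >&2
  return 1
}
```

With the defaults this waits up to roughly five minutes; if it times out, the cloud-init log on the Hub instance is the place to look.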

Now log in and do some data science!

Runners - where training happens

Runners are where model training happens, such as within Jupyter notebooks or via ds run.

Managed runners are VMs that are auto-provisioned when users create them through Dotscience. By default they are of type t3.medium. They are destroyed automatically when idle to save money.

You can also attach non-managed runners, such as on-prem physical hardware, which can include GPUs. Go to the menu (top-right) in the app, click Runners, then Add New Runner, and you’ll be given a docker run command to execute on your runner (e.g. a DGX server).

Deployers - where inference and monitoring happens

Deployers are where models run.

You can also attach non-managed deployers, such as an on-prem Kubernetes cluster. Go to the menu (top-right) in the app, click Deployers, then Add New Deployer, and you’ll be given a kubectl apply command to execute on your Kubernetes cluster.

Roadmap

We plan to improve the Terraform stack in the following ways:

  • Modularise the Terraform templates (for those who are curious, this makes the templates look nicer!)
  • Add periodic backups to the Dotscience Hub Volumes.
  • Add the ability to migrate data from older installations of Dotscience without Terraform.
  • Document how to set up your own domain and hostname in DNS and have Let’s Encrypt work for it.

Need help?

Jump on our Slack or contact a sales rep.