Enable Meltano addon

A short guide on how to enable access to Meltano running on Dotscience.

Meltano is an open source, convention-over-configuration product for the whole data life cycle, from loading data to analyzing it [1]. You can create Meltano projects inside Dotscience workspaces and keep their data versioned. We automatically configure access so that you can connect to the Meltano instance when it is running on a remote runner.

Setting up Meltano

  1. Start a Jupyter task in your project (instructions on how to create and start a Jupyter Notebook can be found here)
  2. Open a terminal in the Jupyter notebook
  3. Using the ds CLI from your own computer, find the ID of the task:

    ds task ls
    ID                                     STATUS              WORKLOAD            ADDONS              AGE
    c1d5ee7c-6d6b-455c-a2e9-67ec11570d6a   running             jupyter                                 4 days
  4. Enable the Meltano addon for this task. The response contains some basic instructions and your publicly accessible URL:

    ds addon enable meltano c1d5ee7c-6d6b-455c-a2e9-67ec11570d6a

    And the response:

    $ ds addon enable meltano c1d5ee7c-6d6b-455c-a2e9-67ec11570d6a
    To use Meltano, create a project:
    
    $ meltano init my-project
    
    Go into the project directory:
    
    $ cd my-project
    
    Set remote connection environment variables:
    
    export MELTANO_AIRFLOW_URL=https://c1d5ee7c-5010.app.cloud.dotscience.com
    export MELTANO_UI_URL=https://c1d5ee7c-5000.app.cloud.dotscience.com
    
    Start Meltano:
    
    $ meltano ui
    
    Now, visit public Meltano URL:
    
    https://c1d5ee7c-5000.app.cloud.dotscience.com
  5. Install Meltano. Until the configurable tunnel branch is merged upstream, please use our fork; the instructions are in “Meltano installation (temporary)” below.

  6. Create a Meltano project (skip this step if you already have one):

    meltano init my-meltano-project
    cd my-meltano-project
  7. Before starting Meltano, make sure you have set the tunnelling environment variables from step 4, along with FLASK_ENV:

    export FLASK_ENV=development
    export MELTANO_AIRFLOW_URL=https://c1d5ee7c-5010.app.cloud.dotscience.com
    export MELTANO_UI_URL=https://c1d5ee7c-5000.app.cloud.dotscience.com
  8. Now start Meltano and go to the URL set in the MELTANO_UI_URL environment variable:

    meltano ui

    That’s it! For the next steps, see the Meltano “Getting Started” docs: https://meltano.com/docs/getting-started.html#create-your-first-project
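
Steps 4, 7 and 8 can be made less error-prone with two small shell helpers. The following is a minimal sketch, not part of ds or Meltano: the function names extract_meltano_exports and require_env are hypothetical, and the file name addon-output.txt in the usage comment is only an assumption about where you saved the response from step 4. The first helper pulls the export lines out of that response so they can be pasted into the Jupyter terminal; the second refuses to continue when a required variable is unset.

```shell
# Sketch only: both helper names are hypothetical, not part of ds or Meltano.

# Read the response of `ds addon enable meltano <task-id>` on stdin and print
# only the `export MELTANO_...` lines, with any leading whitespace removed.
extract_meltano_exports() {
  sed -n 's/^[[:space:]]*\(export MELTANO_[A-Z_]*=.*\)$/\1/p'
}

# Return non-zero and name the first variable that is unset or empty.
require_env() {
  for _name in "$@"; do
    # Look up the value of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing environment variable: $_name" >&2
      return 1
    fi
  done
}

# Example usage (addon-output.txt is an assumed file holding the step 4 response):
#   extract_meltano_exports < addon-output.txt
#   require_env FLASK_ENV MELTANO_AIRFLOW_URL MELTANO_UI_URL && meltano ui
```

Checking the variables right before `meltano ui` keeps a forgotten export from surfacing later as a confusing connection problem.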

Meltano installation (temporary)

These steps are temporary, until our fix for remote access is merged into upstream Meltano.

  1. Clone our forked repository:

    git clone https://github.com/dotmesh-io/meltano.git
    cd meltano
  2. Check out the feature branch:

    git checkout feature/configurable_flask_context
  3. Install dependencies:

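    pip3 install --upgrade pip
    pip3 install --upgrade setuptools

    # Create and activate a virtual environment for the fork
    apt-get install python3-venv
    python3 -m venv ~/virtualenvs/meltano-development
    source ~/virtualenvs/meltano-development/bin/activate
    pip3 install -r requirements.txt
    # Quote the extras so the shell does not treat the brackets as a glob
    pip3 install -e '.[dev]'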
  4. Prepare the frontend:

    curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
    echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
    
    apt update && apt install yarn    

    Then compile it:

    make bundle
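
After the installation steps, it can help to confirm that the tools they rely on are actually on your PATH. The following is a minimal sketch under that assumption; missing_tools is a hypothetical helper name, not part of Meltano, and the tool list in the usage comment is simply the set of commands used in this guide.

```shell
# Sketch only: `missing_tools` is a hypothetical helper, not part of Meltano.
# Print each named command that cannot be found on PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_tools() {
  _status=0
  for _tool in "$@"; do
    if ! command -v "$_tool" >/dev/null 2>&1; then
      echo "$_tool"
      _status=1
    fi
  done
  return "$_status"
}

# Example usage, run with the virtual environment from step 3 activated:
#   missing_tools git python3 pip3 yarn make meltano || echo "install the tools listed above"
```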