Masterful CLI Trainer: Docker

Introduction

The easiest way to get started with Masterful is to use Docker and the pre-built Masterful Docker Images on Docker Hub. With the Docker Images, you can train models and view the training results without configuring or setting up a custom Python environment.

Prerequisites

To use the Masterful Docker containers, you must have both Docker and the NVIDIA Docker Container Extensions installed, so that your Docker containers can use the GPU.
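Before pulling an image, it can help to confirm that the prerequisites are visible from your shell. The following is a minimal sketch, not an official Masterful check: the helper name check_prereq is hypothetical, and it only tests whether the docker and nvidia-smi executables are on your PATH. Finding nvidia-smi suggests the NVIDIA driver is installed on the host, but it does not by itself prove that the container extensions are configured.

```shell
#!/bin/sh
# check_prereq is a hypothetical helper: reports whether a command is on PATH.
check_prereq() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the two host-side tools this page relies on.
check_prereq docker || echo "Install Docker before continuing."
check_prereq nvidia-smi || echo "No NVIDIA driver tools found; consider the CPU-only images."
```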

Masterful Docker Images

The Masterful Docker Images are stored under the masterful/masterful namespace and tagged by version number. For example, the 0.6.0 release of Masterful can be pulled with:

$ docker pull masterful/masterful:0.6.0

The latest version of Masterful is always tagged as latest, so in general you should run using the latest tag, for example:

$ docker pull masterful/masterful:latest

or

$ docker run --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>

You can see a list of all of the Masterful Docker Image tags on Docker Hub.
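If you switch between a specific release and the latest tag, a small helper keeps the image reference in one place. This is only a sketch: image_ref is a hypothetical function name, and it prints the pull commands instead of running them.

```shell
#!/bin/sh
# image_ref is a hypothetical helper: builds an image reference from a tag,
# defaulting to "latest" when no tag is given.
image_ref() {
  echo "masterful/masterful:${1:-latest}"
}

# Print the pull commands rather than running them.
echo "docker pull $(image_ref 0.6.0)"   # a specific release
echo "docker pull $(image_ref)"         # the latest release
```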

CPU vs GPU

In general, you should train a machine learning model on a GPU. However, Masterful also provides CPU-only Docker Images for exploring the product on hardware that does not have NVIDIA GPUs installed. These images are tagged with <version>-cpu on Docker Hub. For example, to run the latest CPU image:

$ docker run --rm masterful/masterful:latest-cpu masterful-train --config <config_file>
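The choice between the GPU and CPU images can be scripted. The sketch below assumes that a working nvidia-smi on the host is a reasonable signal that a GPU is usable, which may not hold on every setup. masterful_cmd is a hypothetical helper name, and it only prints the command so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the docker run command suited to this machine.
# It only prints the command; it does not start a container.
masterful_cmd() {
  config="$1"
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    # A GPU was listed: use the default image and pass the GPUs through.
    echo "docker run --rm --gpus all masterful/masterful:latest masterful-train --config $config"
  else
    # No usable GPU found: fall back to the CPU-only image.
    echo "docker run --rm masterful/masterful:latest-cpu masterful-train --config $config"
  fi
}

masterful_cmd "<config_file>"
```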

Training with the Docker Images

Training with the Masterful CLI using the Docker Images is as simple as running the latest image from Docker Hub. For example, the following command will run the Quickstart tutorial using the latest Masterful release:

$ docker run --rm --gpus all masterful/masterful:latest masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml

Configuring the Docker Images

In general, the Docker Images are self-contained, and by design data written during training is not accessible to the host OS. To reach the model and other artifacts produced during training, map the container directory where the artifacts are stored to a directory on the host system. For example, all of the sample YAML files store their output in the ~/model_output directory (see the output section of the configuration YAML). With a default Docker setup, running as root in the container, the training artifacts are written to /root/model_output in the container. You therefore need to map that container directory to a local directory using the --volume/-v option of docker run (see the Docker documentation). For example:

$ docker run -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>

Pro Tip

Masterful caches log files and optimized datasets under $HOME/.masterful, so you can save some time and avoid re-downloading datasets if you also map this directory. For example:

$ docker run -v $HOME/.masterful:/root/.masterful:rw -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>

The following are directories that you will probably want to mount in your container:

Masterful Directory:      -v $HOME/.masterful:/root/.masterful:rw
Model Output (from YAML): -v $HOME/model_output:/root/model_output:rw
AWS Credentials:          -v $HOME/.aws/credentials:/root/.aws/credentials:ro
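The mounts listed above can be assembled by a small script. This is a sketch under stated assumptions: masterful_mounts is a hypothetical helper, the container paths assume the default root user described earlier, and the host paths must not contain spaces. It creates the host directories up front because Docker otherwise creates missing bind-mount paths itself, typically owned by root, and it adds the AWS credentials mount only when the file exists.

```shell
#!/bin/sh
# Hypothetical helper: prints the -v flags for a given host home directory.
masterful_mounts() {
  home_dir="$1"
  # Create the host directories first so Docker does not create them for you.
  mkdir -p "$home_dir/.masterful" "$home_dir/model_output"
  flags="-v $home_dir/.masterful:/root/.masterful:rw"
  flags="$flags -v $home_dir/model_output:/root/model_output:rw"
  # Mount AWS credentials read-only, but only when the file actually exists.
  if [ -f "$home_dir/.aws/credentials" ]; then
    flags="$flags -v $home_dir/.aws/credentials:/root/.aws/credentials:ro"
  fi
  echo "$flags"
}

# Example (prints the command rather than running it):
#   echo "docker run $(masterful_mounts "$HOME") --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>"
```

Printing the command first lets you check the mounts before any container is started.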

Masterful GUI with the Docker Images

You can also run the Masterful GUI from the Docker Images to visualize the output of your training runs. To do so, forward the network port used by Masterful GUI from your container to the host system. The easiest way is to use the host networking option of the docker run command. The following runs Masterful GUI in the container and exposes the client port on the host system:

$ docker run -v $HOME/.masterful:/root/.masterful:rw --network="host" --rm --gpus all masterful/masterful:latest masterful-gui