Masterful CLI Trainer: Docker¶
Introduction¶
The easiest way to get started with Masterful is to use Docker and the pre-built Masterful Docker Images on Docker Hub. You can train and view the training results with the Docker Images, without having to configure or set up any custom Python environments.
Prerequisites¶
To use the Masterful Docker containers, you must have both Docker and the NVIDIA Docker Container Extensions installed, so that your Docker containers can use the GPU.
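Before pulling an image, it can help to confirm that the required tools are present on the host. The following is a minimal sketch and not part of Masterful: `have_cmd` and `check_prereqs` are hypothetical helper names, and looking for `nvidia-smi` on the host is only a rough proxy for a working NVIDIA driver. It does not prove that the container extensions themselves are installed.

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report which prerequisites appear to be missing.
# Returns 0 if both tools were found, 1 otherwise.
check_prereqs() {
  _missing=0
  for _tool in docker nvidia-smi; do
    if ! have_cmd "$_tool"; then
      echo "missing: $_tool" >&2
      _missing=1
    fi
  done
  return "$_missing"
}
```

Calling `check_prereqs` prints one line per missing tool on standard error and returns a non-zero status if any tool is absent.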
Masterful Docker Images¶
The Masterful Docker Images are stored under the masterful/masterful namespace and tagged by version number. For example, the 0.6.0 release of Masterful can be pulled with:
docker pull masterful/masterful:0.6.0
The latest version of Masterful is always tagged as latest, so in general, you should always run using the latest tag, such as:
docker pull masterful/masterful:latest
or
docker run --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>
You can see a list of all of the Masterful Docker Image tags on Docker Hub.
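When a specific release is needed, the tag can be built from a version string. This is a small sketch under the naming scheme described above; `masterful_image` is a hypothetical helper name and not part of the Masterful CLI.

```shell
# Hypothetical helper: print the image reference for a given version.
# With no argument it falls back to the "latest" tag.
masterful_image() {
  echo "masterful/masterful:${1:-latest}"
}

# Example: pull a pinned release (uncomment to run).
# docker pull "$(masterful_image 0.6.0)"
```

Pinning a version number instead of latest means that repeated runs use the same image even after a new release is published.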
CPU vs GPU¶
By default, you should always train a machine learning model on a GPU. However, Masterful also provides CPU-only Docker images for exploring the product on hardware that does not have NVIDIA GPUs installed. These images are tagged with <version>-cpu on Docker Hub. For example, to run the latest CPU image, run:
$ docker run --rm masterful/masterful:latest-cpu masterful-train --config <config_file>
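The choice between the GPU and CPU images can be scripted. The sketch below assumes the tag scheme described above; `masterful_run_args` is a hypothetical helper name, and using the presence of `nvidia-smi` on the host to detect a GPU is an assumption rather than something Masterful prescribes.

```shell
# Hypothetical helper: print the docker run flags and image for a device.
# Usage: masterful_run_args gpu|cpu [version]
masterful_run_args() {
  _device="$1"
  _version="${2:-latest}"
  case "$_device" in
    gpu) echo "--rm --gpus all masterful/masterful:${_version}" ;;
    cpu) echo "--rm masterful/masterful:${_version}-cpu" ;;
    *)   echo "unknown device: $_device" >&2; return 1 ;;
  esac
}

# Assumption: nvidia-smi on the host indicates a usable NVIDIA GPU.
if command -v nvidia-smi >/dev/null 2>&1; then device=gpu; else device=cpu; fi

# Word splitting of the helper's output is intentional here (uncomment to run).
# docker run $(masterful_run_args "$device") masterful-train --config <config_file>
```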
Training with the Docker Images¶
Training with the Masterful CLI using the Docker Images is as simple as running the latest image from Docker Hub. For example, the following command will run the Quickstart tutorial using the latest Masterful release:
$ docker run --rm --gpus all masterful/masterful:latest masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
Configuring the Docker Images¶
In general, the Docker Images are self-contained, and by design, data written during training is not accessible to the host OS. To access the model and other artifacts produced during training, map the container directory where the artifacts are stored to a directory on the host system. For example, all of the sample YAML files store their output in the ~/model_output directory (see the output section of the configuration YAML). With a default Docker setup, running as root in the container, this places the training artifacts in /root/model_output in the container. Therefore, you just need to map that container directory to a local directory using the --volume/-v option of docker run. For example:
$ docker run -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>
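If the host directory does not exist when the container starts, Docker creates it for the -v mount, typically owned by root. Creating it first as your own user avoids that. The following is a sketch only; `output_mount` is a hypothetical helper name, and the container path /root/model_output assumes the sample YAML output location described above.

```shell
# Hypothetical helper: create the host output directory, then print the
# matching -v flag for docker run.
# Usage: output_mount [host_dir]   (defaults to $HOME/model_output)
output_mount() {
  _host_dir="${1:-$HOME/model_output}"
  mkdir -p "$_host_dir" || return 1
  echo "-v ${_host_dir}:/root/model_output:rw"
}

# Word splitting of the helper's output is intentional, so this form only
# works for host paths without spaces (uncomment to run).
# docker run $(output_mount) --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>
```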
Pro Tip¶
Masterful caches log files and optimized datasets under $HOME/.masterful, so you can save some time and avoid re-downloading datasets if you also map this directory. For example:
$ docker run -v $HOME/.masterful:/root/.masterful:rw -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>
The following are directories that you will probably want to mount in your container:
Masterful Directory: -v $HOME/.masterful:/root/.masterful:rw
Model Output (from YAML): -v $HOME/model_output:/root/model_output:rw
AWS Credentials: -v $HOME/.aws/credentials:/root/.aws/credentials:ro
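The mounts listed above can be collected in one place. This is a hedged sketch: `masterful_mounts` is a hypothetical helper name. It skips the AWS credentials mount when the file is absent, because Docker would otherwise create a directory at that host path.

```shell
# Hypothetical helper: print one -v flag per line for the common mounts.
# Usage: masterful_mounts [host_home]   (defaults to $HOME)
masterful_mounts() {
  _home="${1:-$HOME}"
  echo "-v ${_home}/.masterful:/root/.masterful:rw"
  echo "-v ${_home}/model_output:/root/model_output:rw"
  # Mount credentials read-only, and only if the file actually exists.
  if [ -f "${_home}/.aws/credentials" ]; then
    echo "-v ${_home}/.aws/credentials:/root/.aws/credentials:ro"
  fi
}

# Word splitting of the helper's output is intentional (uncomment to run).
# docker run $(masterful_mounts) --rm --gpus all masterful/masterful:latest masterful-train --config <config_file>
```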
Masterful GUI with the Docker Images¶
You can also run the Masterful GUI using the Docker images to visualize the output of your training runs. You will need to forward the network port used by Masterful GUI from your container to the host system. The easiest way to do this is to use the host networking option in the docker run command. The following will run Masterful GUI in the container and expose the client port on the host system:
docker run -v $HOME/.masterful:/root/.masterful:rw --network="host" --rm --gpus all masterful/masterful:latest masterful-gui