# Masterful CLI Trainer: Docker

## Introduction

The easiest way to get started with Masterful is to use Docker and the pre-built [Masterful Docker Images](https://hub.docker.com/r/masterful/masterful) on Docker Hub. You can train and view the training results with the Docker Images without having to configure or set up any custom Python environments.

## Prerequisites

To use the Masterful Docker containers, you must have both [Docker](https://docs.docker.com/get-docker/) and the [NVIDIA Docker Container Extensions](https://github.com/NVIDIA/nvidia-docker) installed, so that your Docker containers can use the GPU.

## Masterful Docker Images

The Masterful Docker Images are stored under the `masterful/masterful` namespace and tagged by version number. For example, the `0.6.0` release of Masterful can be pulled with:

`docker pull masterful/masterful:0.6.0`

The latest version of Masterful is always tagged as `latest`, so in general you should run using the `latest` tag, such as:

`docker pull masterful/masterful:latest`

or

`docker run --rm --gpus all masterful/masterful:latest masterful-train --config `

You can see a list of all of the Masterful Docker Image tags [here](https://hub.docker.com/r/masterful/masterful/tags).

### CPU vs GPU

In general, you should train a machine learning model on a GPU. However, Masterful also provides CPU-only Docker images for exploring the product on hardware that does not have NVIDIA GPUs installed. These images are tagged with `-cpu` in Docker Hub. For example, to run the latest CPU image:

`$ docker run --rm masterful/masterful:latest-cpu masterful-train --config `

## Training with the Docker Images

Training with the Masterful CLI using the Docker Images is as simple as running the `latest` image from Docker Hub.
For example, the following command will run the [Quickstart](../notebooks/tutorial_quickstart_cli.ipynb) tutorial using the `latest` Masterful release:

```shell
$ docker run --rm --gpus all masterful/masterful:latest masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
```

### Configuring the Docker Images

In general, the Docker Images are self-contained, and data written during training is not accessible to the host OS by design. However, you will still want access to the model and other artifacts produced during training. To get it, map the directory in the container where the artifacts are stored to a directory on the host system.

For example, all of the sample YAML files store their output in the `~/model_output` directory (see the [output](../markdown/guide_cli_yaml_config.md#output) section of the configuration YAML). With a default Docker setup, running as root in the container, the training artifacts are written to `/root/model_output` *in the container*. Therefore, you just need to map that container directory to a local directory using the `--volume`/`-v` option of `docker run` ([reference](https://docs.docker.com/engine/reference/commandline/run/)). For example:

```shell
$ docker run -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config
```

#### Pro Tip

Masterful caches log files and optimized datasets under `$HOME/.masterful`, so you can save some time and avoid re-downloading datasets if you also map this directory.
For example:

```shell
$ docker run -v $HOME/.masterful:/root/.masterful:rw -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config
```

The following are directories that you will probably want to mount in your container:

```
Masterful Directory:      -v $HOME/.masterful:/root/.masterful:rw
Model Output (from YAML): -v $HOME/model_output:/root/model_output:rw
AWS Credentials:          -v $HOME/.aws/credentials:/root/.aws/credentials:ro
```

## Masterful GUI with the Docker Images

You can also run the Masterful GUI using the Docker images to visualize the output of your training runs. You will need to forward the network port used by Masterful GUI from your container to the host system. The easiest way to do this is to use the `host` networking option in the `docker run` command. The following will run Masterful GUI in the container and expose the client port on the host system:

```shell
docker run -v $HOME/.masterful:/root/.masterful:rw --network="host" --rm --gpus all masterful/masterful:latest masterful-gui
```
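
The image tags and volume mappings described in the sections above can be combined in a small wrapper. The following is a minimal sketch, not part of Masterful: the function names (`masterful_image`, `masterful_mounts`, `masterful_command`) and the `gpu`/`cpu` argument are hypothetical, and the script only prints the `docker run` command so that you can inspect it before running it yourself. It assumes the container runs as root, as in the examples above.

```shell
#!/usr/bin/env bash
# Sketch: assemble (but do not run) a `docker run` command for Masterful.
# All function names here are illustrative; they are not part of Masterful.

# Print the image reference for a device type: "cpu", or anything else for GPU.
masterful_image() {
  case "$1" in
    cpu) echo "masterful/masterful:latest-cpu" ;;
    *)   echo "masterful/masterful:latest" ;;
  esac
}

# Print the -v options for the directories listed in this guide.
# $1 is the host home directory to map from.
# Note: the result is built as plain text for display, so host paths
# containing spaces would need extra quoting before being executed.
masterful_mounts() {
  local home="$1"
  printf '%s ' \
    "-v $home/.masterful:/root/.masterful:rw" \
    "-v $home/model_output:/root/model_output:rw"
  # Mount AWS credentials read-only, and only if the file exists on the host.
  if [ -f "$home/.aws/credentials" ]; then
    printf '%s ' "-v $home/.aws/credentials:/root/.aws/credentials:ro"
  fi
}

# Print the full command.
# $1: device (gpu|cpu), $2: host home directory, $3...: command for the container.
masterful_command() {
  local device="$1" home="$2"
  shift 2
  local gpu_flag=""
  if [ "$device" = "gpu" ]; then
    gpu_flag="--gpus all "
  fi
  echo "docker run --rm ${gpu_flag}$(masterful_mounts "$home")$(masterful_image "$device") $*"
}

# Example: print the Quickstart training command for a GPU host.
masterful_command gpu "$HOME" masterful-train \
  --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without Docker or a GPU. Once the printed command looks right, it can be pasted into a terminal as-is.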