Intro to the Masterful CLI Trainer

The Masterful CLI Trainer is a command-line tool that trains a production-quality model with no code required. It takes care of many of the not-very-fun parts of computer vision (CV) model development:

  • Loading data into a tf.data.Dataset.

  • Preprocessing data.

  • Debugging tensor format mismatches.

  • Managing console output and logging.

  • Writing metrics and evaluation code.

Running the CLI Trainer

After you install Masterful via pip, the Masterful CLI Trainer is available as masterful-train. To run the trainer, use the following command:

$ masterful-train --config=<config_file>

When running from a Docker installation, the CLI Trainer is already installed. You can run it from an interactive terminal with the command above, or directly from Docker using:

$ docker run --rm --gpus all masterful/masterful:latest masterful-train --config=<config_file> 
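
For scripted runs, the two invocations above can be assembled programmatically. The following is a minimal sketch using only the Python standard library. The helper names build_train_command and run_training, and the config.yaml path, are hypothetical; the only flags used are the ones shown above.

```python
import shlex
import subprocess
from typing import List


def build_train_command(config_file: str, use_docker: bool = False) -> List[str]:
    """Return the argument list for a masterful-train run.

    Hypothetical helper: it only reproduces the two invocations shown above.
    """
    command = ["masterful-train", f"--config={config_file}"]
    if use_docker:
        # Same Docker prefix as in the command shown above.
        docker_prefix = [
            "docker", "run", "--rm", "--gpus", "all",
            "masterful/masterful:latest",
        ]
        command = docker_prefix + command
    return command


def run_training(config_file: str, use_docker: bool = False) -> int:
    """Run the trainer and return its exit code.

    Requires Masterful (or Docker) to be installed on the machine.
    """
    command = build_train_command(config_file, use_docker)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Print both commands without executing them.
    print(shlex.join(build_train_command("config.yaml")))
    print(shlex.join(build_train_command("config.yaml", use_docker=True)))
```

Building the command as a list rather than a single string avoids shell quoting problems when the config path contains spaces.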

See the quickstart tutorial to set up a <config_file> and train your first model.

Inputs and Outputs

The Masterful CLI Trainer takes as input a configuration file, passed with --config=<config_file>.

The Masterful CLI Trainer calls the Masterful Python API to:

  • Metalearn optimal hyperparameters.

  • Train a model with advanced regularization, semi-supervised learning (SSL), transfer learning, and optimization techniques.

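It then outputs evaluation metrics and the trained model. In the sample run under Console Output, the model is saved in saved_model and ONNX formats.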

To get started with your first training run, see the Quickstart Tutorial.

Console Output

Here’s the console output from a sample run.


$ python3 -m masterful.train --config=config.yaml
MASTERFUL: Training with configuration config.yaml.
MASTERFUL: Using model efficientnetb0_v1 with:
MASTERFUL:     4053414 total parameters
MASTERFUL:     4011391 trainable parameters
MASTERFUL:     42023 untrainable parameters
MASTERFUL: Caching I/O optimized dataset split 'train' to /datasets/8a3ee6437106398f8aa3024c4bce9efa8539eb4f...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1034/1034 [00:00<00:00, 1464.25it/s]
MASTERFUL: Caching I/O optimized dataset split 'validation' to /datasets/4a55a6889d01131140106e9f3d45885271b3cede...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 133/133 [00:00<00:00, 1312.08it/s]
MASTERFUL: Caching I/O optimized dataset split 'test' to /datasets/5dbe57e0c0ce96433ec06475498ff3a4b983779b...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128/128 [00:00<00:00, 1309.20it/s]
MASTERFUL: Dataset Analysis:
MASTERFUL: 	Training Dataset: 1034 examples.
MASTERFUL: 	Validation Dataset: 133 examples.
MASTERFUL: 	Test Dataset: 128 examples.
MASTERFUL: 	Unlabeled Dataset: 0 examples.
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
MASTERFUL: Meta-Learning optimization parameters...
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
MASTERFUL: Learning optimal batch size.
MASTERFUL: Learning optimal initial learning rate for batch size 64.
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
MASTERFUL: Meta-Learning Regularization Parameters...
MASTERFUL: Warming up model for analysis.
MASTERFUL: 	Warming up batch norm statistics (this could take a few minutes).
MASTERFUL: 	Warming up training for 520 steps.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 520/520 [01:19<00:00,  6.53steps/s]
MASTERFUL: 	Validating batch norm statistics after warmup for stability (this could take a few minutes).
MASTERFUL: Analyzing baseline model performance. Training until validation loss stabilizes...
Baseline Training: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 960/960 [04:04<00:00,  3.92steps/s]
MASTERFUL: Baseline training complete.
MASTERFUL: Meta-Learning Basic Data Augmentations...
Node 1/4: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:38<00:00,  4.07steps/s]
Node 2/4: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:39<00:00,  4.02steps/s]
Node 3/4: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:39<00:00,  4.01steps/s]
Node 4/4: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:39<00:00,  4.02steps/s]
MASTERFUL: Meta-Learning Data Augmentation Clusters...
Distance Analysis: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 143/143 [00:53<00:00,  2.65steps/s]
Node 1/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:51<00:00,  3.59steps/s]
Node 2/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:51<00:00,  3.58steps/s]
Node 3/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:52<00:00,  3.57steps/s]
Node 4/10:  10%|██████████▋                                                                                                      | 38/400 [00:05<01:21,  4.47steps/s]
Node 4/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:51<00:00,  3.58steps/s]
Node 5/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:51<00:00,  3.59steps/s]
Distance Analysis: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 66/66 [00:25<00:00,  2.57steps/s]
Node 6/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:53<00:00,  3.52steps/s]
Node 7/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:53<00:00,  3.52steps/s]
Node 8/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:53<00:00,  3.52steps/s]
Node 9/10: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:53<00:00,  3.52steps/s]
Node 10/10: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:53<00:00,  3.53steps/s]
MASTERFUL: Meta-Learning Label Based Regularization...
Node 1/2: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:54<00:00,  3.50steps/s]
Node 2/2: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 400/400 [01:54<00:00,  3.50steps/s]
MASTERFUL: Meta-Learning Weight Based Regularization...
MASTERFUL: Analysis finished in 38.25641816457112 minutes.
MASTERFUL: Learned parameters basketball-sophisticated-snagglefoot saved at /home/yaoshiang/.masterful/policies/basketball-sophisticated-snagglefoot.
MASTERFUL: Learning SSL parameters...
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
Log API_EVENT (400): {'app_exception': 'InvalidUUID', 'context': {'message': 'account id or password with bad format.'}}
MASTERFUL: Training model with semi-supervised learning disabled.
MASTERFUL: Performing basic dataset analysis.
MASTERFUL: Training model with:
MASTERFUL: 	1034 labeled examples.
MASTERFUL: 	133 validation examples.
MASTERFUL: 	0 synthetic examples.
MASTERFUL: 	0 unlabeled examples.
MASTERFUL: Training model with learned parameters basketball-sophisticated-snagglefoot in two phases.
MASTERFUL: The first phase is supervised training with the learned parameters.
MASTERFUL: The second phase is semi-supervised training to boost performance.
MASTERFUL: Warming up model for supervised training.
MASTERFUL: 	Warming up batch norm statistics (this could take a few minutes).
MASTERFUL: 	Warming up training for 520 steps.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 520/520 [01:22<00:00,  6.33steps/s]
MASTERFUL: 	Validating batch norm statistics after warmup for stability (this could take a few minutes).
MASTERFUL: Starting Phase 1: Supervised training until the validation loss stabilizes...
Supervised Training: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 920/920 [04:55<00:00,  3.12steps/s]
MASTERFUL: Semi-Supervised training disabled in parameters.
MASTERFUL: Training complete in 6.643984425067901 minutes.
MASTERFUL: ************************************
MASTERFUL: Evaluation Metrics:
MASTERFUL:   Loss: 0.1694
MASTERFUL:   Categorical Accuracy: 0.9844
MASTERFUL:   Average Precision: 0.9852
MASTERFUL:   Average Recall:    0.9845
MASTERFUL: 
MASTERFUL:   Per-Class Metrics:
MASTERFUL:     Precision (class=0): 0.9556
MASTERFUL:     Recall    (class=0): 1.0000
MASTERFUL:     ***
MASTERFUL:     Precision (class=1): 1.0000
MASTERFUL:     Recall    (class=1): 0.9535
MASTERFUL:     ***
MASTERFUL:     Precision (class=2): 1.0000
MASTERFUL:     Recall    (class=2): 1.0000
MASTERFUL:     ***
MASTERFUL: Saving model output to model_output.
MASTERFUL:     Saving saved_model output to model_output/saved_model
MASTERFUL:     Saving onnx output to model_output/onnx
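
Because the evaluation block has a regular "MASTERFUL:   Name: value" layout, the metrics can be pulled out of a captured log with a couple of regular expressions. This is a sketch based only on the sample output above: the function name parse_metrics is hypothetical, and the log format may differ between versions. It also checks an observation about this particular run: the reported Average Precision and Average Recall match the unweighted mean of the per-class values, which suggests (but does not confirm) macro averaging.

```python
import re
from statistics import mean
from typing import Dict, Tuple

# Excerpt copied from the sample run above.
SAMPLE_LOG = """\
MASTERFUL: Evaluation Metrics:
MASTERFUL:   Loss: 0.1694
MASTERFUL:   Categorical Accuracy: 0.9844
MASTERFUL:   Average Precision: 0.9852
MASTERFUL:   Average Recall:    0.9845
MASTERFUL:   Per-Class Metrics:
MASTERFUL:     Precision (class=0): 0.9556
MASTERFUL:     Recall    (class=0): 1.0000
MASTERFUL:     ***
MASTERFUL:     Precision (class=1): 1.0000
MASTERFUL:     Recall    (class=1): 0.9535
MASTERFUL:     ***
MASTERFUL:     Precision (class=2): 1.0000
MASTERFUL:     Recall    (class=2): 1.0000
"""

# "MASTERFUL:   <name>: <number>" with nothing after the number.
SUMMARY_RE = re.compile(r"^MASTERFUL:\s+([A-Za-z ]+?):\s+([0-9.]+)\s*$")
# "MASTERFUL:     Precision (class=0): <number>"
PER_CLASS_RE = re.compile(
    r"^MASTERFUL:\s+(Precision|Recall)\s+\(class=(\d+)\):\s+([0-9.]+)\s*$"
)


def parse_metrics(log_text: str) -> Tuple[Dict[str, float], Dict[int, Dict[str, float]]]:
    """Return (summary metrics, per-class metrics) found in the log text."""
    summary: Dict[str, float] = {}
    per_class: Dict[int, Dict[str, float]] = {}
    for line in log_text.splitlines():
        match = PER_CLASS_RE.match(line)
        if match:
            metric, class_id, value = match.groups()
            per_class.setdefault(int(class_id), {})[metric] = float(value)
            continue
        match = SUMMARY_RE.match(line)
        if match:
            summary[match.group(1)] = float(match.group(2))
    return summary, per_class


summary, per_class = parse_metrics(SAMPLE_LOG)
macro_precision = mean(c["Precision"] for c in per_class.values())
macro_recall = mean(c["Recall"] for c in per_class.values())

print(summary)
print(f"macro precision: {macro_precision:.4f}")
print(f"macro recall:    {macro_recall:.4f}")
```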

Built on the Python API

The Masterful CLI Trainer is implemented entirely on top of the Masterful Python API for TensorFlow, so you get all the power of the Masterful platform, including:

  • Semi-supervised learning to improve your models using unlabeled data.

  • Adaptive, comprehensive regularization that beats techniques like RandAugment on real-world data.

  • A custom meta-learner that eliminates guessing and checking hyperparameters.

CV Task Support

The Masterful CLI Trainer supports:

  • Binary Classification

  • Single-Label Multi-Class Classification

  • Multi-Label Multi-Class Classification

  • Object Detection

  • Semantic Segmentation
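
The three classification variants above differ mainly in how many classes a single image may carry, which shows up in the shape of the label vector. The sketch below illustrates the general definitions in plain Python. It is not Masterful's label format, and the helper names are hypothetical. Object detection and semantic segmentation use richer labels (boxes and per-pixel masks) and are not covered here.

```python
from typing import Iterable, List


def binary_label(is_positive: bool) -> List[float]:
    """Binary classification: a single value, 1.0 for the positive class."""
    return [1.0 if is_positive else 0.0]


def one_hot(class_index: int, num_classes: int) -> List[float]:
    """Single-label multi-class: exactly one class is active per image."""
    if not 0 <= class_index < num_classes:
        raise ValueError(f"class_index {class_index} outside [0, {num_classes})")
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]


def multi_hot(class_indices: Iterable[int], num_classes: int) -> List[float]:
    """Multi-label multi-class: any subset of classes may be active."""
    active = set(class_indices)
    if any(not 0 <= i < num_classes for i in active):
        raise ValueError("class index out of range")
    return [1.0 if i in active else 0.0 for i in range(num_classes)]


print(binary_label(True))    # [1.0]
print(one_hot(2, 3))         # [0.0, 0.0, 1.0]
print(multi_hot([0, 2], 3))  # [1.0, 0.0, 1.0]
```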