Quickstart Tutorial¶
In this short introduction to Masterful, you will train and evaluate a model from start to finish with the Masterful CLI Trainer. The example walks you through setting up your data and model, then using them for training and evaluation.
Prerequisites¶
Ensure you have TensorFlow installed, then install Masterful. For more details, follow the Masterful installation instructions.
[1]:
!pip install --upgrade pip --quiet
!pip install masterful --quiet &> /dev/null
TL;DR¶
Don’t want to read the rest of this guide, and want to start training immediately? The following command shows you how to start training with Masterful, using a configuration file and dataset on S3.
The sections that follow go into more detail on what is happening under the hood and explain the dataset and configuration file formats.
For more information on the configuration file and dataset formats, see Masterful Configuration File and Masterful Dataset Format.
From Docker Install:
$ docker run -v $HOME/model_output:/root/model_output:rw --rm --gpus all masterful/masterful:latest masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
From Pip Install:
$ masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
Set Up the Dataset¶
The first step in any training project is to collect the data you will train with. Masterful has a simple, flexible CSV-based format for images and labels, which makes it straightforward to prepare your data for training.
A typical dataset consists of a set of images and their labels. These images and labels are then split into different sets, typically called training, validation, and test/holdout sets. The training set is used to train your model, the validation set is used to measure the performance of your model during training, and the test/holdout set is used to measure the generalization performance of your model on data it has never seen before. Masterful only requires a training set. If there is no validation set, Masterful will create one from the training set. If there is no test/holdout set, Masterful will not evaluate your model on that set and will not report evaluation metrics for the model.
Masterful uses a simple CSV file format to describe the images and labels in your dataset. Typically you create a separate CSV file for each of the dataset splits (training, validation, and test) that you want to use during training. For this tutorial, you will use a simple Flowers dataset (the same dataset used in the Keras Image Classification tutorial). The images and labels for this dataset are stored in the public AWS S3 bucket s3://masterful-public/datasets/quickstart/. This bucket has the following structure:
quickstart/
    daisy/
    dandelion/
    roses/
    sunflowers/
    tulips/
    training.yaml
    test.csv
    train.csv
    validation.csv
As you can see, there are CSV files called train.csv, test.csv, and validation.csv, which describe the training, test, and validation dataset splits respectively. You can also see a training.yaml file, a YAML-formatted configuration file that defines all of the information Masterful needs to train on the above dataset. You will learn more about the configuration file below.
For more information about the Masterful dataset format, see Masterful Dataset Format.
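As a rough illustration of how a labeled image list can be divided into the three splits described above, here is a minimal sketch in plain Python. The 80/10/10 ratio reproduces the split sizes reported later in the training log (2936/367/367). The file paths, the `split_dataset` and `to_csv_text` helpers, and the two-column `path,label` layout are assumptions made for this example only; consult Masterful Dataset Format for the actual CSV columns.

```python
import csv
import io
import random


def split_dataset(rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle rows deterministically and split them into three lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    validation = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # whatever remains is the holdout set
    return train, validation, test


def to_csv_text(rows):
    """Serialize rows as CSV text (hypothetical path,label layout)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical file names; the real dataset uses different image names.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
rows = [
    (f"{labels[i % 5]}/image_{i:04d}.jpg", labels[i % 5]) for i in range(3670)
]

train, validation, test = split_dataset(rows)
print(len(train), len(validation), len(test))
print(to_csv_text(train[:2]))
```

Fixing the random seed keeps the splits reproducible, so an example never moves between the training and test sets from one run to the next.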
Explore the Data¶
There are 5 classes in the dataset listed above, corresponding to different types of flowers. You can see examples of each class in the plot below.
[2]:
import matplotlib.pyplot as plt
import PIL.Image  # import the submodule explicitly so PIL.Image.open resolves
import requests
from io import BytesIO
daisies = [
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/daisy/100080576_f52e8ee070_n.jpg",
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/daisy/10140303196_b88d3d6cec.jpg",
]
dandelions = [
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/dandelion/10043234166_e6dd915111_n.jpg",
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/dandelion/10200780773_c6051a7d71_n.jpg",
]
roses = [
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/roses/10090824183_d02c613f10_m.jpg",
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/roses/102501987_3cdb8e5394_n.jpg",
]
sunflowers = [
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/sunflowers/1008566138_6927679c8a.jpg",
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/sunflowers/1022552002_2b93faf9e7_n.jpg",
]
tulips = [
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/tulips/100930342_92e8746431_n.jpg",
"https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/tulips/10094729603_eeca3f2cb6.jpg",
]
images = [daisies, dandelions, roses, sunflowers, tulips]
ROWS = 2
COLUMNS = len(images)
f, axarr = plt.subplots(ROWS, COLUMNS, figsize=(15, 5))
# Each inner list is one class (one plot column); each URL fills one row.
for col, image_col in enumerate(images):
    for row, image_url in enumerate(image_col):
        with requests.get(image_url) as response:
            image = PIL.Image.open(BytesIO(response.content))
            axarr[row, col].imshow(image)
Configure Training¶
Masterful uses a simple YAML configuration file to set up the training parameters. The configuration file has five major sections: dataset, model, training, evaluation, and output. The configuration file used in this tutorial is available at https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml.
For more information on the configuration file format, see Masterful Configuration File.
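Before launching a long training run, it can help to confirm that a configuration contains all five top-level sections named above. The sketch below is only an illustration: the `missing_sections` helper is hypothetical, a plain dictionary stands in for the parsed YAML file, and Masterful's own validation may check considerably more than this.

```python
# The five top-level sections named in this tutorial.
REQUIRED_SECTIONS = ("dataset", "model", "training", "evaluation", "output")


def missing_sections(config):
    """Return the required section names that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical, partially written configuration; values are placeholders.
config = {"dataset": {}, "model": {}, "training": {}}
print(missing_sections(config))  # ['evaluation', 'output']
```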
Let’s Get Training¶
Simply point Masterful to the configuration file you created, and Masterful will begin training on your data. At the end of training, Masterful will summarize the performance metrics after evaluating the model on your test/holdout dataset.
[3]:
!masterful-train --config=https://masterful-public.s3.us-west-1.amazonaws.com/datasets/quickstart/training.yaml
MASTERFUL: Your account has been successfully registered. Masterful v0.5.0 is loaded.
MASTERFUL [12:04:47]: Training with configuration 's3://masterful-public/datasets/quickstart/training.yaml'.
MASTERFUL [12:04:49]: Using model efficientnetb0_v1 with:
MASTERFUL [12:04:49]: 4055976 total parameters
MASTERFUL [12:04:49]: 4013953 trainable parameters
MASTERFUL [12:04:49]: 42023 untrainable parameters
MASTERFUL [12:04:50]: Caching I/O optimized dataset split 'train' to /home/yaoshiang/.masterful/datasets/36158d44fb4f0c63ac240db51678fac4f63ba328...
100%|███████████████████████████████████████| 2936/2936 [05:44<00:00, 8.52it/s]
MASTERFUL [12:10:42]: Caching I/O optimized dataset split 'validation' to /home/yaoshiang/.masterful/datasets/0cc422076c324bbadaf751dfa2cc5845df0f8cb4...
100%|█████████████████████████████████████████| 367/367 [00:40<00:00, 9.07it/s]
MASTERFUL [12:11:29]: Caching I/O optimized dataset split 'test' to /home/yaoshiang/.masterful/datasets/3877e18299c65143f40085b01eeb88ebe52f5930...
100%|█████████████████████████████████████████| 367/367 [00:53<00:00, 6.91it/s]
MASTERFUL [12:12:27]: Dataset Summary:
MASTERFUL [12:12:27]: Training Dataset: 2936 examples.
MASTERFUL [12:12:27]: Validation Dataset: 367 examples.
MASTERFUL [12:12:27]: Test Dataset: 367 examples.
MASTERFUL [12:12:27]: Unlabeled Dataset: 0 examples.
MASTERFUL [12:12:28]: Training Dataset Analysis:
100%|██████████████████████████████████████| 2936/2936 [00:14<00:00, 199.18it/s]
MASTERFUL [12:12:42]: Training dataset analysis finished at 12:12:42 in 15 seconds (15s), returned:
------------------ ----------------------------------------
Total Examples 2936
Label Counts dandelion 713
rose 510
tulip 634
daisy 514
sunflower 565
Label Distribution dandelion 0.242847
rose 0.173706
tulip 0.21594
daisy 0.175068
sunflower 0.192439
Balanced Yes
Per Channel Mean [118.7795914 108.89870996 76.78918076]
Per Channel StdDev [64.67416621 57.99644922 59.40810724]
Min Height 180
Min Width 143
Max Height 442
Max Width 1024
Average Height 272
Average Width 366
Largest Image (442, 500, 3)
Smallest Image (240, 143, 3)
Duplicates 1
------------------ ----------------------------------------
MASTERFUL [12:12:42]: WARNING: Duplicates detected in dataset split 'train'.
MASTERFUL [12:12:42]: WARNING: You can find the duplicate entries using the tool:
MASTERFUL [12:12:42]: WARNING: python -m masterful.data.duplicate_detector --config=s3://masterful-public/datasets/quickstart/training.yaml
MASTERFUL [12:12:42]: Validation Dataset Analysis:
100%|████████████████████████████████████████| 367/367 [00:01<00:00, 206.93it/s]
MASTERFUL [12:12:44]: Validation dataset analysis finished at 12:12:44 in 2 seconds (2s), returned:
------------------ ----------------------------------------
Total Examples 367
Label Counts dandelion 95
rose 74
daisy 55
sunflower 62
tulip 81
Label Distribution dandelion 0.258856
rose 0.201635
daisy 0.149864
sunflower 0.168937
tulip 0.220708
Balanced No
Per Channel Mean [116.63373761 106.30352827 74.48895039]
Per Channel StdDev [62.93288427 55.68255178 56.71276635]
Min Height 180
Min Width 157
Max Height 429
Max Width 640
Average Height 268
Average Width 355
Largest Image (429, 500, 3)
Smallest Image (240, 157, 3)
Duplicates 0
------------------ ----------------------------------------
MASTERFUL [12:12:44]: Test Dataset Analysis:
100%|████████████████████████████████████████| 367/367 [00:01<00:00, 200.34it/s]
MASTERFUL [12:12:46]: Test dataset analysis finished at 12:12:46 in 2 seconds (2s), returned:
------------------ ----------------------------------------
Total Examples 367
Label Counts dandelion 90
rose 57
sunflower 72
daisy 64
tulip 84
Label Distribution dandelion 0.245232
rose 0.155313
sunflower 0.196185
daisy 0.174387
tulip 0.228883
Balanced Yes
Per Channel Mean [112.55877415 105.24692186 76.16738029]
Per Channel StdDev [64.2522064 57.23987375 58.24934888]
Min Height 159
Min Width 159
Max Height 404
Max Width 1024
Average Height 271
Average Width 363
Largest Image (404, 500, 3)
Smallest Image (240, 159, 3)
Duplicates 0
------------------ ----------------------------------------
MASTERFUL [12:12:46]: Cross-Dataset Analysis:
MASTERFUL [12:12:46]: Cross-Dataset analysis finished at 12:12:46 in 0 seconds (0s), returned:
---------- -------------
train train 1
validation 1
test 1
validation train 1
validation 0
test 0
test train 1
validation 0
test 0
---------- -------------
MASTERFUL [12:12:46]: Meta-Learning architecture parameters...
MASTERFUL [12:12:49]: Architecture learner finished at 12:12:49 in 3 seconds (3s), returned:
------------------------------ -----------------------------
task Task.CLASSIFICATION
num_classes 5
ensemble_multiplier 1
custom_objects {}
model_config
backbone_only False
input_shape (180, 180, 3)
input_range ImageRange.ZERO_255
input_dtype <dtype: 'float32'>
input_channels_last True
prediction_logits True
prediction_dtype <dtype: 'float32'>
prediction_structure TensorStructure.SINGLE_TENSOR
prediction_shape (5,)
total_parameters 4055976
total_trainable_parameters 4013953
total_non_trainable_parameters 42023
------------------------------ -----------------------------
MASTERFUL [12:12:49]: Meta-learning training dataset parameters...
MASTERFUL [12:12:50]: Training dataset learner finished at 12:12:50 in 1 seconds (1s), returned:
------------------------- -----------------------------
num_classes 5
task Task.CLASSIFICATION
image_shape (180, 180, 3)
image_range ImageRange.ZERO_255
image_dtype <dtype: 'float32'>
image_channels_last True
label_dtype <dtype: 'float32'>
label_shape (5,)
label_structure TensorStructure.SINGLE_TENSOR
label_sparse False
label_bounding_box_format
------------------------- -----------------------------
MASTERFUL [12:12:50]: Meta-learning validation dataset parameters...
MASTERFUL [12:12:51]: Validation dataset learner finished at 12:12:51 in 1 seconds (1s), returned:
------------------------- -----------------------------
num_classes 5
task Task.CLASSIFICATION
image_shape (180, 180, 3)
image_range ImageRange.ZERO_255
image_dtype <dtype: 'float32'>
image_channels_last True
label_dtype <dtype: 'float32'>
label_shape (5,)
label_structure TensorStructure.SINGLE_TENSOR
label_sparse False
label_bounding_box_format
------------------------- -----------------------------
MASTERFUL [12:12:51]: Meta-learning test dataset parameters...
MASTERFUL [12:12:52]: Test dataset learner finished at 12:12:52 in 1 seconds (1s), returned:
------------------------- -----------------------------
num_classes 5
task Task.CLASSIFICATION
image_shape (180, 180, 3)
image_range ImageRange.ZERO_255
image_dtype <dtype: 'float32'>
image_channels_last True
label_dtype <dtype: 'float32'>
label_shape (5,)
label_structure TensorStructure.SINGLE_TENSOR
label_sparse False
label_bounding_box_format
------------------------- -----------------------------
MASTERFUL [12:12:52]: Meta-Learning optimization parameters...
Callbacks: 100%|███████████████████████████████| 8/8 [01:14<00:00, 9.31s/steps]
MASTERFUL [12:14:07]: Optimization learner finished at 12:14:07 in 75 seconds (1m 15s), returned:
----------------------- -----------------------------------------------------------------
batch_size 64
drop_remainder False
epochs 1000000
learning_rate 0.0017677668947726488
learning_rate_schedule
learning_rate_callback <keras.callbacks.ReduceLROnPlateau object at 0x7f7b3c2d74f0>
warmup_learning_rate 1e-06
warmup_epochs 5
optimizer <tensorflow_addons.optimizers.lamb.LAMB object at 0x7f7ad8150280>
loss <keras.losses.CategoricalCrossentropy object at 0x7f7ad8150790>
loss_weights
early_stopping_callback <keras.callbacks.EarlyStopping object at 0x7f7ad81508b0>
metrics [<keras.metrics.CategoricalAccuracy object at 0x7f7ad8150ee0>]
readonly_callbacks
----------------------- -----------------------------------------------------------------
MASTERFUL [12:14:09]: Meta-Learning Regularization Parameters...
MASTERFUL [12:14:10]: Warming up model for analysis.
MASTERFUL [12:14:14]: Warming up batch norm statistics (this could take a few minutes).
MASTERFUL [12:14:17]: Warming up training for 510 steps.
100%|██████████████████████████████████████| 510/510 [01:32<00:00, 5.54steps/s]
MASTERFUL [12:15:49]: Validating batch norm statistics after warmup for stability (this could take a few minutes).
MASTERFUL [12:15:52]: Analyzing baseline model performance. Training until validation loss stabilizes...
Baseline Training: 100%|█████████████████| 2340/2340 [07:10<00:00, 5.43steps/s]
MASTERFUL [12:23:19]: Baseline training complete.
MASTERFUL [12:23:19]: Meta-Learning Basic Data Augmentations...
Node 1/4: 100%|██████████████████████████| 1040/1040 [03:06<00:00, 5.57steps/s]
Node 2/4: 100%|██████████████████████████| 1040/1040 [03:08<00:00, 5.51steps/s]
Node 3/4: 100%|██████████████████████████| 1040/1040 [03:08<00:00, 5.53steps/s]
Node 4/4: 100%|██████████████████████████| 1040/1040 [03:08<00:00, 5.52steps/s]
MASTERFUL [12:36:40]: Meta-Learning Data Augmentation Clusters...
Distance Analysis: 100%|███████████████████| 143/143 [01:00<00:00, 2.38steps/s]
Node 1/10: 100%|█████████████████████████| 1040/1040 [03:33<00:00, 4.87steps/s]
Node 2/10: 100%|█████████████████████████| 1040/1040 [03:33<00:00, 4.86steps/s]
Node 3/10: 100%|█████████████████████████| 1040/1040 [03:33<00:00, 4.86steps/s]
Node 4/10: 100%|█████████████████████████| 1040/1040 [03:34<00:00, 4.86steps/s]
Node 5/10: 100%|█████████████████████████| 1040/1040 [03:33<00:00, 4.86steps/s]
Distance Analysis: 100%|█████████████████████| 66/66 [00:28<00:00, 2.31steps/s]
Node 6/10: 100%|█████████████████████████| 1040/1040 [03:38<00:00, 4.77steps/s]
Node 7/10: 100%|█████████████████████████| 1040/1040 [03:38<00:00, 4.76steps/s]
Node 8/10: 100%|█████████████████████████| 1040/1040 [03:38<00:00, 4.76steps/s]
Node 9/10: 100%|█████████████████████████| 1040/1040 [03:38<00:00, 4.77steps/s]
Node 10/10: 100%|████████████████████████| 1040/1040 [03:38<00:00, 4.77steps/s]
MASTERFUL [13:16:20]: Meta-Learning Label Based Regularization...
Node 1/2: 100%|██████████████████████████| 1040/1040 [03:39<00:00, 4.74steps/s]
Node 2/2: 100%|██████████████████████████| 1040/1040 [03:39<00:00, 4.75steps/s]
MASTERFUL [13:24:06]: Meta-Learning Weight Based Regularization...
MASTERFUL [13:24:07]: Analysis finished in 69.93716329336166 minutes.
MASTERFUL [13:24:07]: Learned parameters harrier-onyx-cobweb saved at /home/yaoshiang/.masterful/policies/harrier-onyx-cobweb.
MASTERFUL [13:24:07]: Regularization learner finished at 13:24:07 in 4200 seconds (1h 10m 0s), returned:
------------------------- -----------------------------------------------
shuffle_buffer_size 2936
mirror 1.0
rot90 1.0
rotate 0
mixup 0.0
cutmix 0.0
label_smoothing 0
hsv_cluster 4
hsv_cluster_to_index [[ 2 4 6 11 11 11]
[ 2 3 4 6 9 11]
[ 1 2 4 5 6 11]
[ 1 2 3 6 9 11]
[ 2 2 2 2 5 11]]
hsv_magnitude_table [[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 40 50 60 70 100 90 80]
[ 0 10 20 30 40 50 60 70 80 90 100]
[100 0 10 90 80 50 20 40 60 30 70]]
contrast_cluster 4
contrast_cluster_to_index [[ 4 11 11 11 11 11]
[ 1 1 1 1 7 11]
[ 4 6 6 7 9 11]
[ 1 2 5 9 11 11]
[ 1 2 5 9 11 11]
[ 1 2 4 6 8 11]]
contrast_magnitude_table [[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 50 40 60 70 100 80 90]
[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 10 20 30 40 50 60 70 80 90 100]]
blur_cluster 4
blur_cluster_to_index [[ 1 4 11 11 11 11]
[ 3 7 10 11 11 11]]
blur_magnitude_table [[ 0 10 20 30 40 50 60 70 80 90 100]
[ 0 50 10 20 40 30 60 70 80 90 100]]
spatial_cluster 5
spatial_cluster_to_index [[ 1 3 5 6 7 11]
[ 1 3 4 5 7 11]
[ 2 3 8 10 11 11]
[ 1 4 6 8 11 11]
[ 4 7 7 10 11 11]
[ 2 2 3 5 8 11]]
spatial_magnitude_table [[ 0 20 10 30 40 50 60 70 80 90 100]
[ 0 20 10 30 40 50 60 70 80 90 100]
[ 0 100 10 20 90 30 80 70 40 50 60]
[ 0 10 100 20 90 80 30 70 40 60 50]
[ 0 50 20 30 60 10 40 70 80 90 100]
[ 0 10 20 30 40 50 60 70 80 100 90]]
synthetic_proportion [0.0]
------------------------- -----------------------------------------------
MASTERFUL [13:24:07]: Learning SSL parameters...
MASTERFUL [13:24:08]: SSL learner finished at 13:24:08 in 1 seconds (1s), returned:
---------- --
algorithms []
---------- --
MASTERFUL [13:24:09]: Training model with semi-supervised learning disabled.
MASTERFUL [13:24:09]: Performing basic dataset analysis.
MASTERFUL [13:24:10]: Training model with:
MASTERFUL [13:24:10]: 2936 labeled examples.
MASTERFUL [13:24:10]: 367 validation examples.
MASTERFUL [13:24:10]: 0 synthetic examples.
MASTERFUL [13:24:10]: 0 unlabeled examples.
MASTERFUL [13:24:10]: Training model with learned parameters harrier-onyx-cobweb in two phases.
MASTERFUL [13:24:10]: The first phase is supervised training with the learned parameters.
MASTERFUL [13:24:10]: The second phase is semi-supervised training to boost performance.
MASTERFUL [13:24:12]: Warming up model for supervised training.
MASTERFUL [13:24:16]: Warming up batch norm statistics (this could take a few minutes).
MASTERFUL [13:24:19]: Warming up training for 510 steps.
100%|██████████████████████████████████████| 510/510 [01:40<00:00, 5.08steps/s]
MASTERFUL [13:25:59]: Validating batch norm statistics after warmup for stability (this could take a few minutes).
MASTERFUL [13:26:09]: Starting Phase 1: Supervised training until the validation loss stabilizes...
Supervised Training: 100%|███████████████| 2756/2756 [10:59<00:00, 4.18steps/s]
MASTERFUL [13:37:25]: Semi-Supervised training disabled in parameters.
MASTERFUL [13:37:27]: Training complete in 13.261691228548687 minutes.
MASTERFUL [13:37:46]: ************************************
MASTERFUL [13:37:46]: Evaluating model on 367 examples from the 'test' dataset split:
MASTERFUL [13:37:46]: Loss: 0.1692
MASTERFUL [13:37:46]: Categorical Accuracy: 0.9482
MASTERFUL [13:37:49]: Average Precision: 0.9476
MASTERFUL [13:37:49]: Average Recall: 0.9447
MASTERFUL [13:37:49]: Confusion Matrix:
MASTERFUL [13:37:49]: | daisy| dandelion| rose| sunflower| tulip|
MASTERFUL [13:37:49]: daisy| 61| 1| 0| 1| 1|
MASTERFUL [13:37:49]: dandelion| 1| 87| 0| 2| 0|
MASTERFUL [13:37:49]: rose| 0| 0| 50| 0| 7|
MASTERFUL [13:37:49]: sunflower| 0| 0| 1| 71| 0|
MASTERFUL [13:37:49]: tulip| 0| 1| 4| 0| 79|
MASTERFUL [13:37:49]: Confusion matrix columns represent the prediction labels and the rows represent the real labels.
MASTERFUL [13:37:49]:
MASTERFUL [13:37:49]: Per-Class Metrics:
MASTERFUL [13:37:49]: Class daisy:
MASTERFUL [13:37:49]: Precision: 0.9839
MASTERFUL [13:37:49]: Recall : 0.9531
MASTERFUL [13:37:49]: Class dandelion:
MASTERFUL [13:37:49]: Precision: 0.9775
MASTERFUL [13:37:49]: Recall : 0.9667
MASTERFUL [13:37:49]: Class rose:
MASTERFUL [13:37:49]: Precision: 0.9091
MASTERFUL [13:37:49]: Recall : 0.8772
MASTERFUL [13:37:49]: Class sunflower:
MASTERFUL [13:37:49]: Precision: 0.9595
MASTERFUL [13:37:49]: Recall : 0.9861
MASTERFUL [13:37:49]: Class tulip:
MASTERFUL [13:37:49]: Precision: 0.9080
MASTERFUL [13:37:49]: Recall : 0.9405
MASTERFUL [13:37:49]: Saving model output to /home/yaoshiang/model_output/session-00000.
MASTERFUL [13:37:49]: Saving saved_model output to /home/yaoshiang/model_output/session-00000/saved_model
MASTERFUL [13:38:10]: Saving onnx output to /home/yaoshiang/model_output/session-00000/onnx
MASTERFUL [13:38:42]: Saving evaluation metrics to /home/yaoshiang/model_output/session-00000/evaluation_metrics.csv.
MASTERFUL [13:38:42]: Saving regularization params to /home/yaoshiang/model_output/session-00000/regularization.params.
MASTERFUL [13:38:42]: Saving confusion matrix to /home/yaoshiang/model_output/session-00000/confusion_matrix.csv.
MASTERFUL [13:38:42]: Total elapsed training time: 94 minutes (1h 33m 55s).
MASTERFUL [13:38:42]: Launch masterful-gui to visualize the training results: policy name 'harrier-onyx-cobweb'
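The per-class metrics in the log above follow directly from the confusion matrix, whose rows are the real labels and whose columns are the predicted labels. The following sketch recomputes them in plain Python using the numbers copied from the log. The helper names are illustrative and this is not Masterful's internal implementation.

```python
LABELS = ["daisy", "dandelion", "rose", "sunflower", "tulip"]

# Confusion matrix from the log: rows are real labels, columns are predictions.
CONFUSION = [
    [61, 1, 0, 1, 1],
    [1, 87, 0, 2, 0],
    [0, 0, 50, 0, 7],
    [0, 0, 1, 71, 0],
    [0, 1, 4, 0, 79],
]


def per_class_metrics(matrix):
    """Return a (precision, recall) pair for each class index."""
    size = len(matrix)
    metrics = []
    for k in range(size):
        true_positives = matrix[k][k]
        predicted_as_k = sum(matrix[r][k] for r in range(size))  # column sum
        actually_k = sum(matrix[k])  # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        metrics.append((precision, recall))
    return metrics


def accuracy(matrix):
    """Fraction of examples on the diagonal (correct predictions)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for label, (precision, recall) in zip(LABELS, per_class_metrics(CONFUSION)):
    print(f"{label:>10}: precision={precision:.4f} recall={recall:.4f}")
print(f"accuracy={accuracy(CONFUSION):.4f}")
```

For example, the rose column sums to 55 predictions, of which 50 are correct, giving a precision of 0.9091, while the rose row sums to 57 real examples, giving a recall of 0.8772. Both match the values reported in the log.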
Pro Tip!¶
If you are training on a remote machine over an SSH connection, you will want to ensure that your training session does not die if your SSH connection gets killed. You can use nohup to help here!
nohup masterful-train --config=s3://masterful-public/datasets/quickstart/training.yaml &> training_log.txt & while ! test -f training_log.txt; do :; done && tail -f training_log.txt
The above command launches the job in the background and then tails the output as it arrives. You can press CTRL-C to stop following the output at any time, and the job will continue to run. Simply tail the output file again to pick up the training log in progress.
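Once the job has written its evaluation summary to the log file, the headline metrics can be pulled out with a small script. The sketch below assumes the log lines keep the format shown in the training output above. The regular expression and the `parse_summary_metrics` helper are illustrative and not part of Masterful.

```python
import re

# Matches summary lines such as:
#   MASTERFUL [13:37:46]: Categorical Accuracy: 0.9482
METRIC_LINE = re.compile(
    r"MASTERFUL \[\d{2}:\d{2}:\d{2}\]:\s+"
    r"(Loss|Categorical Accuracy|Average Precision|Average Recall):\s+"
    r"([0-9]*\.?[0-9]+)\s*$"
)


def parse_summary_metrics(log_text):
    """Collect the summary metrics found in the log text into a dict."""
    metrics = {}
    for line in log_text.splitlines():
        match = METRIC_LINE.search(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


# Sample lines copied from the training log in this tutorial.
sample_log = """\
MASTERFUL [13:37:46]: Loss: 0.1692
MASTERFUL [13:37:46]: Categorical Accuracy: 0.9482
MASTERFUL [13:37:49]: Average Precision: 0.9476
MASTERFUL [13:37:49]: Average Recall: 0.9447
MASTERFUL [13:37:49]: Precision: 0.9839
"""
print(parse_summary_metrics(sample_log))
```

To run it against a real session, read the file written by the nohup command, for example `parse_summary_metrics(open("training_log.txt").read())`. The per-class Precision and Recall lines are skipped on purpose, because they do not carry the class name on the same line.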
You have now finished training your first model with the Masterful AutoML platform!
Next Steps¶
Now that you have finished this basic tutorial, we suggest exploring the rest of the documentation for more advanced examples and use cases. For example, you can learn how to use unlabeled data to further improve the performance of your model, or you can learn about other vision tasks such as object detection and segmentation. Don’t see a topic that you are interested in? Reach out to us directly by email at learn@masterfulai.com or join our Slack community. We are happy to help walk you through whatever challenge you are facing!