API Reference: Regularization
masterful.regularization.RegularizationParams
- class masterful.regularization.RegularizationParams(shuffle_buffer_size=None, mirror=None, rot90=None, rotate=None, mixup=None, cutmix=None, label_smoothing=None, hsv_cluster=None, hsv_cluster_to_index=None, hsv_magnitude_table=None, contrast_cluster=None, contrast_cluster_to_index=None, contrast_magnitude_table=None, blur_cluster=None, blur_cluster_to_index=None, blur_magnitude_table=None, spatial_cluster=None, spatial_cluster_to_index=None, spatial_magnitude_table=None, synthetic_proportion=None)
Parameters controlling the regularization of the model during training.
Regularization helps a model generalize to data it has not yet seen. In other words, regularization is about fighting overfitting.
Masterful supports a range of different augmentations in order to regularize the model during training. The core concept behind the Masterful augmentation strategy is “augmentation clustering”, which groups structurally similar augmentations together (for example, color space augmentations are one group, spatial augmentations are another). To support this, the parameters record a range of different magnitudes for each cluster (the xxx_magnitude_table), the index of the magnitude selected for each cluster (xxx_cluster_to_index), and finally the optimal magnitude chosen for each cluster. Masterful automatically determines each cluster size and potential magnitude by measuring the “distance” that each augmentation moves the loss of the model. This is the “Distance Analysis” phase in the console output.
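The relationship between the three per-cluster fields can be sketched as a two-step lookup. This is a minimal illustration, not Masterful's implementation: it assumes both tables are one-dimensional, that the selected cluster indexes into xxx_cluster_to_index, and that the resulting entry indexes into xxx_magnitude_table. The helper name selected_magnitude and all sample values are hypothetical, and plain lists stand in for the numpy.ndarray fields.

```python
from typing import Sequence


def selected_magnitude(cluster: int,
                       cluster_to_index: Sequence[int],
                       magnitude_table: Sequence[float]) -> float:
    """Resolve a cluster to its magnitude via a two-step lookup (assumed layout)."""
    if not 0 <= cluster < len(cluster_to_index):
        raise IndexError(f"cluster {cluster} is out of range")
    # Step 1: map the cluster to a position in the magnitude table.
    table_index = int(cluster_to_index[cluster])
    # Step 2: read the magnitude stored at that position.
    return float(magnitude_table[table_index])


# Hypothetical values, for illustration only.
hsv_magnitude_table = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
hsv_cluster_to_index = [0, 2, 5]
hsv_cluster = 1

magnitude = selected_magnitude(hsv_cluster, hsv_cluster_to_index, hsv_magnitude_table)
print(magnitude)  # prints 0.25
```

The same two-step pattern would apply to the contrast, blur, and spatial clusters under the same assumptions.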
- Parameters
shuffle_buffer_size (int) – The size of the shuffle buffer to use during training. If None or 0, the shuffle buffer size will be found automatically based on memory capacity.
mirror (float) – Specifies whether to apply mirror augmentation during training as part of the augmentation pipeline. Must be set to 0.0 (off) or 1.0 (on).
rot90 (float) – Specifies whether to apply a random discrete rotation during training as part of the augmentation pipeline. This will perform a random, discrete rotation of either 90, 180, or 270 degrees. Must be set to either 0.0 (off) or 1.0 (on).
rotate (int) – Specifies whether to apply a random rotation during training as part of the augmentation pipeline. Must be an integer in the range [0,100], where 0 specifies no rotation, and 100 specifies maximum rotation (180 degrees).
mixup (float) – Controls Mixup label smoothing, as proposed in “mixup: Beyond Empirical Risk Minimization” (https://arxiv.org/abs/1710.09412). Must be a float in the range [0.0, 1.0].
cutmix (float) – Specifies whether to apply cutmix during training as part of the data augmentation pipeline. Cutmix was proposed in “CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features” (https://arxiv.org/abs/1905.04899) and must be a float in the range [0.0, 1.0].
label_smoothing (float) – Controls whether to use label smoothing regularization. Label smoothing is a regularization technique that introduces noise into the labels. This accounts for the fact that datasets may contain labeling mistakes, so maximizing the likelihood log p(y|x) directly can be harmful.
hsv_cluster (int) – The color space cluster to use during training. This is the index to use in order to find the magnitude of the cluster in the hsv_magnitude_table. The HSV cluster involves color, brightness, and hue jittering augmentations.
hsv_cluster_to_index (numpy.ndarray) – Maps a cluster index to the index of the magnitude for that cluster in the hsv_magnitude_table.
hsv_magnitude_table (numpy.ndarray) – A table of magnitudes to use for each cluster. The magnitude chosen is specified by hsv_cluster.
contrast_cluster (int) – The contrast cluster to use during training. This is the index to use in order to find the magnitude of the cluster in the contrast_magnitude_table. The contrast cluster involves (auto) contrast, solarize, posterize, and equalize augmentations.
contrast_cluster_to_index (numpy.ndarray) – Maps a cluster index to the index of the magnitude for that cluster in the contrast_magnitude_table.
contrast_magnitude_table (numpy.ndarray) – A table of magnitudes to use for each cluster. The magnitude chosen is specified by contrast_cluster.
blur_cluster (int) – The blur cluster to use during training. This is the index to use in order to find the magnitude of the cluster in the blur_magnitude_table. The blur cluster involves sharpening and desharpening augmentations.
blur_cluster_to_index (numpy.ndarray) – Maps a cluster index to the index of the magnitude for that cluster in the blur_magnitude_table.
blur_magnitude_table (numpy.ndarray) – A table of magnitudes to use for each cluster. The magnitude chosen is specified by blur_cluster.
spatial_cluster (int) – The spatial cluster to use during training. This is the index to use in order to find the magnitude of the cluster in the spatial_magnitude_table. The spatial cluster involves shear, translate, and zoom augmentations.
spatial_cluster_to_index (numpy.ndarray) – Maps a cluster index to the index of the magnitude for that cluster in the spatial_magnitude_table.
spatial_magnitude_table (numpy.ndarray) – A table of magnitudes to use for each cluster. The magnitude chosen is specified by spatial_cluster.
synthetic_proportion (Sequence[float]) – If synthetic data is used during training, the proportion of synthetic data to use from each dataset. Should be a list of floats in the range [0.0, 1.0].
- Return type
None
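The documented value ranges above can be checked before constructing RegularizationParams by hand. The following is a minimal sketch, assuming the candidate values are held in a plain dictionary keyed by parameter name; check_regularization_ranges is a hypothetical helper, not part of the Masterful API, and it only encodes the constraints stated in the parameter descriptions.

```python
from typing import Any, Dict, List


def check_regularization_ranges(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems: List[str] = []
    # mirror and rot90 are documented as on/off switches encoded as floats.
    for name in ("mirror", "rot90"):
        value = params.get(name)
        if value is not None and value not in (0.0, 1.0):
            problems.append(f"{name} must be 0.0 or 1.0, got {value!r}")
    # rotate is documented as an integer in the range [0, 100].
    rotate = params.get("rotate")
    if rotate is not None:
        is_int = isinstance(rotate, int) and not isinstance(rotate, bool)
        if not is_int or not 0 <= rotate <= 100:
            problems.append(f"rotate must be an int in [0, 100], got {rotate!r}")
    # mixup and cutmix are documented as floats in the range [0.0, 1.0].
    for name in ("mixup", "cutmix"):
        value = params.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be in [0.0, 1.0], got {value!r}")
    # synthetic_proportion is documented as a list of floats in [0.0, 1.0].
    proportions = params.get("synthetic_proportion")
    if proportions is not None:
        for i, p in enumerate(proportions):
            if not 0.0 <= p <= 1.0:
                problems.append(
                    f"synthetic_proportion[{i}] must be in [0.0, 1.0], got {p!r}")
    return problems


# Hypothetical candidate values, three of which violate the documented ranges.
candidate = {"mirror": 1.0, "rot90": 0.5, "rotate": 120,
             "mixup": 0.3, "synthetic_proportion": [0.1, 1.5]}
problems = check_regularization_ranges(candidate)
for problem in problems:
    print(problem)
```

Parameters left as None are skipped, mirroring the constructor defaults shown in the class signature.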
masterful.regularization.learn_regularization_params
- masterful.regularization.learn_regularization_params(model, model_params, optimization_params, training_dataset, training_dataset_params, validation_dataset=None, validation_dataset_params=None, unlabeled_datasets=None, synthetic_datasets=None, **kwargs)
Learns the optimal set of regularization parameters for the model during training. This depends on having established the OptimizationParams to be used during training before calling this function.
This function can take a while to complete. The expensive part of this algorithm is the augmentation learner. In general, this function will take around 1.5x the amount of time it takes to train your model, in order to learn the optimal set of augmentations to use as part of the regularization strategy.
Example:
    import masterful
    import tensorflow as tf

    # Each "..." is a placeholder for an object you have already created.
    model: tf.keras.Model = ...
    model_params: masterful.architecture.params.ArchitectureParams = ...
    optimization_params: masterful.optimization.params.OptimizationParams = ...
    training_dataset: tf.data.Dataset = ...
    training_dataset_params: masterful.data.params.DataParams = ...

    # Learn the regularization parameters for this model and dataset.
    regularization_params = masterful.regularization.learn_regularization_params(
        model=model,
        model_params=model_params,
        optimization_params=optimization_params,
        training_dataset=training_dataset,
        training_dataset_params=training_dataset_params)
- Parameters
model (keras.engine.training.Model) – The model to learn the optimal set of regularization parameters for.
model_params (masterful.architecture.params.ArchitectureParams) – The parameters of the model.
optimization_params (masterful.optimization.params.OptimizationParams) – The optimization parameters to use during training.
training_dataset (Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, Tuple[numpy.ndarray, numpy.ndarray], keras.utils.data_utils.Sequence, Tuple[Callable[[], Iterator], Tuple[tensorflow.python.framework.tensor_spec.TensorSpec, tensorflow.python.framework.tensor_spec.TensorSpec]]]) – The labeled dataset to use for learning.
training_dataset_params (masterful.data.params.DataParams) – The parameters of the labeled dataset.
validation_dataset (Optional[Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, Tuple[numpy.ndarray, numpy.ndarray], keras.utils.data_utils.Sequence, Tuple[Callable[[], Iterator], Tuple[tensorflow.python.framework.tensor_spec.TensorSpec, tensorflow.python.framework.tensor_spec.TensorSpec]]]]) – The optional validation dataset used to measure progress. If no validation dataset is specified, a small portion of the training dataset will be used as the validation set.
validation_dataset_params (Optional[masterful.data.params.DataParams]) – The parameters of the validation dataset.
unlabeled_datasets (Optional[Sequence[Tuple[Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, Tuple[numpy.ndarray, numpy.ndarray], keras.utils.data_utils.Sequence, Tuple[Callable[[], Iterator], Tuple[tensorflow.python.framework.tensor_spec.TensorSpec, tensorflow.python.framework.tensor_spec.TensorSpec]]], masterful.data.params.DataParams]]]) – Optional sequence of unlabeled datasets and their parameters, to use during training. If an unlabeled dataset is specified, then a set of algorithms must be specified in ssl_params; otherwise this will have no effect.
synthetic_datasets (Optional[Sequence[Tuple[Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, Tuple[numpy.ndarray, numpy.ndarray], keras.utils.data_utils.Sequence, Tuple[Callable[[], Iterator], Tuple[tensorflow.python.framework.tensor_spec.TensorSpec, tensorflow.python.framework.tensor_spec.TensorSpec]]], masterful.data.params.DataParams]]]) – Optional sequence of synthetic data and parameters to use during training. The amount of synthetic data used during training is controlled by masterful.regularization.RegularizationParams.synthetic_proportion.
- Returns
An instance of RegularizationParams describing the optimal regularization strategy to use during training.
- Return type