{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "# Object Detection with Masterful\n", "\n", "**Author:** [sam](mailto:sam@masterfulai.com) \n", "**Date created:** 2022/03/21 \n", "**Last modified:** 2022/03/21 \n", "**Description:** Overview of how to use Tensorflow Object Detection model with Masterful." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "[[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)][1]        [![Download](images/download.png)][2][Download this Notebook][2]\n", "\n", "[1]:https://colab.research.google.com/github/masterfulai/masterful-docs/blob/main/notebooks/guide_object_detection.ipynb\n", "[2]:http://docs.masterfulai.com/0.4.1/notebooks/guide_object_detection.ipynb" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Introduction\n", "\n", "In the [Classification](guide_classification.ipynb) guide, you looked\n", "at a simple classification example to get you up and\n", "running with the Masterful AutoML platform. In this guide, you will\n", "take a deeper look at Object Detection with Masterful. Specifically,\n", "you will learn how to train a model from the\n", "[Tensorflow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection)\n", "using Masterful.\n", "\n", "The TensorFlow Object Detection API is an open source framework\n", "built on top of TensorFlow that makes it easy to construct,\n", "train and deploy object detection models. This library provides\n", "a lot of high quality object detections models that can be used in\n", "Tensorflow. Normally you would train these models using the\n", "Tensorflow Object Detection API. However, there are many reasons why\n", "you might want to train them outside of the library. In particular, training\n", "these models with Masterful allows you to take advantage of any\n", "unlabeled data you might have using semi-supervised learning.\n", "\n", "For a complete list of the models supported by the Tensorflow Object\n", "Detection API for Tensorflow 2.0, see [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md).\n", "\n", "In this guide, you will take an existing pipeline configuration file\n", "that you have created for the Tensorflow Object Detection API and use\n", "it directly with Masterful to train and evaluate the model. For simplicity,\n", "you will be using the VOC 2007 dataset with object annotations to demonstrate\n", "how to setup the dataset and train the model with the data.\n", "\n", "If you are familiar with the Tensorflow Object Detection API pipeline\n", "configuration protocol buffer, this guide demonstrates training with the\n", "model from the pipeline configuration and with a dataset from Tensorflow Datasets.\n", "The input configuration and eval configuration from the pipeline\n", "configuration is ignored in this example." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Prerequisites\n", "\n", "Please follow the Masterful installation instructions [here](../tutorials/tutorial_installation.md) in order to run this Quickstart.\n", "\n", "In addition, this guide requires the installation of and familiarity with\n", "the Tensorflow Object Detection API for Tensorflow 2.0. See the\n", "installation instructions [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2.md#installation)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:37.109404Z", "iopub.status.busy": "2022-03-25T16:24:37.108543Z", "iopub.status.idle": "2022-03-25T16:24:39.474670Z", "shell.execute_reply": "2022-03-25T16:24:39.475007Z" } }, "outputs": [], "source": [ "import dataclasses\n", "import object_detection\n", "import tensorflow as tf\n", "import masterful\n", "\n", "masterful = masterful.register()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Prepare the Data\n", "\n", "This guide will use the Pascal VOC 2007 dataset as a simple example of setting\n", "up an Object Detection workflow. The PASCAL Visual Object Classes Challenge\n", "contains both a Classification and Detection competition. In the\n", "Classification competition, the goal is to predict the set of\n", "labels contained in the image, while in the Detection competition\n", "the goal is to predict the bounding box and label of each individual\n", "object.\n", "\n", "You will use the VOC 2007 dataset from the [Tensorflow Datasets\n", "Catalog](https://www.tensorflow.org/datasets/catalog/voc)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:43.482518Z", "iopub.status.busy": "2022-03-25T16:24:43.481774Z", "iopub.status.idle": "2022-03-25T16:24:44.241501Z", "shell.execute_reply": "2022-03-25T16:24:44.242000Z" } }, "outputs": [], "source": [ "import tensorflow_datasets as tfds\n", "\n", "# First step is to load the data from Tensorflow Datasets.\n", "# You will use the training dataset to train the model, and the validation\n", "# set to measure the progress of training. The test dataset\n", "# is used at the end to measure the results of training the model.\n", "# Importantly, Masterful will never see the test dataset,\n", "# so you can be sure that your model is not overfit to any holdout datasets.\n", "training_dataset = tfds.load(\n", " \"voc/2007\",\n", " split=\"train\",\n", " shuffle_files=False,\n", ")\n", "validation_dataset = tfds.load(\n", " \"voc/2007\",\n", " split=\"validation\",\n", " shuffle_files=False,\n", ")\n", "test_dataset = tfds.load(\n", " \"voc/2007\",\n", " split=\"test\",\n", " shuffle_files=False,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### Convert Labels to Masterful Format\n", "\n", "After you have the loaded the datasets, it is important to convert\n", "the labels into a format Masterful understands. There are two steps\n", "involved here.\n", "\n", "* Step 1: Convert the labels to Masterful format\n", "* Step 2: Pad the labels and images to uniform sizes so they can be batched\n", "\n", "Masterful understands several different label and bounding box formats.\n", "See [DataParams](../api/api_data.rst#masterful.data.DataParams) for the\n", "specific formats supported. In this example, you are going to use the\n", "Tensorflow bounding box format, which defines bounding boxes in terms\n", "of min and max values, normalized into the range [0,1]. Specifically, the\n", "bounding boxes are of the form [ymin, xmin, ymax, xmax].\n", "\n", "Masterful extends this label format to support padding out the labels,\n", "as well as multiple bounding boxes per object. 
A Masterful Object Detection\n", "label for a single example has the form `[num_boxes, label]` where\n", "label is a `tf.float32` vector of the form `[valid, ymin, xmin, ymax, xmax, one_hot_class]`.\n", "`valid` is a `float` value of either 1.0 or 0.0, and is used to represent\n", "padded bounding boxes. For example, a value of 1.0 represents a \"good\" bounding\n", "box, and a value of 0.0 represents \"padding\" added to the labels in\n", "order to support batching. Labels whose `valid` value is 0.0 are ignored during\n", "training. For example, if you have 10 classes in your dataset, then\n", "the labels for a single example will have the shape `[num_boxes, 1 + 4 + 10]`.\n", "If we allow a maximum number of bounding boxes per example of 20 (`max_bounding_boxes = 20`),\n", "and use a batch size of 8 (`batch_size = 8`), then the per-batch\n", "labels will have the shape `[batch_size, max_bounding_boxes, 1 + 4 + 10]`.\n", "\n", "Masterful provides a utility to help you convert the labels into Masterful\n", "format, and prepare them for padding and batching. All you need to do\n", "is extract the bounding boxes and class labels from the dataset and Masterful\n", "will handle the conversion for you." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:44.251201Z", "iopub.status.busy": "2022-03-25T16:24:44.249876Z", "iopub.status.idle": "2022-03-25T16:24:45.049273Z", "shell.execute_reply": "2022-03-25T16:24:45.049736Z" } }, "outputs": [], "source": [ "NUM_CLASSES = 20\n", "MAX_BOUNDING_BOXES = 10\n", "INPUT_SHAPE = (64, 64, 3)\n", "\n", "from masterful.data.preprocessing import (\n", " convert_and_pad_boxes,\n", " resize_and_pad,\n", ")\n", "\n", "\n", "def convert_and_pad_labels(features_dict):\n", " image = features_dict[\"image\"]\n", " classes = features_dict[\"objects\"][\"label\"]\n", " boxes = features_dict[\"objects\"][\"bbox\"]\n", "\n", " # First convert the labels and pad them to the\n", " # maximum number of bounding boxes, so that you\n", " # can batch them later. Tensorflow datasets bounding boxes\n", " # come in Tensorflow format (ymin, xmin, ymax, xmax)\n", " # so you specify that below.\n", " labels = convert_and_pad_boxes(\n", " boxes,\n", " classes,\n", " masterful.spec.BoundingBoxFormat.TENSORFLOW,\n", " sparse_labels=True,\n", " num_classes=NUM_CLASSES,\n", " max_bounding_boxes=MAX_BOUNDING_BOXES,\n", " )\n", "\n", " # Normalize the size of all the input images to the expected input\n", " # size for the model. 
The code below does a bounding box safe resize that\n", " # will pad the short edge to the final square input shape.\n", " # The model you are using for this guide expects input images\n", " # to be sized to (64, 64), so you specify that square image size below.\n", " image, labels = resize_and_pad(image, labels, size=INPUT_SHAPE[0])\n", " image = tf.clip_by_value(image, 0.0, 255.0)\n", " return image, labels\n", "\n", "\n", "training_dataset = training_dataset.map(\n", " convert_and_pad_labels, num_parallel_calls=tf.data.AUTOTUNE\n", ")\n", "validation_dataset = validation_dataset.map(\n", " convert_and_pad_labels, num_parallel_calls=tf.data.AUTOTUNE\n", ")\n", "test_dataset = test_dataset.map(\n", " convert_and_pad_labels, num_parallel_calls=tf.data.AUTOTUNE\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Build the Model\n", "\n", "For this guide, you will adapt a model from the Tensorflow Object Detection API\n", "Model Zoo for Tensorflow 2. The list of available models\n", "can be found [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md).\n", "\n", "The model used below is an SSD MobileNet v2 detector. Note that in this example,\n", "you are only using the model definition from the pipeline configuration.\n", "Other entries in the pipeline configuration are ignored." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:45.056353Z", "iopub.status.busy": "2022-03-25T16:24:45.055415Z", "iopub.status.idle": "2022-03-25T16:24:46.194175Z", "shell.execute_reply": "2022-03-25T16:24:46.194696Z" } }, "outputs": [], "source": [ "PIPELINE_CONFIG = \"https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/ssd_mobilenet_v2_320x320_coco17_tpu-8.config\"\n", "\n", "# Load the pipeline configuration from the repository\n", "# into a string\n", "import urllib.request\n", "\n", "with urllib.request.urlopen(PIPELINE_CONFIG) as url:\n", " pipeline_config_str = url.read()\n", "\n", "# Parse the pipeline configuration proto string\n", "# into a pipeline configuration proto object\n", "from google.protobuf import text_format\n", "from object_detection.protos import pipeline_pb2\n", "\n", "pipeline_config = text_format.Parse(\n", " pipeline_config_str, pipeline_pb2.TrainEvalPipelineConfig()\n", ")\n", "\n", "# Update the config with your specific requirements, namely the number\n", "# of classes and the model input size\n", "pipeline_config.model.ssd.num_classes = NUM_CLASSES\n", "pipeline_config.model.ssd.image_resizer.fixed_shape_resizer.height = INPUT_SHAPE[0]\n", "pipeline_config.model.ssd.image_resizer.fixed_shape_resizer.width = INPUT_SHAPE[1]\n", "\n", "# Next build the model. 
The Tensorflow Object Detection API\n", "# provides a model builder class which can take a model config\n", "# and return a `DetectionModel` instance.\n", "from object_detection.builders import model_builder\n", "\n", "object_detection_model = model_builder.build(pipeline_config.model, is_training=True)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Setup Masterful Training\n", "\n", "The Masterful AutoML platform learns how to train your model by\n", "focusing on five core organizational principles in deep\n", "learning: architecture, data, optimization, regularization, and\n", "semi-supervision.\n", "\n", "**Architecture** is the structure of weights, biases, and activations\n", "that define a model. In this example, the architecture is defined\n", "by the object detection model you created above.\n", "\n", "**Data** is the input used to train the model. In this example,\n", "you are using a labeled training dataset from the VOC detection challenge.\n", "More advanced uses of the Masterful AutoML platform can take into\n", "account unlabeled and synthetic data as well, using a variety\n", "of different techniques.\n", "\n", "**Optimization** means finding the best weights for a model\n", "and training data. Optimization is different from regularization\n", "because optimization does not consider generalization to unseen\n", "data. The central challenge of optimization is\n", "speed - find the best weights faster.\n", "\n", "**Regularization** means helping a model generalize to data it\n", "has not yet seen. Another way of saying this is that regularization\n", "is about fighting overfitting.\n", "\n", "**Semi-Supervision** is the process by which a model can be\n", "trained using both labeled and unlabeled data." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### Architecture and Data Parameters\n", "\n", "The first step when using Masterful is to learn the optimal set\n", "of parameters for each of the five buckets above. You start by\n", "learning the architecture and data parameters of the model and\n", "training dataset.\n", "\n", "In the code below, you are telling Masterful\n", "that your model is performing a detection task\n", "(`masterful.enums.Task.DETECTION`) with 20 labels\n", "(`num_classes=NUM_CLASSES`), and that the image features\n", "going into your model are in\n", "the range [0,255] (`input_range=masterful.enums.ImageRange.ZERO_255`).\n", "Also, the model outputs logits rather than a softmax\n", "classification (`prediction_logits=True`).\n", "\n", "Furthermore, in the training dataset, you are providing\n", "dense labels (`label_sparse=False`) rather than sparse labels.\n", "\n", "For more details on architecture and data parameters, see\n", "the API specifications for [ArchitectureParams](../api/api_architecture.rst#masterful.architecture.ArchitectureParams)\n", "and [DataParams](../api/api_data.rst#masterful.data.DataParams)."
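, "\n", "\n", "Before declaring these parameters, it can help to confirm that the converted datasets really do have the shapes you are about to describe. The check below is purely illustrative and uses only the standard `tf.data` API: with the constants used in this guide, the label width works out to 1 + 4 + 20 = 25, so each example should consist of a (64, 64, 3) image and a (10, 25) label tensor. Depending on how Tensorflow infers shapes through the mapping function, some dimensions may display as `None`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab_type": "code" }, "outputs": [], "source": [ "# Illustrative check: the element spec of the converted dataset should\n", "# correspond to the image_shape and label_shape declared in the DataParams\n", "# below, namely (64, 64, 3) images and\n", "# (MAX_BOUNDING_BOXES, 1 + 4 + NUM_CLASSES) == (10, 25) labels.\n", "image_spec, label_spec = training_dataset.element_spec\n", "print(\"Image spec:\", image_spec)\n", "print(\"Label spec:\", label_spec)"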
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:46.207960Z", "iopub.status.busy": "2022-03-25T16:24:46.206911Z", "iopub.status.idle": "2022-03-25T16:24:46.208947Z", "shell.execute_reply": "2022-03-25T16:24:46.209447Z" } }, "outputs": [], "source": [ "# Create the model parameters describing the model\n", "# architecture.\n", "model_params = masterful.architecture.ArchitectureParams(\n", " task=masterful.enums.Task.DETECTION,\n", " input_range=masterful.enums.ImageRange.ZERO_255,\n", " input_shape=INPUT_SHAPE,\n", " input_dtype=tf.float32,\n", " num_classes=NUM_CLASSES,\n", " prediction_dtype=tf.float32,\n", " prediction_structure=masterful.enums.TensorStructure.DICT,\n", " prediction_logits=True,\n", " prediction_shape=None,\n", " model_config=pipeline_config.model,\n", ")\n", "\n", "# Create the data parameters describing the input data structure.\n", "training_dataset_params = masterful.data.DataParams(\n", " task=masterful.enums.Task.DETECTION,\n", " image_range=masterful.enums.ImageRange.ZERO_255,\n", " image_shape=INPUT_SHAPE,\n", " image_dtype=tf.float32,\n", " label_sparse=False,\n", " num_classes=NUM_CLASSES,\n", " label_dtype=tf.float32,\n", " label_shape=(MAX_BOUNDING_BOXES, 1 + 4 + NUM_CLASSES),\n", " label_structure=masterful.enums.TensorStructure.SINGLE_TENSOR,\n", " label_bounding_box_format=masterful.enums.BoundingBoxFormat.TENSORFLOW,\n", ")\n", "\n", "# The validation dataset parameters are the same as the training\n", "# dataset parameters.\n", "validation_dataset_params = dataclasses.replace(training_dataset_params)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### Optimization Parameters\n", "\n", "Next you learn the optimization parameters that will be used\n", "to train the model. Below, you use Masterful to learn the\n", "standard set of optimization parameters to train your model\n", "for a detection task.\n", "\n", "For more details on the optmization parameters, please see the\n", "[OptimizationParams](../api/api_optimization.rst#masterful.optimization.OptimizationParams)\n", "API specification." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:46.213686Z", "iopub.status.busy": "2022-03-25T16:24:46.212952Z", "iopub.status.idle": "2022-03-25T16:24:46.218855Z", "shell.execute_reply": "2022-03-25T16:24:46.219326Z" } }, "outputs": [], "source": [ "optimization_params = masterful.optimization.learn_optimization_params(\n", " object_detection_model,\n", " model_params,\n", " training_dataset,\n", " training_dataset_params,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### Semi-Supervised Learning Parameters\n", "\n", "The next step before training is to learn the optimal set of\n", "semi-supervision parameters. For this guide, you are not using\n", "any unlabeled or synthetic data as part of training,\n", "so most forms of semi-supervision will be disabled by default.\n", "\n", "For more details on the semi-supervision parameters, please see\n", "the [SemiSupervisedParams](../api/api_ssl.rst#masterful.ssl.SemiSupervisedParams)\n", "API specification." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:46.223482Z", "iopub.status.busy": "2022-03-25T16:24:46.222688Z", "iopub.status.idle": "2022-03-25T16:24:46.224726Z", "shell.execute_reply": "2022-03-25T16:24:46.225193Z" } }, "outputs": [], "source": [ "ssl_params = masterful.ssl.learn_ssl_params(training_dataset, training_dataset_params)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### Regularization Parameters\n", "\n", "The regularization parameters used can have a dramatic impact\n", "on the final performance of your trained model. Learning these\n", "parameters can be a time-consuming and domain specific challenge.\n", "Masterful can speed up this process by learning these parameters\n", "for you. In general, this can be an expensive operation. A rough\n", "order of magnitude for learning these parameters is 2x the time\n", "it takes to train your model. However, this is still dramatically\n", "faster than manually finding these parameters yourself, and these\n", "parameters can be reused in future training sessions. In the\n", "example below, you will use the [learn_regularization_params](../api/api_regularization.rst#masterful.regularization.learn_regularization_params)\n", "API to learn these parameters directly from your dataset and model.\n", "\n", "For more details on the regularization parameters, please see\n", "the [RegularizationParams](../api/api_regularization.rst#masterful.regularization.RegularizationParams)\n", "API specification." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T16:24:46.230806Z", "iopub.status.busy": "2022-03-25T16:24:46.230044Z", "iopub.status.idle": "2022-03-25T17:04:30.639404Z", "shell.execute_reply": "2022-03-25T17:04:30.640254Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MASTERFUL: Meta-Learning Regularization Parameters...\n", "MASTERFUL: Warming up model for analysis.\n", "MASTERFUL: Analyzing baseline model performance. 
Training until validation loss stabilizes...\n", "Baseline Training: 100%|██████████| 32/32 [00:46<00:00, 1.46s/steps]\n", "MASTERFUL: Baseline training complete.\n", "MASTERFUL: Meta-Learning Basic Data Augmentations...\n", "Node 1/4: 100%|██████████| 640/640 [00:55<00:00, 11.56steps/s]\n", "Node 2/4: 100%|██████████| 640/640 [01:03<00:00, 10.08steps/s]\n", "Node 3/4: 100%|██████████| 640/640 [01:50<00:00, 5.78steps/s]\n", "Node 4/4: 100%|██████████| 640/640 [01:53<00:00, 5.64steps/s]\n", "MASTERFUL: Meta-Learning Data Augmentation Clusters...\n", "Distance Analysis: 100%|██████████| 143/143 [04:42<00:00, 1.98s/steps]\n", "Node 1/10: 100%|██████████| 640/640 [02:53<00:00, 3.69steps/s]\n", "Node 2/10: 100%|██████████| 640/640 [01:24<00:00, 7.59steps/s]\n", "Node 3/10: 100%|██████████| 640/640 [02:35<00:00, 4.11steps/s]\n", "Node 4/10: 100%|██████████| 640/640 [02:13<00:00, 4.80steps/s]\n", "Node 5/10: 100%|██████████| 640/640 [01:18<00:00, 8.14steps/s]\n", "Distance Analysis: 100%|██████████| 66/66 [01:17<00:00, 1.17s/steps]\n", "Node 6/10: 100%|██████████| 640/640 [01:19<00:00, 8.06steps/s]\n", "Node 7/10: 100%|██████████| 640/640 [01:16<00:00, 8.36steps/s]\n", "Node 8/10: 100%|██████████| 640/640 [01:49<00:00, 5.87steps/s]\n", "Node 9/10: 100%|██████████| 640/640 [02:21<00:00, 4.54steps/s]\n", "Node 10/10: 100%|██████████| 640/640 [02:12<00:00, 4.84steps/s]\n", "MASTERFUL: Meta-Learning Label Based Regularization...\n", "Node 1/2: 100%|██████████| 640/640 [02:07<00:00, 5.02steps/s]\n", "Node 2/2: 100%|██████████| 640/640 [02:35<00:00, 4.10steps/s]\n", "MASTERFUL: Meta-Learning Weight Based Regularization...\n", "MASTERFUL: Analysis finished in 39.65527991453806 minutes.\n", "MASTERFUL: Learned parameters coin-vigorous-figure saved at /Users/swookey/.masterful/policies/coin-vigorous-figure.\n" ] } ], "source": [ "# In order to speed up the guide and demonstrate the full workflow,\n", "# take only a small subset of the training and validation data.\n", "# In a real training workflow, you would use the full datasets.\n", "training_dataset = training_dataset.take(128)\n", "validation_dataset = validation_dataset.take(128)\n", "\n", "# Override the optimization parameters to only train for 1 epoch\n", "# to demonstrate the workflow. A real training workflow should use the\n", "# learned parameters directly.\n", "optimization_params.epochs = 1\n", "optimization_params.warmup_epochs = 0\n", "\n", "regularization_params = masterful.regularization.learn_regularization_params(\n", " object_detection_model,\n", " model_params,\n", " optimization_params,\n", " training_dataset,\n", " training_dataset_params,\n", " validation_dataset,\n", " validation_dataset_params,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Train the Model\n", "\n", "Now, you are ready to train your model using the Masterful\n", "AutoML platform. In the next cell, you will see the call to\n", "[masterful.training.train](../api/api_training.rst#masterful.training.train),\n", "which is the entry point to the training and meta-learning engine of the\n", "Masterful AutoML platform. Notice there is no need to batch\n", "your data (Masterful will find the optimal batch size for you).\n", "No need to shuffle your data (Masterful handles this for you). You\n", "hand Masterful a model and a dataset, and Masterful will figure\n", "the rest out for you.\n", "\n", "Note that in the section above, you overrode the number of training\n", "epochs to be 1, to speed up this guide. 
For obvious reasons, this will\n", "not fully train your model, but it is sufficient to demonstrate\n", "the training workflow." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T17:04:30.659856Z", "iopub.status.busy": "2022-03-25T17:04:30.658519Z", "iopub.status.idle": "2022-03-25T17:08:20.692486Z", "shell.execute_reply": "2022-03-25T17:08:20.693291Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MASTERFUL: Training model with semi-supervised learning disabled.\n", "MASTERFUL: Performing basic dataset analysis.\n", "MASTERFUL: Training model with:\n", "MASTERFUL: \t128 labeled examples.\n", "MASTERFUL: \t128 validation examples.\n", "MASTERFUL: \t0 synthetic examples.\n", "MASTERFUL: \t0 unlabeled examples.\n", "MASTERFUL: Training model with learned parameters coin-vigorous-figure in two phases.\n", "MASTERFUL: The first phase is supervised training with the learned parameters.\n", "MASTERFUL: The second phase is semi-supervised training to boost performance.\n", "MASTERFUL: Warming up model for supervised training.\n", "MASTERFUL: Starting Phase 1: Supervised training until the validation loss stabilizes...\n", "Supervised Training: 100%|██████████| 32/32 [02:36<00:00, 4.90s/steps]\n", "MASTERFUL: Semi-Supervised training disabled in parameters.\n", "MASTERFUL: Training complete in 3.4013052304585774 minutes.\n" ] } ], "source": [ "training_report = masterful.training.train(\n", " object_detection_model,\n", " model_params,\n", " optimization_params,\n", " regularization_params,\n", " ssl_params,\n", " training_dataset,\n", " training_dataset_params,\n", " validation_dataset,\n", " validation_dataset_params,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "## Evaluate the Model\n", "\n", "Once you have trained your model, how do you know that it performs\n", "well? The next step is to evaluate your model. Typically, you do this\n", "through the Tensorflow Object Detection API, which can take your pipeline\n", "configuration and run it in evaluation mode instead of training mode.\n", "However, Masterful can also evaluate your model directly.\n", "\n", "For example, the [TrainingReport](../api/api_training.rst#masterful.training.TrainingReport)\n", "returned by Masterful provides a Keras model wrapper of your Tensorflow\n", "Object Detection API model, so you can use standard Keras evaluation\n", "metrics to look at some intrinsic metrics, like the classification\n", "and localization loss values."
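, "\n", "\n", "Because the wrapper behaves like a standard Keras model, you can also call it directly on a batch of images and inspect the raw prediction dictionary it produces. The snippet below is a sketch; the batch size used here is arbitrary, and the exact keys in the prediction dictionary come from the underlying Tensorflow Object Detection `DetectionModel` rather than from Masterful." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab_type": "code" }, "outputs": [], "source": [ "# Illustrative: call the Keras wrapper directly on one batch of test images.\n", "# prediction_structure was declared as DICT above, so the output should be\n", "# a dictionary of output tensors keyed by name (the batch size here is\n", "# arbitrary for a single forward pass).\n", "for images, labels in test_dataset.batch(8).take(1):\n", "    predictions = training_report.model(images, training=False)\n", "    print(sorted(predictions.keys()))"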
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T17:08:20.701259Z", "iopub.status.busy": "2022-03-25T17:08:20.700028Z", "iopub.status.idle": "2022-03-25T17:08:51.605707Z", "shell.execute_reply": "2022-03-25T17:08:51.628848Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "619/619 [==============================] - 31s 49ms/step - loss: 4.7217 - Loss/localization_loss: 1.3038 - Loss/classification_loss: 3.3551 - Loss/regularization_loss: 0.0627 - Loss/total_loss: 4.7217\n" ] }, { "data": { "text/plain": [ "{'loss': 4.480246543884277,\n", " 'Loss/localization_loss': 1.065096139907837,\n", " 'Loss/classification_loss': 3.3524088859558105,\n", " 'Loss/regularization_loss': 0.0627414882183075,\n", " 'Loss/total_loss': 4.480246543884277}" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# You can use the model returned in the training report\n", "# to evaluate the loss of the model on the test dataset. This\n", "# model is a Keras model that wraps the TF DetectionModel and\n", "# allows you to use Keras model semantics.\n", "training_report.model.evaluate(\n", " test_dataset.batch(optimization_params.batch_size, drop_remainder=True),\n", " return_dict=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text" }, "source": [ "### COCO Evaluation Metrics\n", "\n", "A more standard way of measuring object detection performance is\n", "to evaluate using the MSCOCO evaluation metrics standard.\n", "The evaluation metrics are described [here](https://cocodataset.org/#detection-eval),\n", "and there is a common library [pycocotools](https://pypi.org/project/pycocotools/)\n", "which provides implementations of these metrics. Masterful\n", "provides an easy wrapper for these tools in\n", "[CocoEvaluationMetrics](../api/api_evaluation.rst#masterful.evaluation.detection.coco.CocoEvaluationMetrics)\n", "\n", "In order to use this evaluator, you need to tell the evaluator\n", "how to convert the predictions from the detection model into\n", "labels that can be used by the evaluator. 
Masterful provides\n", "a built-in prediction converter for Tensorflow Object Detection models\n", "in [predictions_to_labels](../api/api_architecture.rst#masterful.architecture.detection.predictions_to_labels)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "colab_type": "code", "execution": { "iopub.execute_input": "2022-03-25T17:08:51.659393Z", "iopub.status.busy": "2022-03-25T17:08:51.656638Z", "iopub.status.idle": "2022-03-25T17:37:09.072588Z", "shell.execute_reply": "2022-03-25T17:37:09.073236Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100%|██████████| 4952/4952 [28:08<00:00, 2.93it/s]\n", "creating index...\n", "index created!\n", "creating index...\n", "index created!\n", "Running per image evaluation...\n", "Evaluate annotation type *bbox*\n", "DONE (t=6.73s).\n", "Accumulating evaluation results...\n", "DONE (t=1.69s).\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000\n", " Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000\n", " Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.003\n", " Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.002\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.003\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.003\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.001\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.013\n", " Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000\n" ] }, { "data": { "text/plain": [ "{'DetectionBoxes_Precision/mAP': 7.872415909767271e-05,\n", " 'DetectionBoxes_Precision/mAP@.50IOU': 0.0003891330980391314,\n", " 'DetectionBoxes_Precision/mAP@.75IOU': 2.3686579184234213e-05,\n", " 'DetectionBoxes_Precision/mAP (small)': 0.00018205932072348492,\n", " 'DetectionBoxes_Precision/mAP (medium)': 0.003112683424234397,\n", " 'DetectionBoxes_Precision/mAP (large)': -1.0,\n", " 'DetectionBoxes_Recall/AR@1': 0.0019719104460060475,\n", " 'DetectionBoxes_Recall/AR@10': 0.0030807878813660853,\n", " 'DetectionBoxes_Recall/AR@100': 0.0030807878813660853,\n", " 'DetectionBoxes_Recall/AR@100 (small)': 0.0009905384339650707,\n", " 'DetectionBoxes_Recall/AR@100 (medium)': 0.012506642532657278,\n", " 'DetectionBoxes_Recall/AR@100 (large)': -1.0}" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Masterful also provides a COCO evaluator to measure\n", "# the performance of your model using the COCO evaluation\n", "# metrics.\n", "from masterful.evaluation.detection.coco import CocoEvaluationMetrics\n", "from masterful.architecture.detection import predictions_to_labels\n", "\n", "# The COCO evaluator needs to understand the mapping between\n", "# your class ids and their semantic names, in order to produce\n", "# human readable output. You can put anything you want here,\n", "# as long as you have an entry for each class label. 
Below are\n", "# the class names for the 20 VOC labels.\n", "VOC_CLASS_NAMES = [\n", " \"aeroplane\",\n", " \"bicycle\",\n", " \"bird\",\n", " \"boat\",\n", " \"bottle\",\n", " \"bus\",\n", " \"car\",\n", " \"cat\",\n", " \"chair\",\n", " \"cow\",\n", " \"diningtable\",\n", " \"dog\",\n", " \"horse\",\n", " \"motorbike\",\n", " \"person\",\n", " \"pottedplant\",\n", " \"sheep\",\n", " \"sofa\",\n", " \"train\",\n", " \"tvmonitor\",\n", "]\n", "\n", "# Categories is a list of dictionaries mapping each class id to its\n", "# semantic label above.\n", "categories = [\n", " {\"id\": class_id, \"name\": str(class_name)}\n", " for class_id, class_name in zip(range(NUM_CLASSES), VOC_CLASS_NAMES)\n", "]\n", "evaluator = CocoEvaluationMetrics(categories)\n", "\n", "# Evaluate the model on the test dataset, which has\n", "# never been seen by your model before. The predictions_to_labels\n", "# function tells the evaluator how to interpret\n", "# the predictions of the model.\n", "evaluator.evaluate_model(\n", " training_report.model,\n", " predictions_to_labels(\n", " object_detection_model,\n", " MAX_BOUNDING_BOXES,\n", " ),\n", " test_dataset,\n", " NUM_CLASSES,\n", ")" ] } ], "metadata": { "colab": { "collapsed_sections": [], "name": "guide_object_detection", "private_outputs": false, "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 0 }