SageMaker Object Detection preprocessing

The Amazon SageMaker Object Detection algorithm detects and classifies objects in images using a single deep neural network. It is a supervised learning algorithm that takes images as input and identifies all instances of objects within the image scene.

If you want to classify whole images or scenes instead, see SageMaker Image Classification preprocessing.

Upload data to S3

To keep things simple, split your dataset into a training set and a testing/validation set before uploading. A common split is 80% training data and 20% testing/validation data. Upload and organize the images in two separate folders in S3 (one for training and the other for testing/validation).
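The split and upload described above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function names, the fixed seed, and the bucket and folder names in the usage note are all hypothetical, and the upload step assumes boto3 is installed and AWS credentials are configured.

```python
import random
from pathlib import Path


def split_dataset(image_paths, train_fraction=0.8, seed=42):
    """Shuffle the paths reproducibly and split them into (train, validation) lists."""
    paths = sorted(image_paths)           # sort first so the shuffle is deterministic
    random.Random(seed).shuffle(paths)    # seeded shuffle, independent of global state
    cut = int(len(paths) * train_fraction)
    return paths[:cut], paths[cut:]


def upload_split(paths, bucket, prefix):
    """Upload local files to s3://<bucket>/<prefix>/<filename>."""
    import boto3  # imported here so the split logic runs without boto3 installed

    s3 = boto3.client("s3")
    for path in paths:
        s3.upload_file(str(path), bucket, f"{prefix}/{Path(path).name}")
```

A possible usage, with a hypothetical local folder and bucket name: call `split_dataset(list(Path("images").glob("*.jpg")))`, then `upload_split(train, "my-bucket", "train")` and `upload_split(validation, "my-bucket", "validation")`.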

Create a dataset

We will use SageMaker Ground Truth labeling jobs to build the training dataset.

Navigate to SageMaker on the console and click “Labeling jobs” under Ground Truth:

Click on “Create labeling job”.

Enter details such as the job name, input dataset location, output dataset location, and IAM role.

If you do not already have a manifest file for your input dataset, click on “Create manifest file” under Input dataset location.

Enter the S3 location of the input dataset and click “Create”.

Once the manifest file is created, click on “Use this manifest”.
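For reference, an input manifest for image data is a JSON Lines file in which each line points to one image through a `source-ref` key. The console creates this file for you; the sketch below only illustrates the format, and the function names, bucket, and keys are hypothetical.

```python
import json


def build_manifest_lines(bucket, keys):
    """Return one JSON line per image, each pointing at its S3 URI via 'source-ref'."""
    return [json.dumps({"source-ref": f"s3://{bucket}/{key}"}) for key in keys]


def write_manifest(path, lines):
    """Write the manifest as JSON Lines: one JSON object per line."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

For example, `build_manifest_lines("my-bucket", ["train/cat1.jpg"])` produces a single line whose `source-ref` is `s3://my-bucket/train/cat1.jpg`.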

Continuing on the same page, select “Image” under Task category and select “Bounding box” under Task selection. Click Next.

Ground Truth supports multiple worker types (Amazon Mechanical Turk, private, and vendor managed). In this example, we will use “Private” workers. Fill in the appropriate details.

Scrolling down, add a brief description that helps workers understand the job, along with the labels they can choose from. Click on “Create”.

A labeling job will be created with the task type “Bounding Box”.

The workers will receive an email inviting them to perform the labeling.
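The console stores the labels you entered as a label category configuration file. If you ever need to prepare that file yourself (for example, when creating jobs through the API instead of the console), a minimal sketch could look like the following. The function name and the example labels are hypothetical; the `document-version` value is the one I understand Ground Truth to expect, so verify it against the current documentation.

```python
import json


def build_label_config(labels):
    """Build a label category configuration dict from a list of label names."""
    return {
        "document-version": "2018-11-28",  # assumed schema version; check the docs
        "labels": [{"label": name} for name in labels],
    }


if __name__ == "__main__":
    # Hypothetical labels, printed as the JSON that would be saved to S3.
    print(json.dumps(build_label_config(["cat", "dog"]), indent=2))
```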

The first time they log in to the portal using the credentials provided in the email, they will be asked to change their password. Once logged in, they will see the job and can click on “Start working”.

The worker will be presented with the images that need to be labeled with bounding boxes.

The worker needs to select the label first and then draw a bounding box around the object.

If the image contains multiple objects, the worker needs to select a label and draw a bounding box for each object.

Once all the images are labeled, the worker can log out of the portal.

The labeling job on the SageMaker console will show its status as “Complete”.
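Once the job is complete, Ground Truth writes an output manifest to the output location, with one JSON line per image containing the bounding boxes. The sketch below shows one way to check the job status with boto3 and to read the boxes from a manifest line. It assumes the usual bounding box output layout, in which annotations sit under the label attribute name (by default the job name) and class names sit under `<label attribute>-metadata`. The function names and the sample values in the usage note are hypothetical.

```python
import json


def labeling_job_status(job_name):
    """Return the job status string reported by the SageMaker API."""
    import boto3  # imported here so the parsing code runs without boto3 installed

    sm = boto3.client("sagemaker")
    return sm.describe_labeling_job(LabelingJobName=job_name)["LabelingJobStatus"]


def parse_output_line(line, label_attribute):
    """Return (image S3 URI, list of boxes) for one output manifest line."""
    record = json.loads(line)
    # Map class ids to names when the metadata block is present.
    class_map = record.get(f"{label_attribute}-metadata", {}).get("class-map", {})
    boxes = []
    for ann in record[label_attribute]["annotations"]:
        class_id = str(ann["class_id"])
        boxes.append({
            "label": class_map.get(class_id, class_id),  # fall back to the raw id
            "left": ann["left"],
            "top": ann["top"],
            "width": ann["width"],
            "height": ann["height"],
        })
    return record["source-ref"], boxes
```

For a job whose label attribute is `my-job`, calling `parse_output_line(line, "my-job")` on each line of the output manifest yields the image URI and its boxes in pixel coordinates. Note that the API reports a finished job as `Completed`, which corresponds to the “Complete” status shown in the console.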