Data Input for Instance Segmentation
------------------------------------

.. _dataset_format:

Instance segmentation expects directories of images for training or validation, along with annotation files in COCO format. The naming convention for the train/val split may differ, because the path to each set is specified individually in the data preparation script in the IPython notebook example. The image data and the corresponding annotation files are then converted to TFRecords for training.

COCO format for Instance Segmentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using the COCO format requires the data to be organized in this structure:

.. code::

    annotation{
        "id": int,
        "image_id": int,
        "category_id": int,
        "segmentation": RLE or [polygon],
        "area": float,
        "bbox": [x,y,width,height],
        "iscrowd": 0 or 1,
    }

    image{
        "id": int,
        "width": int,
        "height": int,
        "file_name": str,
        "license": int,
        "flickr_url": str,
        "coco_url": str,
        "date_captured": datetime,
    }

    categories[{
        "id": int,
        "name": str,
        "supercategory": str,
    }]

An example COCO annotation file is shown below:

.. code::

    "annotations": [{
        "segmentation": [[510.66,423.01,511.72,420.03,510.45,416.0,510.34,413.02,
                          510.77,410.26,510.77,407.5,510.34,405.16,511.51,402.83,
                          511.41,400.49,510.24,398.16,509.39,397.31,504.61,399.22,
                          502.17,399.64,500.89,401.66,500.47,402.08,499.09,401.87,
                          495.79,401.98,490.59,401.77,488.79,401.77,485.39,398.58,
                          483.9,397.31,481.56,396.35,478.48,395.93,476.68,396.03,
                          475.4,396.77,473.92,398.79,473.28,399.96,473.49,401.87,
                          474.56,403.47,473.07,405.59,473.39,407.71,476.68,409.41,
                          479.23,409.73,481.56,410.69,480.4,411.85,481.35,414.93,
                          479.86,418.65,477.32,420.03,476.04,422.58,479.02,422.58,
                          480.29,423.01,483.79,419.93,486.66,416.21,490.06,415.57,
                          492.18,416.85,491.65,420.24,492.82,422.9,493.56,424.39,
                          496.43,424.6,498.02,423.01,498.13,421.31,497.07,420.03,
                          497.07,415.15,496.33,414.51,501.1,411.96,502.06,411.32,
                          503.02,415.04,503.33,418.12,501.1,420.24,498.98,421.63,
                          500.47,424.39,505.03,423.32,506.2,421.31,507.69,419.5,
                          506.31,423.32,510.03,423.01,510.45,423.01]],
        "area": 702.1057499999998,
        "iscrowd": 0,
        "image_id": 289343,
        "bbox": [473.07,395.93,38.65,28.67],
        "category_id": 18,
        "id": 1768
    }],
    "images": [{
        "license": 1,
        "file_name": "000000407646.jpg",
        "coco_url": "http://images.cocodataset.org/val2017/000000407646.jpg",
        "height": 400,
        "width": 500,
        "date_captured": "2013-11-23 03:58:53",
        "flickr_url": "http://farm4.staticflickr.com/3110/2855627782_17b93a684e_z.jpg",
        "id": 407646
    }],
    "categories": [{"supercategory": "person","id": 1,"name": "person"},
                   {"supercategory": "vehicle","id": 2,"name": "bicycle"},
                   {"supercategory": "vehicle","id": 3,"name": "car"},
                   {"supercategory": "vehicle","id": 4,"name": "motorcycle"}]

For more details, refer to the official COCO data format description.

A COCO dataset preparation script is provided in the TLT container, which automatically downloads the dataset and converts it to TFRecords. In the MaskRCNN notebook, you can run the script as follows:

.. code::

    download_and_preprocess_coco.sh $data_dir

When using a custom dataset, follow the COCO format closely and convert the dataset to TFRecords using the following command (refer to L68-75 in download_and_preprocess_coco.sh for more detail):

.. code::

    python create_coco_tf_record.py --logtostderr --include_masks \
        --train_image_dir=$TRAIN_IMAGE_DIR \
        --val_image_dir=$VAL_IMAGE_DIR \
        --train_object_annotations_file=$TRAIN_COCO_ANNOTATION_FILE \
        --val_object_annotations_file=$VAL_ANNOTATION_FILE \
        --output_dir=$OUTPUT_DIR
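The annotation, image, and category records described above can be assembled into a minimal COCO file with only the Python standard library. The sketch below is illustrative: the file name, ids, coordinates, and category are placeholder values, not data from a real dataset.

.. code:: python

    import json

    # Minimal sketch of a COCO-format annotation file: one image with a
    # single polygon annotation. All ids, file names, and coordinates are
    # illustrative placeholders.
    coco = {
        "images": [{
            "id": 1,
            "width": 640,
            "height": 480,
            "file_name": "example_000001.jpg",
        }],
        "annotations": [{
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # A single polygon: [x1, y1, x2, y2, ...] in pixel coordinates.
            "segmentation": [[100.0, 100.0, 200.0, 100.0,
                              200.0, 200.0, 100.0, 200.0]],
            "area": 10000.0,
            "bbox": [100.0, 100.0, 100.0, 100.0],  # [x, y, width, height]
            "iscrowd": 0,
        }],
        "categories": [{"id": 1, "name": "person", "supercategory": "person"}],
    }

    with open("instances_example.json", "w") as f:
        json.dump(coco, f)

    # Round-trip to confirm the structure is valid JSON.
    loaded = json.load(open("instances_example.json"))
    print(len(loaded["annotations"]))  # prints 1

A file built this way can be passed to create_coco_tf_record.py as the train or val annotations file, provided the referenced images exist in the corresponding image directory.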
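Before converting a custom dataset, it can help to sanity-check the annotation file for the most common COCO-format mistakes: annotations that reference missing images or categories, and boxes that fall outside their image. The helper below is a sketch using only the standard library; it is not part of the TLT tooling, and the function name is hypothetical.

.. code:: python

    import json

    def check_coco(path):
        """Lightweight consistency checks for a COCO annotation file.

        Verifies that every annotation references an existing image and
        category, and that each bbox lies inside its image bounds.
        Returns a list of human-readable problem descriptions (empty if
        the file passes). Illustrative sketch, not part of TLT.
        """
        data = json.load(open(path))
        image_by_id = {img["id"]: img for img in data["images"]}
        category_ids = {cat["id"] for cat in data["categories"]}
        problems = []
        for ann in data["annotations"]:
            img = image_by_id.get(ann["image_id"])
            if img is None:
                problems.append(
                    f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
                continue
            if ann["category_id"] not in category_ids:
                problems.append(
                    f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
            x, y, w, h = ann["bbox"]
            if x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
                problems.append(
                    f"annotation {ann['id']}: bbox {ann['bbox']} outside image bounds")
        return problems

Running the check on both the train and val annotation files before invoking create_coco_tf_record.py surfaces format errors early, rather than partway through TFRecord conversion.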