Learning to Understand Aerial Images (LUAI)

Overview

We propose three tasks. Task1 is to detect instances with oriented bounding boxes. Task2 is to detect instances with horizontal bounding boxes. Task3 is to detect semantic labels for each pixel. Task 1 and Task 2 is based on DOTA-v2.0. Task 3 is based on GID-15. You can use the provided train/val data to train and validate your detector. Validation data may also be used for training when submitting results on the test set. External data of any form is allowed. But must be reported during submission. Fine-tuning models that are pretrained on ImageNet or COCO are also allowed.

Task1 - Detection with oriented bounding boxes

The purpose of this task is to localize the ground object instances with an oriented bounding box. The oriented bounding box follows the same format with the original annotation {(x _i, y _i), i = 1,2,3,4}.

Submission Format

You will be asked to submit a zip file (example of task1) containing results for all test images to evaluate your results. The results are stored in 18 files, "Task1_plane.txt, Task1_storage-tank.txt, ...", each file contains all the results for a specific category. Each file is in the following format:

imgname score x₁ y₁ x₂ y₂ x₃ y₃ x₄ y₄
imgname score x₁ y₁ x₂ y₂ x₃ y₃ x₄ y₄
...

Evaluation Protocol

The evaluation protocol for the oriented bounding box is a little different from the protocol in the original PASCAL VOC. We use the intersection over the union area of two polygons(ground truth and prediction) to calculate the IoU. The rest follows the PASCAL VOC.

Task2 - Detection with horizontal bounding boxes

Detecting object with horizontal bounding boxes is usual in many previous contests for object detection. The aim of this task is to accurately localize the instance in terms of horizontal bounding box with (x, y, w, h) format. In the task, the ground truths for training and testing are generated by calculating the axis-aligned bounding boxes over original annotated bounding boxes.

Submission Format

You will be asked to submit a zip file (example of task2) containing results for all test images to evaluate your results. The results are stored in 16 files, "Task2_plane.txt, Task2_storage-tank.txt, ...", each file contains all the results for a specific category. The format of the results is:

imgname score xmin ymin xmax ymax
imgname score xmin ymin xmax ymax
...

Evaluation Protocol

The evaluation protocol for horizontal bounding boxes follows the PASCAL VOC benchmark, which uses mean Average Precision( mAP) as the primary metric.

Task3 - Semantic Segmentation

The aim of this task is to give the semantic category for each pixel in aerial images.

Submission Format

Participants will be asked to submit a zip file (example of task3) containing results (stored in ".png" format) for all test images. Each ".png" file should have the same name as the corresponding tested image. Image dimensions of ".png" files must be equal to input RGB image dimensions. Each category is represented by a specific value:

Paddyfield : 5
Urbanresidential: 2

Evaluation Protocol

The evaluation protocol adopts the mIoU.