DiRS: On Creating Benchmark Datasets for Remote Sensing Image Interpretation
Yang Long (1),
Gui-Song Xia (1,2,*),
Shengyang Li (3),
Wen Yang (1,4),
Michael Ying Yang (5),
Xiao Xiang Zhu (6),
Liangpei Zhang (1),
Deren Li (1).
1. State Key Lab. LIESMARS, Wuhan University, Wuhan 430079, China
2. School of Computer Science, Wuhan University, Wuhan 430079, China
3. Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, Beijing 100094, China
4. School of Electronic Information, Wuhan University, Wuhan 430072, China
5. Faculty of Geo-Information Science and Earth Observation, University of Twente, Hengelosestraat 99, Enschede, Netherlands
6. German Aerospace Center (DLR) and also Technical University of Munich, Germany
1. Abstract
The past decade has witnessed great progress in remote sensing (RS) image interpretation and
its wide applications. With RS images becoming more accessible than ever before, there is an
increasing demand for the automatic interpretation of these images, where benchmark datasets are
essential prerequisites for developing and testing intelligent interpretation algorithms. After
reviewing existing benchmark datasets in the research community of RS image interpretation, this
article discusses the problem of how to efficiently prepare a suitable benchmark dataset for RS
image analysis. Specifically, we first analyze the current challenges of developing intelligent
algorithms for RS image interpretation with bibliometric investigations. We then present some
principles, i.e., diversity, richness, and scalability (called DiRS), for
constructing benchmark datasets efficiently. Following the DiRS principles, we also
provide an example of building a dataset for RS image classification, i.e.,
Million-AID, a new large-scale benchmark dataset containing a million instances for RS scene
classification. Finally, several challenges and perspectives in RS image annotation are discussed to
facilitate research on benchmark dataset construction.
We hope this paper will give the RS community an overall perspective on constructing large-scale
and practical image datasets for further research, especially data-driven research.
3. DiRS: Principles to Build RS Image Benchmarks
The primary consideration in constructing a meaningful RS image dataset is that the dataset should be
created on the basis of the requirements of practical applications rather than the characteristics
of the algorithms to be employed. Moreover, the annotation of an RS image dataset is better conducted
by the application side than by the algorithm developers. The annotated dataset is then
naturally application-oriented, which helps enhance the practicability of the
interpretation algorithm. With these points in mind, diversity,
richness, and scalability (called DiRS)
can be considered the basic principles for creating benchmark datasets for RS image interpretation.
We believe that these principles are complementary to each other. That is, improving a dataset under
one principle can simultaneously promote the dataset quality reflected in the other principles.
4. An Example: Million-AID
Following the DiRS principles, we provide an example of building a dataset for RS image classification,
i.e., Million-AID, a new large-scale benchmark dataset containing a million instances for RS scene
classification. Million-AID covers a wide range of semantic categories: 51 scene categories
organized by a hierarchical category network in the form of a three-level tree. The 51 leaf nodes fall
into 28 parent nodes at the second level, which are grouped into 8 nodes at the first level, representing
the 8 underlying scene categories of agriculture land, commercial land, industrial land, public service
land, residential land, transportation land, unutilized land, and water area. The scene category network
gives the dataset a clear organization of the relationships among scene categories as well as the
property of scalability. The number of images in each scene category ranges from about 2,000 to 45,000,
giving the dataset a long-tail distribution. In addition, Million-AID has advantages over
existing scene classification datasets owing to its high spatial resolution, large scale, and global
distribution.
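The three-level category network described above can be pictured as a small tree structure. The following is only a minimal sketch, not the authors' implementation: the `CategoryNode` class, the root name `scene`, and the helper `build_example_tree` are hypothetical, and the second-level and leaf names (`airport area`, `runway`, `apron`, `water body`, `lake`) are illustrative placeholders rather than the official Million-AID labels. Only the eight first-level names are taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CategoryNode:
    """One node of a scene category tree (hypothetical helper class)."""
    name: str
    children: list[CategoryNode] = field(default_factory=list)

    def add(self, name: str) -> CategoryNode:
        """Attach a new child category and return it."""
        child = CategoryNode(name)
        self.children.append(child)
        return child

    def at_depth(self, depth: int) -> list[CategoryNode]:
        """Return all nodes located `depth` levels below this node."""
        if depth == 0:
            return [self]
        return [n for child in self.children for n in child.at_depth(depth - 1)]

    def path_to(self, target: str) -> list[str] | None:
        """Return the list of names from this node down to `target`, if present."""
        if self.name == target:
            return [self.name]
        for child in self.children:
            sub = child.path_to(target)
            if sub is not None:
                return [self.name] + sub
        return None


# The eight first-level categories named in the text.
FIRST_LEVEL = [
    "agriculture land", "commercial land", "industrial land",
    "public service land", "residential land", "transportation land",
    "unutilized land", "water area",
]


def build_example_tree() -> CategoryNode:
    """Build a partial tree; second- and third-level names are placeholders."""
    root = CategoryNode("scene")  # assumed root name
    level1 = {name: root.add(name) for name in FIRST_LEVEL}
    # Illustrative second-level and leaf categories (not the official labels).
    airport = level1["transportation land"].add("airport area")
    airport.add("runway")
    airport.add("apron")
    level1["water area"].add("water body").add("lake")
    return root


root = build_example_tree()
print(len(root.at_depth(1)))   # number of first-level categories
print(root.path_to("runway"))  # hierarchical label of one leaf category
```

In the complete network, `at_depth(1)`, `at_depth(2)`, and `at_depth(3)` would hold the 8, 28, and 51 nodes mentioned in the text. The sketch also hints at why such a structure is scalable: adding a new leaf under an existing parent leaves every other node and its path untouched.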
- Category Network
- Semantic Coordinates Collection
- Scene Image Acquisition
- Data Download
Million-AID will be released for public accessibility.