Semantic segmentation of remote sensing images aims to depict land covers with pixel-wise semantic categories, and therefore needs to portray diverse distributions over vast geographical locations. This is difficult to achieve with the homogeneous pixel-wise forward paths in the architectures of existing deep models. Although several algorithms have been designed to select pixel-wise adaptive forward paths for natural image analysis, theoretical support on how to obtain optimal selections is still lacking. In this paper, we provide mathematical analyses in terms of parameter optimization, which guide us to design a method called Hidden Path Selection Network (HPS-Net). With the help of hidden variables derived from an extra mini-branch, HPS-Net tackles the inherent problem of inaccessible global optimums by adjusting the direct relationships between feature maps and pixel-wise path selections in existing algorithms, which we call hidden path selection. For better training and evaluation, we further refine and expand the 5-class Gaofen Image Dataset (GID-5) into a new dataset with 15 land-cover categories, i.e., GID-15. The experimental results on both GID-5 and GID-15 demonstrate that the proposed modules can stably improve the performance of different deep structures, which validates the proposed mathematical analyses.
In the proposed Hidden Path Selection Network (HPS-Net), we portray the complicated pixel-wise land-cover distributions by imitating the hidden Markov chain. As illustrated in the figure, we design HPS-Net with independent parallel structures (i.e., a main-branch for semantic segmentation and a supporting mini-branch) built from a sequence of special modules called Hidden Path Modules (HP-Modules). Treating all items in the main-branch as variables, we notice that the high-dimensional manifold on which we execute the parameter optimization can be adjusted by utilizing hidden variables that are highly independent of the main-branch. Thus, the HP-Module leverages hidden variables from the mini-branch to obtain a sequence of soft masks, which select pixel-wise adaptive forward paths in the main-branch through multiplication. The adjusted high-dimensional manifold is then more likely to contain the global optimums.
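The masking step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `hidden_path_select`, the use of 1x1 channel-mixing matrices as candidate paths, the softmax over paths, and all shapes are assumptions made purely for illustration. The only property taken from the text is that the soft masks are computed from the mini-branch hidden variables alone and are applied to the main-branch features by multiplication.

```python
import numpy as np


def softmax(z, axis=0):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def hidden_path_select(main_feat, hidden_feat, path_weights, mask_weights):
    """Hypothetical HP-Module-like step (illustrative assumption only).

    main_feat:    (C, H, W) main-branch feature map
    hidden_feat:  (D, H, W) mini-branch hidden variables
    path_weights: (K, C, C) one 1x1 channel-mixing matrix per candidate path
    mask_weights: (K, D)    1x1 projection from hidden variables to K mask logits
    """
    # Evaluate every candidate forward path on the main-branch features: (K, C, H, W).
    paths = np.einsum("koc,chw->kohw", path_weights, main_feat)

    # Mask logits depend only on the mini-branch hidden variables: (K, H, W).
    logits = np.einsum("kd,dhw->khw", mask_weights, hidden_feat)

    # Soft masks: at every pixel, the K path weights sum to 1 (assumed softmax).
    masks = softmax(logits, axis=0)

    # Pixel-wise path selection through multiplication, then merge the paths.
    out = (masks[:, None, :, :] * paths).sum(axis=0)
    return out, masks


# Toy sizes, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
C, D, K, H, W = 8, 4, 3, 5, 5
main_feat = rng.standard_normal((C, H, W))
hidden_feat = rng.standard_normal((D, H, W))
path_weights = rng.standard_normal((K, C, C))
mask_weights = rng.standard_normal((K, D))

out, masks = hidden_path_select(main_feat, hidden_feat, path_weights, mask_weights)

# Changing the main-branch features leaves the masks untouched, because the
# masks are driven by the hidden variables only.
_, masks_other = hidden_path_select(main_feat * 2.0, hidden_feat, path_weights, mask_weights)
```

In this sketch the path selection is decoupled from the main-branch feature map, which is the contrast with methods that derive the selection directly from those features. How the actual HP-Module builds its paths and masks should be taken from the paper.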
Data samples taken from our proposed GID-15 dataset. Different colors indicate different land-cover categories. Obscure land covers are blackened in the annotation masks. Several intact data samples are shown at the top and some cropped data patches at the bottom, illustrating the fine-grained annotation and the wide geographical areas covered.
GID-15 can be downloaded from Baidudrive:
We evaluate the proposed HPS-Net on both GID-5 and GID-15. The classical semantic segmentation algorithm Deeplabv3* and the state-of-the-art pixel-wise dynamic model GPS-Net are selected for visual comparison. The experimental results demonstrate that the proposed modules can stably improve the performance of different deep structures.
Comparison of our HPS-Net with the state-of-the-art pixel-wise dynamic method GPSNet on GID-15, with ResNet-101 as the backbone. Our proposed HPS-Net better depicts pixel-wise land-cover distributions. The blackened pixels are the ignored regions in the annotation masks, which are also ignored in the semantic segmentation results.
Comparison of our HPS-Net with the basic residual structure Deeplabv3* on GID-5, with ResNet-101 as the backbone. The examples display the merits of applying the proposed hidden path selection to Deeplabv3* (i.e., HPS-Net-101).
@misc{yang2021hidden,
  title={Hidden Path Selection Network for Semantic Segmentation of Remote Sensing Images},
  author={Kunping Yang and Xin-Yi Tong and Gui-Song Xia and Weiming Shen and Liangpei Zhang},
  year={2021},
  eprint={2112.05220},
}
E-mail : kunpingyang@whu.edu.cn
             guisong.xia@whu.edu.cn