Journal: Multimedia Tools and Applications, October 5, 2023
Semantic image segmentation is an essential task for autonomous vehicles and self-driving cars, where complete, real-time perception of the surroundings is mandatory. Convolutional Neural Network (CNN) approaches to semantic segmentation stand out over other state-of-the-art solutions due to their powerful generalization to unknown data and their end-to-end training. Fisheye images are important due to their large field of view and their ability to reveal information from broader surroundings. Nevertheless, they pose unique challenges for CNNs because of the object distortion introduced by the fisheye lens, which varies with object position. In addition, the large annotated fisheye datasets required for CNN training are rather limited. In this paper, we investigate the use of deformable convolutions to accommodate distortions in fisheye image segmentation with a fully residual U-Net, learning unknown geometric transformations via filters of variable shape and size. The proposed models and integration strategies are exploited within two main paradigms: single (front)-view and multi-view fisheye image segmentation. The proposed methods are validated on synthetic and real fisheye images from the WoodScape and SynWoodScape datasets. The results confirm the significance of the deformable fully residual U-Net structure in learning unknown geometric distortions in both paradigms, demonstrate the possibility of learning view-agnostic distortion properties when trained on multi-view data, and shed light on the role of surround-view images in increasing segmentation performance relative to the single view. Finally, our experiments suggest that deformable convolutions are a powerful tool for increasing the efficiency of fully residual U-Nets for semantic segmentation of automotive fisheye images.
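To make the core mechanism concrete, the sketch below illustrates the sampling idea behind deformable convolutions in plain NumPy: for every output location, each kernel tap is displaced by a learned 2D offset and the input is read with bilinear interpolation at the resulting fractional position. This is a minimal single-channel illustration, not the paper's implementation; the function names (`bilinear`, `deform_conv2d_sketch`) and the offset-tensor layout are assumptions made for clarity, and in practice the offsets would be predicted by a companion convolution and the whole operation would run on GPU (e.g. via a library deformable-convolution operator).

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinearly sample a 2D array at fractional (y, x); zero outside."""
    H, W = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                # Weights shrink linearly with distance to the sample point.
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * img[yy, xx]
    return val

def deform_conv2d_sketch(img, weight, offsets):
    """Single-channel deformable convolution sketch (hypothetical helper).

    img:     (H, W) input.
    weight:  (k, k) kernel.
    offsets: (H, W, k*k, 2) learned per-location (dy, dx) displacements,
             one per kernel tap; zeros reduce this to a regular convolution.
    """
    H, W = img.shape
    k = weight.shape[0]
    r = k // 2
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    dy, dx = offsets[i, j, a * k + b]
                    # Sample at the regular grid point plus its learned offset.
                    acc += weight[a, b] * bilinear(img, i + a - r + dy,
                                                   j + b - r + dx)
            out[i, j] = acc
    return out
```

With all offsets at zero the operator coincides with an ordinary convolution; letting the network learn non-zero, spatially varying offsets is what allows the receptive field to bend along the radially distorted structures of a fisheye image.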