Journal: Scientific Reports, 21 March 2022
The semantic segmentation of omnidirectional urban driving images is a research topic that has increasingly attracted the attention of researchers, because the use of such images in driving scenes is highly relevant. However, the case of motorized two-wheelers has not yet been treated. Since the dynamics of these vehicles are very different from those of cars, we focus our study on images acquired using a motorcycle. This paper provides a thorough comparative study of how different deep learning approaches handle omnidirectional images with different representations, including perspective, equirectangular, spherical, and fisheye, and presents the best solution to segment road scene omnidirectional images. In this study we use real perspective images; synthetic perspective, fisheye, and equirectangular images; simulated fisheye images; and a test set of real fisheye images. By analyzing both qualitative and quantitative results, this study yields several conclusions, as it helps to understand how the networks learn to deal with omnidirectional distortions. Our main findings are that models with planar convolutions give better results than those with spherical convolutions, and that models trained on omnidirectional representations transfer better to standard perspective images than vice versa.
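To make the "omnidirectional distortions" mentioned above concrete, the sketch below (an illustrative assumption, not code from the paper) maps an equirectangular pixel to a 3D viewing ray. It shows why a fixed planar convolution kernel covers a varying solid angle across the image: near the poles, neighboring pixels correspond to nearly identical directions on the sphere.

```python
import math

def equirect_to_ray(u, v, width, height):
    """Map pixel (u, v) of a width x height equirectangular image
    to a unit direction vector (x, y, z) on the sphere.
    Hypothetical helper for illustration only."""
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

# The image center maps to the forward direction (0, 0, 1);
# rows near v = 0 or v = height all map close to the poles,
# which is the latitude-dependent distortion planar kernels must absorb.
```

Spherical convolution approaches compensate for this mapping by adapting the kernel's sampling pattern to the sphere, whereas the planar models compared in the paper learn to cope with the distortion directly from the data.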