Monocular depth estimation through virtual-world supervision and real-world SfM self-supervision
Gurram, Akhil (Huawei Munich Research Center)
Tuna, Ahmet Faruk (Huawei Munich Research Center)
Shen, Fengyi (Technische Universität München. Department of Informatics)
Urfalioglu, Onay (Huawei Munich Research Center)
López Peña, Antonio M. (Universitat Autònoma de Barcelona. Departament de Ciències de la Computació)

Date: 2022
Abstract: Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is very appealing since it places appearance and depth in direct pixelwise correspondence without further calibration. The best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, using only a monocular system at training time as well is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate for the limitations of SfM self-supervision by leveraging virtual-world images with accurate semantic and depth supervision, and by addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.
Grants: Agencia Estatal de Investigación TIN2017-88709-R
Note: Antonio acknowledges the financial support for his general research activities given by ICREA under the ICREA Academia Program. Antonio also acknowledges the support of the Generalitat de Catalunya CERCA Program, as well as its ACCIO agency, for CVC's general activities.
Rights: All rights reserved.
Language: English
Document: Article; research; version accepted for publication
Subject: Training; Estimation; Semantics; Cameras; Laser radar; Optical imaging; Sensors
Published in: IEEE Transactions on Intelligent Transportation Systems, Vol. 23, issue 8 (Aug. 2022), p. 12738-12751, ISSN 1558-0016

DOI: 10.1109/TITS.2021.3117059


Available from: 2024-08-30
Postprint

The record appears in these collections:
Articles > Research articles
Articles > Published articles

Record created 2023-05-26, last modified 2023-05-31


