Abstract
We present a novel holistic deep-learning approach for multi-task learning from a single indoor panoramic image. Our framework, named MultiPanoWise, extends vision transformers to jointly infer multiple pixel-wise signals, such as depth, normals, and semantic segmentation, as well as signals from intrinsic decomposition, such as reflectance and shading. Our solution combines a transformer-based encoder-decoder with multiple heads and introduces a novel context adjustment approach that enforces knowledge distillation between the various signals. Moreover, at training time we introduce a hybrid loss scalarization method based on an augmented Chebychev/hypervolume scheme. We illustrate the capabilities of the proposed architecture on public-domain synthetic and real-world datasets. We demonstrate performance improvements with respect to the most recent methods specifically designed for single tasks, such as depth estimation or semantic segmentation. To our knowledge, this is the first architecture capable of achieving state-of-the-art performance on the joint extraction of heterogeneous signals from single indoor omnidirectional images.
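The abstract names an augmented Chebychev component in the training loss but does not state its formula. As a rough illustration only, the sketch below implements the generic textbook form of augmented Chebyshev scalarization, which combines the worst weighted per-task gap with a small weighted sum of all gaps. The function name, the weights, the `rho` value, the zero reference point, and the example loss values are all hypothetical and are not taken from the paper; the hypervolume part of the hybrid scheme is not shown.

```python
from typing import Optional, Sequence


def augmented_chebyshev(
    losses: Sequence[float],
    weights: Sequence[float],
    ideal: Optional[Sequence[float]] = None,
    rho: float = 0.05,
) -> float:
    """Generic augmented Chebyshev scalarization (not the paper's exact loss).

    Returns max_i w_i * |l_i - z_i| + rho * sum_i w_i * |l_i - z_i|,
    where z is a reference ("ideal") point, assumed to be all zeros by default.
    """
    if ideal is None:
        ideal = [0.0] * len(losses)
    if not (len(losses) == len(weights) == len(ideal)):
        raise ValueError("losses, weights and ideal must have the same length")
    # Weighted distance of each task loss from its reference value.
    gaps = [w * abs(l - z) for l, w, z in zip(losses, weights, ideal)]
    # The max term focuses on the worst task; the sum term keeps the others in play.
    return max(gaps) + rho * sum(gaps)


# Hypothetical per-task losses for the five signals listed in the abstract.
task_losses = {
    "depth": 0.40,
    "normals": 0.25,
    "semantic": 0.90,
    "reflectance": 0.30,
    "shading": 0.20,
}
weights = [1.0] * len(task_losses)
total = augmented_chebyshev(list(task_losses.values()), weights)
```

With equal weights, the scalarized value is dominated by the task with the largest loss (here the hypothetical semantic loss), while the `rho` term adds a small contribution from every other task.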
Original language | English |
---|---|
Title of host publication | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW |
Publisher | IEEE Computer Society |
Pages | 1311-1321 |
Number of pages | 11 |
ISBN (Electronic) | 9798350365474 |
DOIs | |
Publication status | Published - 18 Jun 2024 |
Event | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States; Duration: 16 Jun 2024 → 22 Jun 2024 |
Publication series
Name | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops |
---|
Conference
Conference | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 |
---|---|
Country/Territory | United States |
City | Seattle |
Period | 16/06/24 → 22/06/24 |
Keywords
- dense estimation
- indoor environments
- multi-task learning
- panoramic images
Publication title
MultiPanoWise: Holistic deep architecture for multi-task dense prediction from a single panoramic image
Projects
- 1 Active
EX-QNRF-NPRPS-16: AIN2: Artificial Intelligence for Indoor Digital Twins
Agus, M. (Lead Principal Investigator), Tukur, M. (Graduate Student), Engineer-1 (Engineer), Assistant-1, R. (Research Assistant), Gobbetti, D. E. (Principal Investigator), Fetais, D. N. (Principal Investigator), Tukur, M. M. (Research Assistant), Jashari, S. (Graduate Student) & Atiyah, M. N. (Consultant)
15/11/22 → 15/11/25
Project: Basic Research