Multi-task View Synthesis with Neural Radiance Fields
Abstract
Multi-task visual learning is a critical aspect of computer vision. Current research, however, predominantly concentrates on the multi-task dense prediction setting, which overlooks the intrinsic 3D world and its multi-view consistent structures, and lacks the capability for versatile imagination. In response to these limitations, we present a novel problem setting -- multi-task view synthesis (MTVS), which reinterprets multi-task prediction as a set of novel-view synthesis tasks for multiple scene properties, including RGB. To tackle the MTVS problem, we propose MuvieNeRF, a framework that incorporates both multi-task and cross-view knowledge to simultaneously synthesize multiple scene properties. MuvieNeRF integrates two key modules, the Cross-Task Attention (CTA) and Cross-View Attention (CVA) modules, enabling the efficient use of information across multiple views and tasks. Extensive evaluation on both synthetic and real-world benchmarks demonstrates that MuvieNeRF is capable of simultaneously synthesizing different scene properties with promising visual quality, even outperforming conventional discriminative models in various settings. Notably, we show that MuvieNeRF exhibits universal applicability across a range of NeRF backbones. Our code is available at https://github.com/zsh2000/MuvieNeRF.
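The abstract names two attention modules, CTA (attention across task-specific features) and CVA (attention across source-view features). Below is a minimal PyTorch sketch of this idea, assuming standard multi-head attention over per-point task tokens and view tokens; the class names, feature shapes, and mean-pooled query are illustrative assumptions, not the paper's exact design (see the linked repository for the actual implementation).

```python
import torch
import torch.nn as nn

class CrossTaskAttention(nn.Module):
    """Sketch of CTA: each task's feature token attends to all other tasks'
    tokens at the same sample point. Shapes/names are hypothetical."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):            # x: (N, T, D) -- N points, T tasks, D channels
        h, _ = self.attn(x, x, x)    # share knowledge across task tokens
        return self.norm(x + h)      # residual connection + layer norm

class CrossViewAttention(nn.Module):
    """Sketch of CVA: a per-point query aggregates features sampled
    from V source views. Shapes/names are hypothetical."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q, views):     # q: (N, 1, D); views: (N, V, D)
        h, _ = self.attn(q, views, views)  # fuse multi-view evidence
        return self.norm(q + h)

# Toy usage: 1024 ray samples, 5 tasks (RGB + 4 scene properties), 3 source views
N, T, V, D = 1024, 5, 3, 64
task_feats = torch.randn(N, T, D)
view_feats = torch.randn(N, V, D)
task_feats = CrossTaskAttention(D)(task_feats)    # cross-task knowledge sharing
query = task_feats.mean(dim=1, keepdim=True)      # pooled per-point query (assumption)
fused = CrossViewAttention(D)(query, view_feats)  # cross-view aggregation
print(fused.shape)                                # torch.Size([1024, 1, 64])
```

The key design point the abstract implies is that the two attention axes are complementary: CTA propagates information between scene properties (e.g., geometry cues helping surface normals), while CVA enforces consistency with observed source views.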