NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

¹University of California, Berkeley, ²Meta AI

NeRF-Det aims to detect 3D objects with only RGB images as input. To enhance detection, we embed a NeRF branch that is designed to work in synergy with the detection branch. The two branches share the same geometry representation and are trained jointly end-to-end, which helps achieve state-of-the-art accuracy on multi-view indoor RGB-only 3D detection and additionally enables generalizable novel view synthesis on new scenes without per-scene optimization.


NeRF-Det is a novel method for 3D detection with posed RGB images as input. Our method makes novel use of NeRF in an end-to-end manner to explicitly estimate 3D geometry, thereby improving 3D detection performance. Specifically, to avoid the significant extra latency associated with per-scene optimization of NeRF, we introduce sufficient geometry priors to enhance the generalizability of the NeRF-MLP. We connect the detection and NeRF branches through a shared MLP, enabling an efficient adaptation of NeRF to detection and yielding geometry-aware volumetric representations for 3D detection. As a result of this joint-training design, NeRF-Det generalizes well to unseen scenes for object detection, view synthesis, and depth estimation without per-scene optimization.
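To make the link between estimated geometry and depth estimation concrete, here is a minimal sketch of the standard NeRF volume-rendering quadrature for the expected depth along one ray. It is not the NeRF-Det implementation: the function name `render_depth`, the sample spacing, and the toy density values are assumptions made for illustration, and in NeRF-Det the densities would come from the learned MLP rather than being set by hand.

```python
import numpy as np

def render_depth(density, t_vals):
    """Expected depth along one ray via standard NeRF quadrature.

    density: (N,) non-negative densities at the ray samples (hypothetical values here).
    t_vals:  (N,) increasing distances of the samples from the camera.
    Returns (depth, weights).
    """
    # Distance between consecutive samples; the last interval is treated as very large.
    deltas = np.diff(t_vals, append=t_vals[-1] + 1e10)
    # Opacity of each interval.
    alpha = 1.0 - np.exp(-density * deltas)
    # Transmittance: probability that the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    depth = np.sum(weights * t_vals)
    return depth, weights

# Toy ray: empty space except for one very dense sample at distance 2.0.
t_vals = np.linspace(0.5, 4.0, 8)
density = np.zeros(8)
density[3] = 1e3
depth, weights = render_depth(density, t_vals)
```

In this toy case nearly all of the rendering weight falls on the dense sample, so the rendered depth coincides with that sample's distance.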


[Figure: method overview]
Our method leverages NeRF to learn scene geometry by estimating opacity grids. With the shared geometry-MLP (G-MLP), the detection branch can benefit from NeRF in estimating opacity fields and is thus able to mask out free space and mitigate the ambiguity of the feature volume.
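The masking idea in the caption above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes a per-voxel density is already available (in NeRF-Det it would be predicted by the G-MLP, which is not modeled here), converts it to opacity with the common `1 - exp(-sigma * step)` rule, and scales the feature volume by that opacity. The names `density_to_opacity`, `mask_feature_volume`, and `step`, as well as the grid sizes, are hypothetical.

```python
import numpy as np

def density_to_opacity(density, step=1.0):
    """Map non-negative density to opacity in [0, 1): alpha = 1 - exp(-sigma * step)."""
    return 1.0 - np.exp(-np.maximum(density, 0.0) * step)

def mask_feature_volume(features, density, step=1.0):
    """Scale each voxel's feature vector by its opacity.

    features: (X, Y, Z, C) feature volume.
    density:  (X, Y, Z) per-voxel density (assumed given; hypothetical here).
    """
    opacity = density_to_opacity(density, step)
    # Broadcast the opacity over the channel axis.
    return features * opacity[..., None]

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 4, 8))
density = np.zeros((4, 4, 4))
density[1:3, 1:3, 1:3] = 50.0  # hypothetical occupied region in the middle of the grid
weighted = mask_feature_volume(features, density)
```

Voxels with zero density end up with all-zero features, while voxels in the dense region keep their features essentially unchanged, which is the sense in which free space is masked out before detection.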

3D Detection Results

[Figure: 3D detection results (predicted bounding boxes)]

Novel-view Synthesis Results

[Figure: novel-view synthesis results]


  title={NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection},
  author={Xu, Chenfeng and Wu, Bichen and Hou, Ji and Tsai, Sam and Li, Ruilong and Wang, Jialiang and Zhan, Wei and He, Zijian and Vajda, Peter and Keutzer, Kurt and Tomizuka, Masayoshi},