Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction

University of Toronto     Vector Institute
NeurIPS 2023 (Spotlight)

Transient NeRF renders novel lidar scans from just a few sparse input measurements.

Novel-view lidar renders from Transient NeRF trained with at most five views.



Neural radiance fields (NeRFs) have become a ubiquitous tool for modeling scene appearance and geometry from multiview imagery. Recent work has also begun to explore how to use additional supervision from lidar or depth sensor measurements in the NeRF framework. However, previous lidar-supervised NeRFs focus on rendering conventional camera imagery and use lidar-derived point cloud data as auxiliary supervision; thus, they fail to incorporate the underlying image formation model of the lidar. Here, we propose a novel method for rendering transient NeRFs that take as input the raw, time-resolved photon count histograms measured by a single-photon lidar system, and we seek to render such histograms from novel views. Different from conventional NeRFs, the approach relies on a time-resolved version of the volume rendering equation to render the lidar measurements and capture transient light transport phenomena at picosecond timescales. We evaluate our method on a first-of-its-kind dataset of simulated and captured transient multiview scans from a prototype single-photon lidar. Overall, our work brings NeRFs to a new dimension of imaging at transient timescales, newly enabling rendering of transient imagery from novel views. Additionally, we show that our approach recovers improved geometry and conventional appearance compared to point cloud-based supervision when training on few input viewpoints. Transient NeRFs may be especially useful for applications that seek to simulate raw lidar measurements for downstream tasks in autonomous driving, robotics, and remote sensing.


Transient NeRF uses an efficient neural representation with a hashing-based feature grid to parameterize the radiance and density of a scene. To synthesize a novel lidar view we query the neural representation at points along a ray corresponding to a pixel coordinate in the lidar image. We then use these network outputs alongside our rendering equation, which produces time-resolved measurements of light transport in the scene. Our rendering equation models lidar image formation, including the shape of the laser pulse and radiometric falloff effects, and we supervise on time-resolved photon count histograms from multiview lidar scans.
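The time-resolved rendering step above can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the paper's implementation: the function name, the Gaussian pulse model, the inverse-square falloff exponent, and the bin sizes are all assumptions made for the example; in the actual method, density and radiance come from the hash-grid network queried along each lidar ray.

```python
import numpy as np

def render_transient(sigma, radiance, z, n_bins=256, bin_width=0.02,
                     pulse_sigma=0.02, c=1.0):
    """Sketch of time-resolved volume rendering for a single lidar ray.

    sigma, radiance: density and radiance sampled at depths z along the ray
    (stand-ins for the outputs of the neural representation).
    Returns a photon-count histogram over n_bins time bins.
    """
    # Standard quadrature of the volume rendering equation:
    # per-sample opacity and accumulated transmittance.
    deltas = np.diff(z, append=z[-1] + (z[-1] - z[-2]))
    alpha = 1.0 - np.exp(-sigma * deltas)
    T = np.cumprod(np.concatenate([[1.0], np.exp(-sigma * deltas)]))[:-1]
    weights = T * alpha

    # Radiometric falloff: assume returned signal attenuates as 1/z^2.
    contrib = weights * radiance / np.maximum(z, 1e-6) ** 2

    # Photons from depth z arrive at round-trip time 2z/c; model the laser
    # pulse shape by spreading each contribution with a Gaussian over bins.
    t_bins = (np.arange(n_bins) + 0.5) * bin_width
    t_arrival = 2.0 * z / c
    pulse = np.exp(-0.5 * ((t_bins[None, :] - t_arrival[:, None]) / pulse_sigma) ** 2)
    pulse /= pulse.sum(axis=1, keepdims=True) + 1e-12
    return (contrib[:, None] * pulse).sum(axis=0)
```

Supervision then compares such rendered histograms against the measured photon-count histograms at each scanned pixel.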



Apart from results on simulated data, we test our method on a dataset we capture using our prototype single-photon lidar system.

Hardware prototype

The hardware system contains a picosecond laser that shares an optical path with a single-photon avalanche diode. We scan the scene using two-axis scanning mirrors. The scenes are mounted on a rotation stage to facilitate multiview scanning.

Captured dataset

The dataset consists of 6 scenes captured from 20 different viewpoints each using the prototype single-photon lidar.



By integrating the transient measurements over time, we can recover a conventional image similar to other NeRF methods such as DS-NeRF or Urban-NeRF. Our method recovers relatively high-quality models of appearance and geometry after training on as few as two input views that observe the scene from opposite sides.
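The time-integration step is straightforward: summing each pixel's photon-count histogram over the time axis yields a conventional intensity image, and the peak arrival time gives a depth estimate. A small sketch with hypothetical array shapes and bin parameters (not values from the paper):

```python
import numpy as np

# Hypothetical transient cube: H x W pixels, each with an n_bins
# photon-count histogram (e.g., rendered for every lidar pixel).
rng = np.random.default_rng(0)
transient = rng.random((32, 32, 256))

# Summing over the time dimension collapses the transient into a
# conventional intensity image, as a standard camera would integrate.
intensity = transient.sum(axis=-1)

# Depth can similarly be read off from the peak photon-arrival time,
# converting round-trip time back to distance (bin_width and c assumed).
bin_width, c = 0.02, 1.0
depth = (transient.argmax(axis=-1) + 0.5) * bin_width * c / 2.0
```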

Simulated results

Captured results


@article{malik2023transient,
  title = {Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction},
  author = {Anagh Malik and Parsa Mirdehghan and Sotiris Nousias and Kiriakos N. Kutulakos and David B. Lindell},
  journal = {NeurIPS},
  year = {2023}
}