Single-Photon 3D Imaging

A conventional camera sensor needs hundreds of photons per pixel to form an image. A single-photon sensor, on the other hand, is so sensitive to incident light that it can capture individual photons and time-tag them with picosecond resolution. This high-resolution time dimension provides a rich source of information that is not available to conventional cameras; for example, it can enable long-range 3D imaging with laser-scan quality. In this work we use an emerging single-photon sensor technology called the single-photon avalanche diode (SPAD). Because of the peculiar image formation model of SPADs, extreme ambient light incident on a SPAD-based 3D camera causes severe distortions (photon pileup) that lead to large depth errors. We address the following basic question: what is the optimal acquisition scheme for a SPAD-based 3D camera that minimizes depth errors when operating in high ambient light? In this line of work we present asynchronous acquisition schemes that mitigate pileup during data acquisition itself. Asynchronous acquisition temporally misaligns the SPAD measurements with respect to the laser, which averages out the effect of pileup. We also propose optimal optical attenuation as a method for reducing pileup distortions while maintaining high SNR. Our simulations and experiments demonstrate an improvement in depth accuracy of up to an order of magnitude over the state of the art across a wide range of imaging scenarios, including those with high ambient flux.


Asynchronous Single-Photon 3D Imaging

Anant Gupta*, Atul Ingle*, Mohit Gupta

Proc. ICCV 2019

Marr Prize (best paper) honorable mention

oral presentation

Photon-Flooded Single-Photon 3D Cameras

Anant Gupta, Atul Ingle, A Velten, Mohit Gupta

Proc. CVPR 2019

oral presentation

High Flux Passive Imaging with Single-Photon Sensors

Atul Ingle, A Velten, Mohit Gupta

Proc. CVPR 2019

oral presentation

Imaging Model and Photon Pileup

Histogram formation and effect of ambient light: A single-photon 3D camera forms a histogram of the first photon arrival times over many laser cycles. In the absence of ambient light, the peak of this histogram corresponds to the true depth. Under strong ambient light, however, the histogram is distorted by early-arriving ambient photons, and the true signal peak gets buried in the exponentially decaying tail. For an interactive tool explaining histogram formation, visit:

Proposed Strategies: Optimal Attenuation and Asynchronous Acquisition

We propose two acquisition strategies to deal with photon pileup. The first strategy optimally attenuates a fraction of the total light incident on the SPAD sensor. The second strategy, asynchronous acquisition, temporally staggers the SPAD measurement windows with respect to the laser to average out the effect of photon pileup. When used in combination, these strategies enable us to mitigate photon pileup during acquisition itself and reliably estimate scene depths even in high ambient light. You can find interactive tools here: and

Simulated 3D Reconstruction: Optimal Attenuation

Reconstructions for a castle scene: With insufficient attenuation, scene points closer to the camera are recovered correctly, but points farther away are lost due to ambient light. With extreme attenuation, points at all depths are recovered albeit with poor accuracy. Our optimal attenuation criterion provides accurate reconstruction over the whole depth range.
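
The trade-off described above can be illustrated with a small numerical sketch. It is not the optimality criterion from the papers. It only assumes Poisson photon arrivals, a constant ambient flux per time bin, and a simple proxy objective: the probability that the first detected photon of a laser cycle lands in the signal bin. The function names and the flux values are hypothetical and chosen for illustration.

```python
import math

def signal_bin_probability(attenuation, ambient_per_bin, signal, signal_bin):
    """P(first detected photon lands in the signal bin) when all incident
    light is scaled by `attenuation` (1.0 means no attenuation)."""
    # No ambient photon may be detected in the bins before the signal bin.
    survive = math.exp(-attenuation * ambient_per_bin * signal_bin)
    # At least one photon must be detected in the signal bin itself.
    detect = 1.0 - math.exp(-attenuation * (ambient_per_bin + signal))
    return survive * detect

def best_attenuation(ambient_per_bin, signal, signal_bin, steps=10000):
    """Grid search over attenuation factors in (0, 1] for the proxy objective."""
    grid = [(k + 1) / steps for k in range(steps)]
    return max(grid, key=lambda a: signal_bin_probability(
        a, ambient_per_bin, signal, signal_bin))

# Assumed illustrative values: mean photons per bin, and the target's bin index.
AMBIENT, SIGNAL, SIGNAL_BIN = 0.02, 0.5, 600

best = best_attenuation(AMBIENT, SIGNAL, SIGNAL_BIN)
p_none = signal_bin_probability(1.0, AMBIENT, SIGNAL, SIGNAL_BIN)
p_best = signal_bin_probability(best, AMBIENT, SIGNAL, SIGNAL_BIN)
p_extreme = signal_bin_probability(0.001, AMBIENT, SIGNAL, SIGNAL_BIN)

for label, p in (("no attenuation", p_none),
                 ("proxy optimum", p_best),
                 ("extreme attenuation", p_extreme)):
    print(f"{label}: {p:.3e}")
```

Under these assumptions, both too little and too much attenuation reduce the chance of recording the signal photon: the first because ambient photons block the SPAD before the signal arrives, the second because the signal itself is starved.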

Experimental Result: Optical Attenuation

Taj Mahal scene: The average ambient light in this scene was 15000 lux. The proposed optimal attenuation method provides a 6x improvement in depth accuracy compared to the current state-of-the-art method of extreme attenuation.

Experimental Result: Asynchronous Acquisition

Porcelain Face scene: A white "Porcelain Face" vase was illuminated with high ambient light >20,000 lux and scanned with a low-power laser at an SBR of 0.02. The proposed asynchronous acquisition scheme achieves considerably higher depth quality as compared to conventional synchronous methods.

Experimental Result: Effect of Scene Albedos

Asynchronous acquisition is self-adapting: The black vase in this “Vases” scene has 1/10th the reflectivity of the white vase. The optimal attenuation fraction must be adjusted individually for each vase (higher attenuation is needed for the white vase). Asynchronous acquisition automatically adapts to different albedos.


Presentation Slides


What are single-photon sensors? What is a SPAD?

Single-photon sensors are an exciting new sensor technology with the unique ability to capture individual photons of light with extremely high timing resolution. This extreme sensitivity and time resolution make single-photon sensors ideal candidates for low-power, long-range 3D cameras.

The specific type of single-photon sensor used in this work is called a single-photon avalanche diode (SPAD). SPADs have an advantage over other single-photon sensors because they can be manufactured at scale using mainstream photolithography techniques used in today’s CMOS fabrication lines.

What is a 3D camera? What is a LiDAR?

A conventional camera (like the one in your smartphone) captures two-dimensional color intensity images of the scene. A 3D camera, on the other hand, provides three-dimensional distance information about different parts of the scene. A “picture” captured by a 3D camera is called a depth map. Pixels in a depth map encode the distance from the 3D camera to different points in the scene.

A LiDAR is a specific type of 3D camera that uses the travel time of a light pulse to and from each point in the scene to estimate depths.

How do 3D cameras work?

Broadly, there are two classes of 3D cameras: those that use structured illumination and those that use the principle of time-of-flight.

A structured light 3D camera projects a light pattern on the scene and measures the distortions in this pattern to estimate depths.

A time-of-flight 3D camera measures the round-trip travel time of a light pulse for each scene point. Since the speed of light is known, this travel time can be converted to distance using the simple relationship distance = speed x time; because the pulse travels to the scene point and back, the depth is half of the round-trip distance. Conventional LiDAR systems use the time-of-flight principle to capture scene depths.
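
As a minimal sketch of this conversion, the snippet below turns a round-trip time into a depth and shows how the width of a timing bin translates into depth resolution. The function names and the example values (a 10 ns round trip, a 100 ps time bin) are assumptions made for illustration, and the speed of light is rounded to 3 x 10^8 m/s.

```python
SPEED_OF_LIGHT = 3e8  # m/s, rounded

def round_trip_time_to_depth(round_trip_time_s):
    """Depth of a scene point from the round-trip travel time of a pulse.

    The pulse covers the camera-to-scene distance twice, hence the factor 1/2.
    """
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0

def depth_resolution(time_bin_width_s):
    """Depth interval that corresponds to one timing bin."""
    return SPEED_OF_LIGHT * time_bin_width_s / 2.0

depth_m = round_trip_time_to_depth(10e-9)   # 10 ns round trip
resolution_m = depth_resolution(100e-12)    # 100 ps timing bin

print(f"depth: {depth_m:.3f} m, resolution: {resolution_m * 100:.2f} cm")
```

With these numbers, a 10 ns round trip corresponds to a depth of about 1.5 m, and a 100 ps timing bin to a depth step of about 1.5 cm.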

Choosing a 3D camera technology for a specific application is a complex decision involving many practical design parameters. Two important parameters are maximum depth range and depth resolution. A comparison of different 3D imaging modalities based on the range-resolution trade-off is shown in the image at the top of this webpage.

How is a SPAD-based 3D camera different from conventional LiDARs?

Conventional LiDAR systems use optical sensors that are less sensitive to light than SPADs. They rely on an analog-to-digital conversion step that adds noise (known as read noise), so they require hundreds of photons to produce a discernible output. In contrast, a SPAD sensor can capture individual photons: it produces a precisely timed digital pulse for every photon detected. This extreme sensitivity, coupled with picosecond-scale time resolution, makes SPADs attractive candidates for long-range, high-resolution 3D imaging applications.

How does a single-photon 3D camera give depth information?

A single-photon 3D camera works on the principle of time-of-flight. A pulsed laser transmits a short pulse of light towards the scene point. This light bounces off the scene and is captured by the detector. Since the speed of light is known (3 x 10^8 m/s), the round-trip time-delay can be used to compute the distance of the scene point. A SPAD-based 3D camera typically uses several laser pulses for a fixed scene point and builds a histogram of the times-of-arrival of the first returning photons. The peak of this histogram is used to estimate the scene depth.
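
The histogram formation described here can be sketched with a simple model. The snippet assumes Poisson photon arrivals and a SPAD that records only the first photon in each laser cycle; the function names, bin count, and flux values are hypothetical. It computes the expected shape of the histogram (the probability that the first detected photon falls in each bin) with and without ambient light.

```python
import math

def first_photon_probabilities(flux_per_bin):
    """Probability that the first detected photon of a cycle lands in each bin.

    flux_per_bin[i] is the mean number of detected photons (signal plus
    ambient) in time bin i, assuming Poisson arrivals.
    """
    probs = []
    survive = 1.0  # probability that no photon was detected in earlier bins
    for rate in flux_per_bin:
        probs.append(survive * (1.0 - math.exp(-rate)))
        survive *= math.exp(-rate)
    return probs

def make_flux(num_bins, signal_bin, signal, ambient):
    """Constant ambient flux plus a one-bin laser return at signal_bin."""
    flux = [ambient] * num_bins
    flux[signal_bin] += signal
    return flux

NUM_BINS, SIGNAL_BIN = 1000, 600  # assumed illustrative values

dark = first_photon_probabilities(make_flux(NUM_BINS, SIGNAL_BIN, 0.5, 0.0))
bright = first_photon_probabilities(make_flux(NUM_BINS, SIGNAL_BIN, 0.5, 0.02))

peak_dark = max(range(NUM_BINS), key=dark.__getitem__)
peak_bright = max(range(NUM_BINS), key=bright.__getitem__)

print("peak bin without ambient light:", peak_dark)
print("peak bin with strong ambient light:", peak_bright)
```

In this model the histogram peak sits at the true signal bin when there is no ambient light, while strong ambient light moves the peak to the earliest bins.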

To learn more, visit this interactive tool on SPAD histogram formation (

What are the noise characteristics of SPADs?

SPADs, unlike conventional analog optical sensors, do not suffer from read noise. They can measure the time-of-arrival of the first returning photon with extremely high accuracy. The main source of noise is due to the randomness of photon arrivals themselves, called shot noise, but this affects all optical image sensors, especially when operating in low light conditions.

Since SPADs only measure the first returning photon in each laser pulse, their measurements can become biased, especially under bright light, an effect known as photon pileup.

What is photon pileup?

In the absence of ambient light, the histogram captured by a SPAD-based 3D camera has a single peak corresponding to the true scene depth. However, when operating in high ambient light (e.g., on a bright sunny day), this histogram gets heavily skewed towards earlier time bins. This distortion, called photon pileup, occurs because, with overwhelmingly high probability, the SPAD captures an ambient light photon almost immediately after each laser pulse is transmitted. The true signal peak gets buried in the tail of this pileup distortion, resulting in large depth errors.
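
To get a rough sense of how severe this effect can be, the sketch below estimates how many laser cycles are needed before a given number of detections accumulate in the signal bin. It assumes Poisson arrivals, a constant ambient flux per bin, and first-photon-only detection; the function names and flux values are hypothetical.

```python
import math

def signal_bin_detection_probability(ambient_per_bin, signal, signal_bin):
    """Probability that a single laser cycle records a photon in the signal bin."""
    # The SPAD must not be triggered by ambient light in any earlier bin.
    survive = math.exp(-ambient_per_bin * signal_bin)
    # At least one photon must then be detected in the signal bin.
    detect = 1.0 - math.exp(-(ambient_per_bin + signal))
    return survive * detect

def cycles_needed(target_detections, ambient_per_bin, signal, signal_bin):
    """Expected number of laser cycles to collect target_detections."""
    p = signal_bin_detection_probability(ambient_per_bin, signal, signal_bin)
    return target_detections / p

# Assumed values: 0.5 signal photons per cycle, target located at bin 600.
dark_cycles = cycles_needed(100, 0.0, 0.5, 600)
bright_cycles = cycles_needed(100, 0.02, 0.5, 600)

print(f"cycles needed without ambient light: {dark_cycles:.0f}")
print(f"cycles needed with ambient light:    {bright_cycles:.3g}")
```

Under these assumptions, even a modest ambient flux per bin makes signal-bin detections several orders of magnitude rarer, because the exponential survival term shrinks with the depth of the target.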

Can single-photon 3D cameras operate outdoors in bright sunlight?

In the absence of ambient light, a SPAD-based 3D camera can capture the true depth with very high depth resolution. However, when operating outdoors in bright sunlight, these cameras suffer from photon pileup, which causes large depth errors. The family of acquisition techniques proposed here paves the way for using SPAD sensors in long-range 3D cameras that can operate successfully in bright ambient light.

How do you solve the problem of photon pileup?

We propose a family of optimal acquisition schemes to mitigate photon pileup in data acquisition itself. Our acquisition schemes rely on two techniques. First, we optimally attenuate a fraction of the total incident light to minimize pileup distortion. Second, we allow the SPAD sensor’s acquisition windows to shift arbitrarily with respect to the laser pulse, thereby “averaging out” the effect of pileup uniformly over the entire depth range.
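
The "averaging out" idea can be sketched with a simplified model. The snippet below assumes Poisson arrivals, a periodic laser, and a SPAD window that is exactly one laser period long and whose start is delayed by a whole number of bins. Uniform shifting over all delays is just one simple choice, and the function names and flux values are hypothetical; this is an illustration, not the acquisition schemes from the papers.

```python
import math

def first_photon_probabilities(flux_per_bin):
    """Probability that the first detected photon of a window lands in each bin."""
    probs, survive = [], 1.0
    for rate in flux_per_bin:
        probs.append(survive * (1.0 - math.exp(-rate)))
        survive *= math.exp(-rate)
    return probs

def averaged_histogram(flux_per_bin, shifts):
    """Expected histogram in the laser's time frame, averaged over window shifts.

    Each shift s means the SPAD window starts at laser bin s and wraps around
    into the next laser period (the incident flux is periodic).
    """
    n = len(flux_per_bin)
    total = [0.0] * n
    for s in shifts:
        window_flux = flux_per_bin[s:] + flux_per_bin[:s]
        probs = first_photon_probabilities(window_flux)
        for k, p in enumerate(probs):
            total[(k + s) % n] += p / len(shifts)  # map back to the laser frame
    return total

NUM_BINS, SIGNAL_BIN = 200, 150  # assumed illustrative values
flux = [0.05] * NUM_BINS         # strong ambient light in every bin
flux[SIGNAL_BIN] += 0.5          # laser return from the scene point

synchronous = averaged_histogram(flux, shifts=[0])
asynchronous = averaged_histogram(flux, shifts=list(range(NUM_BINS)))

peak_sync = max(range(NUM_BINS), key=synchronous.__getitem__)
peak_async = max(range(NUM_BINS), key=asynchronous.__getitem__)

print("synchronous peak bin:", peak_sync)
print("asynchronous peak bin:", peak_async)
```

In this model, the synchronous histogram peaks at the first bin because of pileup, whereas the shift-averaged histogram peaks at the signal bin, since every bin spends equal time at the early, pileup-free positions of the window.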

To learn more, visit these interactive tools on optimal attenuation ( and asynchronous acquisition (

What is range-gating and how is your approach similar to or different from range-gating?

Range gating is a widely used noise suppression technique in time-of-flight 3D cameras. A range gate is an electronic circuit that allows the sensor to collect only those photons that arrive in a pre-specified time window. Our approach is similar to range gating in that we use the notion of fixed acquisition windows for the detector. However, our method differs from gating because we do not make any prior assumptions about the true depth of the target. Our technique can be thought of as a generalization of range gating. The key ideas are to (a) optimally choose the range gate width for each scene pixel, and (b) shift the range gate through the entire depth range. This way we average out the effect of ambient light and improve the chances of capturing the true signal photons from targets that may be located anywhere in the full depth range of the system.

What applications does this technology enable?

LiDARs for autonomous vehicles; low-cost and high-speed 3D cameras for extreme robotics and drones; low-power 3D cameras for mobile and embedded applications; high resolution airborne surveying LiDARs

