SpeDo: 6 DOF Ego-Motion Sensor Using Speckle Defocus Imaging
Proc. ICCV 2015
Sensors that measure their motion with respect to the surrounding environment (ego-motion sensors) can be broadly classified into two categories. The first is inertial sensors, such as accelerometers. In order to estimate position and velocity, these sensors integrate the measured acceleration, which often results in the accumulation of large errors over time. The second category is camera-based approaches, such as SLAM, that can measure position directly, but their performance depends on the properties of the surrounding scene. These approaches cannot function reliably if the scene has low-frequency textures or small depth variations.
We present a novel ego-motion sensor called SpeDo that addresses the above fundamental limitations. SpeDo is based on using coherent light sources and cameras with large defocus. Coherent light, upon interacting with a scene, creates a high-frequency interferometric pattern in the captured images, called “speckle.” We develop a theoretical model for speckle flow (motion of speckle as a function of sensor motion), and show that it is quasi-invariant to scene properties. As a result, SpeDo can measure ego-motion (not the derivative of motion) simply by estimating optical flow at a few image locations. We have built a low-cost and compact hardware prototype of SpeDo and demonstrated high-precision, 6 DOF ego-motion estimation for complex trajectories in scenarios where the scene properties are challenging (e.g., repeating texture or no texture).
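As an illustration of what "estimating optical flow at a few image locations" can look like, the following is a minimal sketch that recovers an integer pixel shift of a random, speckle-like patch by exhaustive block matching. It is not the estimator used in the paper; the function names (`make_pattern`, `shift_image`, `estimate_flow`), the synthetic random pattern, and the sum-of-squared-differences cost are all assumptions made for this example.

```python
import random

def make_pattern(h, w, seed=0):
    """Synthetic high-frequency pattern standing in for speckle (an assumption)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(w)] for _ in range(h)]

def shift_image(img, dx, dy):
    """Circularly shift img by dx columns and dy rows."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]

def estimate_flow(prev, curr, cx, cy, half=4, search=3):
    """Return the integer (dx, dy) that minimises the sum of squared
    differences between the patch around (cx, cy) in prev and the
    displaced patch in curr."""
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = 0.0
            for y in range(cy - half, cy + half + 1):
                for x in range(cx - half, cx + half + 1):
                    d = prev[y][x] - curr[y + dy][x + dx]
                    cost += d * d
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best

# Hypothetical usage: a 32x32 pattern displaced by 2 pixels right and 1 pixel up.
pattern = make_pattern(32, 32)
moved = shift_image(pattern, 2, -1)
flow = estimate_flow(pattern, moved, cx=16, cy=16)
```

Because the pattern is high-frequency, a small patch is enough to make the match unambiguous, which is the intuition behind measuring flow at only a few locations.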
A surface is illuminated by a coherent light source such as a laser. This creates speckle, a high-frequency intensity distribution in 3D space due to interference of light. The surface is imaged by a camera with large defocus (the camera's focus plane is distant from the scene). The intensity I captured by the camera pixel is the same as the speckle intensity at its conjugate point F on the focus plane.
We simulate the speckle flow field for different camera motions for both front and back focus settings. We assume that the lens has a long focal length. The flow fields can be divided into four categories: horizontal flow, vertical flow, zoom (in or out), and in-plane rotation. The flow due to camera translation has opposite directions for front and back focus. In contrast, flow due to camera rotation is in the same direction for front and back focus. This is an important property that we use to distinguish camera rotation from translation.
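The sign property described in this caption suggests a simple way to separate the two motion types. The sketch below assumes an idealised, symmetric case in which the translation-induced flow has equal magnitude and opposite sign in the front-focus and back-focus cameras, while the rotation-induced flow is identical in both. The caption states only the directions, so the equal-magnitude assumption and the function name `split_flow` are illustrative and not taken from the paper.

```python
def split_flow(flow_front, flow_back):
    """Split a pair of flow vectors measured at the same image location
    into a common part (same sign in both cameras, rotation-like) and a
    differential part (opposite sign, translation-like).

    Assumes equal-magnitude contributions in both cameras (idealised)."""
    fx, fy = flow_front
    bx, by = flow_back
    common = ((fx + bx) / 2, (fy + by) / 2)        # rotation-like component
    differential = ((fx - bx) / 2, (fy - by) / 2)  # translation-like component
    return common, differential

# Hypothetical usage: rotation flow (1.0, 0.5) plus translation flow (0.25, -0.75).
front = (1.0 + 0.25, 0.5 - 0.75)  # translation adds in the front-focus camera
back = (1.0 - 0.25, 0.5 + 0.75)   # and subtracts in the back-focus camera
rotation_part, translation_part = split_flow(front, back)
```

In a real system the two contributions would be scaled by the optics and focus distances of each camera, so the averaging step would be replaced by a calibrated linear solve.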
(a) Hardware prototype of the proposed SpeDo system consisting of two cameras, one with a front focus setting and the other with a back focus setting, and a laser source. (b) In order to measure the accuracy of SpeDo, we mounted the prototype on a robot arm and applied a variety of known motions to it. We used a wide range of scenes, including a flat white plane, a textured plane, and a scene consisting of a variety of objects of different shapes and textures.
Ground truth and measured trajectories for six different motions (translations and rotations along three axes). The range of the translation and rotation trajectories is 10 mm and 10 degrees, respectively. SpeDo recovers each trajectory with high accuracy. The sensitivity of estimation for translation and rotation along the z-axis is lower than for the other two axes, resulting in lower accuracy.
(a-b) We performed ego-motion estimation for a scene (a fronto-parallel plane) having various textures, including two checkerboard patterns with checkers of different sizes and a poster with several images. Insets show captured images. Due to the strong defocus, the texture is almost completely blurred, making SpeDo quasi-invariant to scene texture. (c) Plots of mean error for two different trajectories (translation and rotation). SpeDo achieves a low error rate irrespective of the surface texture.
Comparison between active visual SLAM (using a v2 Kinect), an IMU, and SpeDo for a trajectory including both rotation and translation. The scene consists of a variety of objects with different scene depths and textures. An IMU measures acceleration, which must be integrated twice to estimate the sensor position. Consequently, small errors in the measured acceleration result in large position errors, even if the trajectory is relatively small. The position measurements obtained using SLAM have large errors, especially in the second half of the trajectory, where the camera views the textureless and planar portion of the scene. In contrast, SpeDo measures the camera pose with high accuracy over the entire trajectory.
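The effect of double integration on IMU error can be illustrated with a small numerical sketch. It assumes a stationary sensor whose accelerometer reports a constant bias. The bias value, sample period, and function name `integrate_position` are hypothetical choices for this example and do not come from the experiment above.

```python
def integrate_position(accels, dt):
    """Integrate acceleration samples twice (simple Euler steps) and
    return the final position, starting from rest at the origin."""
    velocity = 0.0
    position = 0.0
    for a in accels:
        velocity += a * dt         # first integration: acceleration -> velocity
        position += velocity * dt  # second integration: velocity -> position
    return position

# Hypothetical numbers: a stationary sensor with a 0.01 m/s^2 bias, sampled at 100 Hz.
bias = 0.01
dt = 0.01
drift_10s = integrate_position([bias] * 1000, dt)  # apparent displacement after 10 s
drift_20s = integrate_position([bias] * 2000, dt)  # apparent displacement after 20 s
```

A constant bias b produces a position error of roughly b·t²/2, so under these assumed numbers the apparent displacement is about 0.5 m after 10 s and about four times that after 20 s, even though the sensor never moved.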