Deep Basis Fitting for Depth Completion

Chao Qu, University of Pennsylvania


Recovering depth from a single image is a challenging task. The problem is fundamentally ill-posed, since infinitely many scene geometries could give rise to a given image. However, knowing the depths of a few pixels can significantly constrain the set of solutions. Recovering a plausible depth map from an image with sparse depth measurements is referred to as image-guided depth completion and is the focus of this thesis. We first develop a novel approach called Deep Basis Fitting (DBF), which combines the strengths of modern deep learning techniques and classical optimization algorithms and significantly improves performance. The proposed method replaces the final 1 × 1 convolutional layer used in most depth completion networks with a least-squares fitting module, which computes weights by fitting the implicit depth bases to the given sparse depth measurements. In addition, we show how our method can be naturally extended to a multi-scale formulation for improved self-supervised training. We then extend DBF within a Bayesian evidence framework to provide calibrated per-pixel variance. DBF falls short when the underlying least-squares problem is under-determined, i.e., when the number of sparse depths is smaller than the dimension of the basis. By adopting a Bayesian treatment, our Bayesian Deep Basis Fitting (BDBF) approach is able to 1) predict high-quality uncertainty estimates and 2) enable depth completion with very few or even no sparse measurements.
While many depth completion methods rely on an external 3D sensor to produce accurate sparse measurements, it is still possible, albeit much more challenging, to generate dense depth from a single camera. Structure-from-motion algorithms, such as visual odometry or visual SLAM, solve for both camera motion and scene structure, and the recovered structure can be used for depth completion. To this end, we develop a visual odometry system named Direct Sparse Odometry Lite (DSOL), which builds upon the original Direct Sparse Odometry (DSO). DSOL adopts several algorithmic and implementation enhancements that speed up computation by an order of magnitude compared to the baseline. We follow a data-oriented design philosophy and lay out data contiguously in memory, which improves cache locality and allows for easy parallelization. The increase in speed allows us to process images at higher frame rates, which in turn provides better results under rapid motion. Finally, we show that the two systems can be integrated: sparse points from the monocular visual odometry are used for depth completion, and the completed depth is in turn used to initialize odometry keyframes.
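
The fitting step described above can be illustrated with a small NumPy sketch. It is not the thesis implementation: the function names (`fit_basis_lstsq`, `fit_basis_bayes`) are hypothetical, the bases are assumed to be already flattened to an (N pixels, M bases) array rather than produced by a network, depth is fitted directly rather than in any transformed space, and the prior precision `alpha` and noise precision `beta` are fixed by hand instead of being estimated within the evidence framework.

```python
import numpy as np


def fit_basis_lstsq(basis, sparse_idx, sparse_depth):
    """Fit basis weights to sparse depths by ordinary least squares.

    basis:        (N, M) array, one row of M basis values per pixel (assumed given).
    sparse_idx:   (K,) integer indices of pixels with a depth measurement.
    sparse_depth: (K,) measured depths at those pixels.
    Returns the dense depth (N,) obtained by applying the fitted weights.
    """
    A = basis[sparse_idx]  # (K, M) rows of the basis at measured pixels
    # If K < M the system is under-determined; lstsq then returns the
    # minimum-norm solution, which is generally not a reliable depth map.
    w, *_ = np.linalg.lstsq(A, sparse_depth, rcond=None)
    return basis @ w


def fit_basis_bayes(basis, sparse_idx, sparse_depth, alpha=1.0, beta=100.0):
    """Bayesian linear regression on the same bases.

    Assumes a zero-mean Gaussian prior on the weights with precision alpha
    and Gaussian measurement noise with precision beta (both fixed here).
    Returns the predictive mean (N,) and predictive variance (N,).
    """
    A = basis[sparse_idx]
    M = basis.shape[1]
    # Posterior over weights: covariance S and mean m.
    S = np.linalg.inv(alpha * np.eye(M) + beta * A.T @ A)
    m = beta * S @ A.T @ sparse_depth
    depth = basis @ m
    # Per-pixel predictive variance: noise term plus weight uncertainty.
    var = 1.0 / beta + np.einsum("nm,mk,nk->n", basis, S, basis)
    return depth, var


# Toy data: random bases stand in for network features.
rng = np.random.default_rng(0)
basis = rng.normal(size=(200, 8))
depth_true = basis @ rng.normal(size=8)
idx = rng.choice(200, size=20, replace=False)

dense = fit_basis_lstsq(basis, idx, depth_true[idx])
few_depth, few_var = fit_basis_bayes(basis, idx[:3], depth_true[idx[:3]])
```

With 20 noise-free measurements and 8 bases, the least-squares fit recovers the toy depth map. With only 3 measurements the least-squares problem is under-determined, whereas the Bayesian version remains well defined and also reports a per-pixel variance. It even accepts an empty index array, in which case the prediction falls back to the prior.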

Subject Area

Computer science

Recommended Citation

Qu, Chao, "Deep Basis Fitting for Depth Completion" (2022). Dissertations available from ProQuest. AAI29261020.