Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals

Authors :: Seokjae Lim; Minsoo Song; Won Jun Kim
Source :: IEEE Transactions on Circuits and Systems for Video Technology. 31:4381-4393
Publication Year :: 2021
Publisher :: Institute of Electrical and Electronics Engineers (IEEE), 2021.
Abstract: With the great success of generative models based on deep neural networks, monocular depth estimation has been actively studied using various encoder-decoder architectures. However, the decoding process in most previous methods, which repeats simple up-sampling operations, may fail to fully utilize the underlying properties of well-encoded features for monocular depth estimation. To resolve this problem, we propose a simple but effective scheme that incorporates the Laplacian pyramid into the decoder architecture. Specifically, encoded features are fed into different streams for decoding depth residuals, which are defined by the decomposition of the Laplacian pyramid, and the corresponding outputs are progressively combined to reconstruct the final depth map from coarse to fine scales. This is desirable for precisely estimating depth boundaries as well as the global layout. We also propose applying weight standardization to the pre-activation convolution blocks of the decoder architecture, which improves the flow of gradients and thus makes optimization easier. Experimental results on benchmark datasets constructed under various indoor and outdoor environments demonstrate that the proposed method is effective for monocular depth estimation compared to state-of-the-art models. The code and model are publicly available at: https://github.com/tjqansthd/LapDepth-release
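
The coarse-to-fine reconstruction described in the abstract rests on the Laplacian pyramid: a depth map is split into per-scale residuals plus a coarse base, and adding the residuals back while up-sampling recovers the original. The following is a minimal NumPy sketch of that decomposition only, not the authors' network. The function names, the 2x2 average-pooling down-sampler, and the nearest-neighbour up-sampler are assumptions chosen for illustration, and in the proposed method the residuals would be predicted by decoder streams rather than computed from a known depth map.

```python
import numpy as np


def downsample(x):
    """Halve the resolution with 2x2 average pooling (illustrative choice)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double the resolution with nearest-neighbour repetition (illustrative choice)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(depth, levels):
    """Split a depth map into per-scale residuals (fine to coarse) and a coarse base."""
    residuals = []
    current = depth
    for _ in range(levels):
        coarser = downsample(current)
        # Residual = detail lost when going one scale coarser.
        residuals.append(current - upsample(coarser))
        current = coarser
    return residuals, current


def reconstruct(residuals, coarsest):
    """Rebuild the depth map from coarse to fine by adding residuals back."""
    depth = coarsest
    for residual in reversed(residuals):
        depth = upsample(depth) + residual
    return depth


# Hypothetical 16x16 depth map with values in metres.
rng = np.random.default_rng(0)
depth = rng.uniform(0.5, 10.0, size=(16, 16))

residuals, coarsest = laplacian_pyramid(depth, levels=3)
restored = reconstruct(residuals, coarsest)
```

In this sketch the coarse base carries the global layout and each residual carries the boundary detail of one scale, which mirrors the split between global layout and depth boundaries that the abstract refers to.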