Efficient Edge-Preserving Multi-View Stereo Network for Depth Estimation

Authors

  • Wanjuan Su, Huazhong University of Science and Technology
  • Wenbing Tao, Huazhong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v37i2.25330

Keywords:

CV: 3D Computer Vision

Abstract

Over the years, learning-based multi-view stereo methods have achieved great success with coarse-to-fine depth estimation frameworks. However, 3D CNN-based cost volume regularization inevitably over-smooths object boundaries because of its smoothing nature. Moreover, discrete and sparse depth hypothesis sampling makes it even harder to recover the depth of thin structures and object boundaries. To this end, we present an Efficient edge-Preserving multi-view stereo Network (EPNet) for practical depth estimation. To preserve fine details, a Hierarchical Edge-Preserving Residual learning (HEPR) module is proposed to progressively rectify upsampling errors and help refine multi-scale depth estimation. In addition, a Cross-view Photometric Consistency (CPC) is proposed to enhance the gradient flow for detailed structures, which further boosts estimation accuracy. Finally, we design a lightweight cascade framework and integrate the above two strategies into it to achieve a better trade-off between efficiency and performance. Extensive experiments show that our method achieves state-of-the-art performance with fast inference speed and low memory usage. Notably, our method ranks first on the challenging Tanks and Temples advanced dataset and the ETH3D high-res benchmark among all published learning-based methods. Code will be available at https://github.com/susuwj/EPNet.
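
The boundary over-smoothing that the abstract describes can be illustrated with a toy example. The sketch below is not the authors' code and does not implement EPNet, HEPR, or CPC. It assumes that depth is regressed as the probability-weighted mean of discrete depth hypotheses, a common choice in learning-based multi-view stereo that the abstract itself does not specify. All names and numbers are hypothetical.

```python
# Toy illustration (assumption: depth is regressed as the expectation over
# discrete depth hypotheses; this is not taken from the EPNet paper).

def expected_depth(probs, hypotheses):
    """Return the probability-weighted mean of the depth hypotheses."""
    total = sum(probs)
    return sum(p / total * d for p, d in zip(probs, hypotheses))

# Nine hypothetical depth hypotheses, uniformly spaced: 1.0, 1.5, ..., 5.0
hypotheses = [1.0 + 0.5 * i for i in range(9)]

# A pixel on an object boundary: the matching cost is ambiguous, so the
# probability mass splits between foreground (1.5) and background (4.5).
probs = [0.0] * 9
probs[1] = 0.55  # foreground surface
probs[7] = 0.45  # background surface

# The regressed depth falls between the two surfaces, where nothing exists.
regressed = expected_depth(probs, hypotheses)

# Taking the single most likely hypothesis stays on a real surface instead.
best_index = max(range(len(probs)), key=probs.__getitem__)
most_likely = hypotheses[best_index]

print(regressed, most_likely)
```

In this toy case the regressed value lies between the foreground and background depths, which is the kind of blurred boundary estimate that an edge-preserving refinement step would aim to correct.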

Published

2023-06-26

How to Cite

Su, W., & Tao, W. (2023). Efficient Edge-Preserving Multi-View Stereo Network for Depth Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 2348-2356. https://doi.org/10.1609/aaai.v37i2.25330

Section

AAAI Technical Track on Computer Vision II