Depth-guided deep filtering network for efficient single image bokeh rendering

  • Original Article
  • Published in Neural Computing and Applications

Abstract

The bokeh effect is commonly used to highlight the main subject of an image. Limited by their small sensors, smartphone cameras are less sensitive to depth information and cannot directly produce a bokeh effect as pleasing as that of digital single-lens reflex cameras. To address this problem, a depth-guided deep filtering network, called DDFN, is proposed in this study. Specifically, a focused region detection block is designed to detect salient areas, and a depth estimation block is introduced to estimate depth maps from full-focus images. Combining the depth maps and focused features, an adaptive rendering block then synthesizes the bokeh effect with adaptive cross 1-D filters. Both quantitative and qualitative evaluations on public datasets demonstrate that the proposed model performs favorably against state-of-the-art methods in terms of rendering quality while incurring lower computational cost, e.g., 24.07 dB PSNR on the EBB! dataset and 0.45 s inference time for a \(512 \times 768\) image on a Snapdragon 865 mobile processor.
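To make the rendering step concrete, below is a minimal PyTorch sketch of adaptive cross 1-D filtering as the abstract describes it: a per-pixel horizontal 1-D kernel followed by a per-pixel vertical 1-D kernel, both predicted per pixel by the network. The function name, tensor layout, and kernel size K are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of adaptive cross 1-D filtering (names and K are assumptions).
import torch
import torch.nn.functional as F

def cross_1d_filter(img: torch.Tensor,
                    h_kernels: torch.Tensor,
                    v_kernels: torch.Tensor,
                    k: int = 5) -> torch.Tensor:
    """Apply per-pixel horizontal then vertical 1-D filters.

    img:       (B, C, H, W) full-focus image.
    h_kernels: (B, K, H, W) per-pixel horizontal filter weights.
    v_kernels: (B, K, H, W) per-pixel vertical filter weights.
    """
    pad = k // 2

    # Horizontal pass: gather the K width-axis neighbours of each pixel
    # and blend them with that pixel's predicted weights.
    x = F.pad(img, (pad, pad, 0, 0), mode="replicate")
    patches = x.unfold(3, k, 1)                          # (B, C, H, W, K)
    x = (patches * h_kernels.permute(0, 2, 3, 1).unsqueeze(1)).sum(-1)

    # Vertical pass: the same blending along the height axis.
    x = F.pad(x, (0, 0, pad, pad), mode="replicate")
    patches = x.unfold(2, k, 1)                          # (B, C, H, W, K)
    x = (patches * v_kernels.permute(0, 2, 3, 1).unsqueeze(1)).sum(-1)
    return x

# Toy usage with softmax-normalised per-pixel kernels.
img = torch.rand(1, 3, 64, 64)
hk = torch.softmax(torch.rand(1, 5, 64, 64), dim=1)
vk = torch.softmax(torch.rand(1, 5, 64, 64), dim=1)
out = cross_1d_filter(img, hk, vk, k=5)                  # (1, 3, 64, 64)
```

Predicting two K-tap 1-D kernels instead of a single K-by-K 2-D kernel reduces the per-pixel weights from K-squared to 2K, which is consistent with the efficiency claim in the abstract.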


Data availability

The EBB! dataset analyzed during the current study can be downloaded using the following link: http://people.ee.ethz.ch/~ihnatova/pynet-bokeh.html#dataset.


Funding

This work was supported by the National Key R&D Program of China under Grant 2021YFA0715202 and by the National Natural Science Foundation of China under Grants 62001146 and 62271180.

Author information


Contributions

QC contributed to writing—original draft, writing—review and editing, methodology and software. BZ contributed to writing—review and editing and project administration. XZ contributed to writing—review and editing. AH contributed to writing—review and editing. YS contributed to writing—review and editing. CC contributed to writing—review and editing. CY contributed to writing—review and editing, funding acquisition and supervision. SY contributed to writing—review and editing and supervision.

Corresponding authors

Correspondence to Bolun Zheng or Xiaofei Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chen, Q., Zheng, B., Zhou, X. et al. Depth-guided deep filtering network for efficient single image bokeh rendering. Neural Comput & Applic 35, 20869–20887 (2023). https://doi.org/10.1007/s00521-023-08852-y

