Abstract
The bokeh effect is commonly used to highlight the main subject of an image. Constrained by their small sensors, smartphone cameras capture little depth information and cannot directly produce bokeh as pleasing as that of digital single-lens reflex (DSLR) cameras. To address this problem, this study proposes a depth-guided deep filtering network (DDFN). Specifically, a focused region detection block is designed to detect salient areas, and a depth estimation block is introduced to estimate depth maps from all-in-focus images. Combining the depth maps with the focused features, an adaptive rendering block then synthesizes the bokeh effect with adaptive cross 1-D filters. Quantitative and qualitative evaluations on public datasets demonstrate that the proposed model performs favorably against state-of-the-art methods in rendering quality while incurring lower computational cost, e.g., 24.07 dB PSNR on the EBB! dataset and a 0.45 s inference time for a \(512 \times 768\) image on a Snapdragon 865 mobile processor.
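The adaptive cross 1-D filtering named above can be pictured as predicting, for every pixel, a horizontal and a vertical 1-D kernel and applying them in sequence. The sketch below is a minimal PyTorch illustration of that idea, not the paper's implementation: the function name, the kernel size `K`, and the softmax-normalized kernels are assumptions, and the network that predicts the kernels from depth and focus features is omitted.

```python
import torch
import torch.nn.functional as F

def apply_cross_1d_filters(image, kernels_h, kernels_v):
    """Apply per-pixel separable (cross) 1-D filters.

    image:     (B, C, H, W) all-in-focus input
    kernels_h: (B, K, H, W) per-pixel horizontal 1-D kernels
    kernels_v: (B, K, H, W) per-pixel vertical 1-D kernels
    """
    b, c, h, w = image.shape
    k = kernels_h.shape[1]
    pad = k // 2  # assumes odd K so spatial size is preserved

    # Horizontal pass: gather the K horizontal neighbors of each pixel,
    # then take a per-pixel weighted sum with the predicted kernels.
    patches = F.unfold(image, kernel_size=(1, k), padding=(0, pad))
    patches = patches.view(b, c, k, h, w)
    out = (patches * kernels_h.unsqueeze(1)).sum(dim=2)

    # Vertical pass on the horizontally filtered result.
    patches = F.unfold(out, kernel_size=(k, 1), padding=(pad, 0))
    patches = patches.view(b, c, k, h, w)
    return (patches * kernels_v.unsqueeze(1)).sum(dim=2)

# Hypothetical usage: in DDFN the kernels would come from the adaptive
# rendering block, conditioned on depth maps and focused features.
img = torch.rand(1, 3, 64, 64)
kh = torch.softmax(torch.rand(1, 5, 64, 64), dim=1)  # taps sum to 1
kv = torch.softmax(torch.rand(1, 5, 64, 64), dim=1)
bokeh = apply_cross_1d_filters(img, kh, kv)  # (1, 3, 64, 64)
```

Separable filtering of this kind needs roughly 2K multiply-adds per pixel instead of the K² required by a full 2-D per-pixel kernel, which is consistent with the abstract's claim of lower computational cost.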
Data availability
The EBB! dataset analyzed during the current study can be downloaded using the following link: http://people.ee.ethz.ch/~ihnatova/pynet-bokeh.html#dataset.
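For readers reproducing the experiments, a minimal sketch of pairing the dataset's wide and shallow depth-of-field images is given below. The `original/` and `bokeh/` folder names and the `.jpg` extension are assumptions about the extracted archive layout; adjust them to the actual download.

```python
from pathlib import Path
from PIL import Image

def load_ebb_pairs(root):
    """Yield (all-in-focus, bokeh) image pairs with matching file names.

    Assumes aligned pairs live in `original/` and `bokeh/` subfolders
    of the extracted EBB! archive; adjust paths to the real layout.
    """
    root = Path(root)
    for src in sorted((root / "original").glob("*.jpg")):
        tgt = root / "bokeh" / src.name
        if tgt.exists():
            yield Image.open(src), Image.open(tgt)
```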
Funding
This work was supported by the National Key R&D Program of China under Grant 2021YFA0715202 and the National Natural Science Foundation of China under Grants 62001146 and 62271180.
Author information
Contributions
QC contributed to writing—original draft, writing—review and editing, methodology and software. BZ contributed to writing—review and editing and project administration. XZ contributed to writing—review and editing. AH contributed to writing—review and editing. YS contributed to writing—review and editing. CC contributed to writing—review and editing. CY contributed to writing—review and editing, funding acquisition and supervision. SY contributed to writing—review and editing and supervision.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chen, Q., Zheng, B., Zhou, X. et al. Depth-guided deep filtering network for efficient single image bokeh rendering. Neural Comput & Applic 35, 20869–20887 (2023). https://doi.org/10.1007/s00521-023-08852-y