Abstract
We propose an efficient lighting estimation pipeline that is suitable for running on modern mobile devices, with resource requirements comparable to state-of-the-art mobile deep learning models. Our pipeline, PointAR, takes a single RGB-D image captured by the mobile camera together with a 2D location in that image, and estimates 2nd-degree spherical harmonics coefficients. The estimated coefficients can be used directly by rendering engines to support spatially-variant indoor lighting in the context of augmented reality. Our key insight is to formulate lighting estimation as a learning problem directly on point clouds, inspired in part by the Monte Carlo integration used in real-time spherical harmonics lighting. While existing approaches estimate lighting with complex deep learning pipelines, our method focuses on reducing computational complexity. Through both quantitative and qualitative experiments, we demonstrate that PointAR achieves lower lighting estimation error than state-of-the-art methods while requiring an order of magnitude fewer resources, comparable to mobile-specific DNNs.
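To make the target representation concrete, the sketch below shows a minimal Monte Carlo projection of environment radiance onto the nine 2nd-degree spherical harmonics basis functions, i.e., the kind of coefficients PointAR is trained to estimate. This is an illustrative NumPy sketch of the standard SH projection technique, not the paper's pipeline; the uniform sphere sampling and the toy radiance function are assumptions for demonstration only.

```python
import numpy as np

def sh_basis_deg2(dirs):
    """Evaluate the 9 real SH basis functions (degrees 0-2)
    at unit direction vectors; dirs has shape (N, 3)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),       # Y_0^0
        0.488603 * y,                     # Y_1^-1
        0.488603 * z,                     # Y_1^0
        0.488603 * x,                     # Y_1^1
        1.092548 * x * y,                 # Y_2^-2
        1.092548 * y * z,                 # Y_2^-1
        0.315392 * (3.0 * z ** 2 - 1.0),  # Y_2^0
        1.092548 * x * z,                 # Y_2^1
        0.546274 * (x ** 2 - y ** 2),     # Y_2^2
    ], axis=1)                            # shape (N, 9)

def sh_project_monte_carlo(dirs, radiance):
    """Monte Carlo SH projection: c_lm ~= (4*pi/N) * sum_i L(w_i) * Y_lm(w_i),
    valid when dirs are drawn uniformly over the unit sphere."""
    basis = sh_basis_deg2(dirs)                             # (N, 9)
    return (4.0 * np.pi / len(dirs)) * basis.T @ radiance   # (9, channels)

# Usage: uniformly sampled directions plus a toy "sky" radiance (assumed).
rng = np.random.default_rng(0)
dirs = rng.normal(size=(4096, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # project onto sphere
radiance = np.clip(dirs[:, 2:3], 0.0, None) * np.array([1.0, 0.9, 0.8])
coeffs = sh_project_monte_carlo(dirs, radiance)
print(coeffs.shape)  # (9, 3): nine SH coefficients per RGB channel
```

Because the estimate is a sum over independent per-sample contributions, it maps naturally onto per-point computation over a point cloud, which is the connection to point cloud-based learning that the abstract highlights.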
Acknowledgement
This work was supported in part by NSF Grants #1755659 and #1815619.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, Y., Guo, T.: PointAR: efficient lighting estimation for mobile augmented reality. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12368. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58592-1_40
DOI: https://doi.org/10.1007/978-3-030-58592-1_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58591-4
Online ISBN: 978-3-030-58592-1
eBook Packages: Computer Science, Computer Science (R0)