Abstract
This paper introduces an approach to missing data imputation based on deep auto-encoder models, suited to high-dimensional data with complex dependencies, such as images. The method exploits the properties of the vector field associated with an auto-encoder, which allows the gradient of the log-density to be approximated from the reconstruction error; building on this, we propose a projected gradient ascent algorithm that yields the conditionally most probable estimate of the missing values. Our approach does not require any specialized training procedure and can be used with any auto-encoder trained on complete data in the standard way. Experiments on benchmark datasets show that the imputations produced by our model are sharp and realistic.
This is the extended version of an extended abstract [25] presented at the ICLR Workshop on Integration of Deep Neural Models and Differential Equations (DeepDiffEq).
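As a concrete illustration of the procedure sketched in the abstract, the snippet below performs projected gradient ascent on the missing coordinates only, using the classical approximation of the log-density gradient by the scaled reconstruction residual, grad log p(x) ≈ (r(x) − x)/σ² [1, 29]. This is a minimal NumPy sketch under stated assumptions: the callable autoencoder, the step size eta, the variance sigma2, the number of steps n_steps, and the clipping of pixel values to [0, 1] are illustrative choices, not the authors' implementation.

```python
import numpy as np

def impute(autoencoder, x, mask, sigma2=1.0, eta=0.1, n_steps=100):
    """Iteratively refine the missing entries of `x`.

    mask[i] == 1 marks an observed entry, mask[i] == 0 a missing one.
    Relies on the approximation grad log p(x) ~ (r(x) - x) / sigma^2,
    where r is the reconstruction of a well-trained auto-encoder
    (Alain & Bengio, 2014; Vincent, 2011).
    """
    x = x.copy()
    for _ in range(n_steps):
        r = autoencoder(x)                 # reconstruction r(x)
        grad = (r - x) / sigma2            # approximate gradient of the log-density
        x = x + eta * grad * (1 - mask)    # ascent step on the missing entries only
        # projection: observed entries stay fixed (enforced by the mask),
        # and values are kept in the valid data range
        x = np.clip(x, 0.0, 1.0)
    return x
```

The loop alternates a gradient-ascent step on the missing coordinates with a projection back onto the feasible set (observed values fixed, data range respected), which is the projected gradient ascent scheme described in the abstract.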
Notes
1. A GMM can also be learned from incomplete data, but the imputation process does not change.
2. For a comparison between different auto-encoder models in the proposed procedure, the reader is referred to our workshop paper [25].
References
Alain, G., Bengio, Y.: What regularized auto-encoders learn from the data-generating distribution. J. Mach. Learn. Res. 15, 3563–3593 (2014)
Azur, M., Stuart, E., Frangakis, C., Leaf, P.: Multiple imputation by chained equations: what is it and how does it work? Int. J. Methods Psychiatr. Res. 20, 40–49 (2011)
Batista, G., Monard, M.: A study of k-nearest neighbour as an imputation method. Front. Artif. Intell. Appl. 97, 251–260 (2002)
Buuren, S., Groothuis-Oudshoorn, K.: MICE: multivariate imputation by chained equations in R. J. Stat. Softw. 45(3), 1–68 (2010)
Camino, R., Hammerschmidt, C., State, R.: Improving missing data imputation with deep generative models. arXiv preprint arXiv:1902.10666 (2019)
Dinh, L., Krueger, D., Bengio, Y.: NICE: non-linear independent components estimation. arXiv preprint arXiv:1410.8516 (2014)
Gallinari, P., LeCun, Y., Thiria, S., Fogelman-Soulie, F.: Mémoires associatives distribuées. In: COGNITIVA 87, Paris (1987)
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Hwang, U., Jung, D., Yoon, S.: HexaGAN: generative adversarial nets for real world classification. arXiv preprint arXiv:1902.09913 (2019)
Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Trans. Graph. (ToG) 36(4), 1–14 (2017)
Kingma, D., Welling, M.: Auto-encoding variational Bayes. In: International Conference on Learning Representations (2014)
LeCun, Y.: Modèles connexionnistes de l'apprentissage. Ph.D. thesis, Université de Paris VI (1987)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
Li, S., Jiang, B., Marlin, B.: MisGAN: learning from incomplete data with generative adversarial networks. arXiv preprint arXiv:1902.09599 (2019)
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: International Conference on Computer Vision (2015)
Luo, Y., Cai, X., Zhang, Y., Xu, J., Xiaojie, Y.: Multivariate time series imputation with generative adversarial networks. In: Advances in Neural Information Processing Systems, pp. 1596–1607 (2018)
Mattei, P.A., Frellsen, J.: Leveraging the exact likelihood of deep latent variable models. In: Advances in Neural Information Processing Systems, pp. 3855–3866 (2018)
Mattei, P.A., Frellsen, J.: MIWAE: deep generative modelling and imputation of incomplete data sets. In: International Conference on Machine Learning, pp. 4413–4423 (2019)
Nazabal, A., Olmos, P.M., Ghahramani, Z., Valera, I.: Handling incomplete heterogeneous data using VAEs. Pattern Recogn. 107, 107501 (2020)
Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.: Context encoders: feature learning by inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2536–2544 (2016)
Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082 (2014)
Sai Hareesh, A., Chandrasekaran, V.: A novel color image inpainting guided by structural similarity index measure and improved color angular radial transform. In: International Conference on Image Processing, Computer Vision, & Pattern Recognition, pp. 544–550 (2010)
Śmieja, M., Struski, Ł., Tabor, J., Zieliński, B., Spurek, P.: Processing of missing data by neural networks. In: Advances in Neural Information Processing Systems, pp. 2719–2729 (2018)
Śmieja, M., Kołomycki, M., Struski, L., Juda, M., Figueiredo, M.A.T.: Can auto-encoders help with filling missing data? In: ICLR Workshop on Integration of Deep Neural Models and Differential Equations (DeepDiffEq), p. 6 (2020)
Stagakis, N., Zacharaki, E.I., Moustakas, K.: Hierarchical image inpainting by a deep context encoder exploiting structural similarity and saliency criteria. In: Tzovaras, D., Giakoumis, D., Vincze, M., Argyros, A. (eds.) ICVS 2019. LNCS, vol. 11754, pp. 470–479. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34995-0_42
Titterington, D., Sedransk, J.: Imputation of missing values using density estimation. Stat. Probab. Lett. 9(5), 411–418 (1989)
Tolstikhin, I., Bousquet, O., Gelly, S., Schölkopf, B.: Wasserstein auto-encoders. arXiv preprint arXiv:1711.01558 (2017)
Vincent, P.: A connection between score matching and denoising autoencoders. Neural Comput. 23(7), 1661–1674 (2011)
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
Yoon, J., Jordon, J., Van Der Schaar, M.: GAIN: missing data imputation using generative adversarial nets. arXiv preprint arXiv:1806.02920 (2018)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5505–5514 (2018)
Acknowledgements
The work of M. Śmieja was supported by the National Science Centre (Poland) grant no. 2018/31/B/ST6/00993. The work of Ł. Struski was supported by the National Science Centre (Poland) grant no. 2017/25/B/ST6/01271 as well as the Foundation for Polish Science grant no. POIR.04.04.00-00-14DE/18-00, co-financed by the European Union under the European Regional Development Fund. The work of M. Juda was supported by the National Science Centre (Poland) grants no. 2014/14/A/ST1/00453 and 2015/19/D/ST6/01215.