Abstract
We consider the problem of estimating the conditional probability distribution of missing values given the observed ones. We propose an approach that combines the flexibility of deep neural networks with the simplicity of Gaussian mixture models (GMMs). Given an incomplete data point, our neural network returns the parameters of a Gaussian distribution (in the form of a factor analyzers model) representing the corresponding conditional density. We verify experimentally that our model attains a better log-likelihood than a conditional GMM trained in the typical way. Moreover, imputations obtained by replacing missing values with the mean vector of our model are visually plausible.
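The mechanism described above can be illustrated with a minimal sketch. Here `toy_network` is a hypothetical stand-in (random weights, arbitrary layer sizes) for the paper's trained deep network: it maps the observed part of a data point to the parameters of a Gaussian in factor-analyzer form, N(mu, A Aᵀ + D), over the missing coordinates, and imputation takes the conditional mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes, chosen only for illustration.
n_obs, n_miss, n_factors, n_hidden = 3, 2, 2, 8
W1 = rng.normal(size=(n_hidden, n_obs))
Wmu = rng.normal(size=(n_miss, n_hidden))
WA = rng.normal(size=(n_miss * n_factors, n_hidden))
Wd = rng.normal(size=(n_miss, n_hidden))

def toy_network(x_obs):
    """Stand-in for the deep network: returns (mu, A, D) such that the
    conditional density of the missing part is N(mu, A @ A.T + D)."""
    h = np.tanh(W1 @ x_obs)                   # hidden representation
    mu = Wmu @ h                              # conditional mean
    A = (WA @ h).reshape(n_miss, n_factors)   # factor loadings
    D = np.diag(np.exp(Wd @ h))               # positive diagonal noise
    return mu, A, D

x_obs = rng.normal(size=n_obs)                # the observed coordinates
mu, A, D = toy_network(x_obs)
cov = A @ A.T + D                             # full conditional covariance
imputation = mu                               # mean imputation
```

The factor-analyzer parameterization keeps the network output small (n_miss · (n_factors + 2) numbers) while still yielding a full, positive-definite covariance over the missing coordinates.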
A preliminary version of this paper appeared as an extended abstract [21] at the ICML Workshop on The Art of Learning with Missing Values.
Notes
- 1. PPCA uses a spherical matrix D.
- 2. The code was taken from https://github.com/eitanrich/torch-mfa.
- 3. In fact, minimizing the MSE amounts to fitting a Gaussian density with isotropic covariance, so this form of the loss function still optimizes a log-likelihood.
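The claim in note 3 can be checked numerically: for an isotropic Gaussian N(mu, sigma²I), the negative log-likelihood is n·MSE(mu)/(2·sigma²) plus a constant, a monotone function of the MSE, so both objectives share the same minimizer (the sample mean). A small sketch over a grid of candidate means (the data and sigma are arbitrary illustrative values):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0])   # toy data
sigma = 0.7                     # fixed isotropic scale

def nll(mu):
    """Negative log-likelihood of x under N(mu, sigma^2 I)."""
    return (np.sum((x - mu) ** 2) / (2 * sigma ** 2)
            + len(x) * np.log(sigma * np.sqrt(2 * np.pi)))

def mse(mu):
    return np.mean((x - mu) ** 2)

# Both objectives are minimized at the same mu on any common grid,
# since the NLL is an increasing affine function of the MSE.
grid = np.linspace(0.0, 5.0, 5001)
mu_nll = grid[np.argmin([nll(m) for m in grid])]
mu_mse = grid[np.argmin([mse(m) for m in grid])]
```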
References
Bishop, C.M.: Mixture density networks (1994)
Delalleau, O., Courville, A., Bengio, Y.: Efficient EM training of Gaussian mixtures with missing data. arXiv preprint arXiv:1209.0521 (2012)
Dick, U., Haider, P., Scheffer, T.: Learning from incomplete data with infinite imputations. In: Proceedings of the 25th International Conference on Machine Learning, pp. 232–239 (2008)
Ghahramani, Z., Hinton, G.E., et al.: The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, University of Toronto (1996)
Ghahramani, Z., Jordan, M.I.: Supervised learning from incomplete data via an EM approach. In: Advances in Neural Information Processing Systems, pp. 120–127 (1994)
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT press, Cambridge (2016)
Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Trans. Graph. (ToG) 36(4), 1–14 (2017)
Jerez, J.M., et al.: Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artif. Intell. Med. 50(2), 105–115 (2010)
Kingma, D., Welling, M.: Auto-encoding variational Bayes. In: International Conference on Learning Representations (2014)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
Li, S.C.X., Jiang, B., Marlin, B.: MisGAN: learning from incomplete data with generative adversarial networks. arXiv preprint arXiv:1902.09599 (2019)
Li, Y., Akbar, S., Oliva, J.B.: Flow models for arbitrary conditional likelihoods. arXiv preprint arXiv:1909.06319 (2019)
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: International Conference on Computer Vision (2015)
Mattei, P.A., Frellsen, J.: Leveraging the exact likelihood of deep latent variable models. In: Advances in Neural Information Processing Systems, pp. 3855–3866 (2018)
Mattei, P.A., Frellsen, J.: MIWAE: deep generative modelling and imputation of incomplete data sets. In: International Conference on Machine Learning, pp. 4413–4423 (2019)
McLachlan, G.J., Peel, D.: Finite Mixture Models. John Wiley & Sons, Hoboken (2004)
Nazabal, A., Olmos, P.M., Ghahramani, Z., Valera, I.: Handling incomplete heterogeneous data using VAEs. Pattern Recogn., 107501 (2020)
Van den Oord, A., Kalchbrenner, N., Espeholt, L., Vinyals, O., Graves, A., et al.: Conditional image generation with PixelCNN decoders. In: Advances in Neural Information Processing Systems, pp. 4790–4798 (2016)
Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.: Context encoders: feature learning by inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2536–2544 (2016)
Przewięźlikowski, M., Śmieja, M., Struski, Ł.: Estimating conditional density of missing values using deep Gaussian mixture model. In: ICML Workshop on the Art of Learning with Missing Values (Artemiss), p. 7 (2020)
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)
Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. arXiv preprint arXiv:1401.4082 (2014)
Richardson, E., Weiss, Y.: On GANs and GMMs. In: Advances in Neural Information Processing Systems, pp. 5847–5858 (2018)
Śmieja, M., Kołomycki, M., Struski, L., Juda, M., Figueiredo, M.A.T.: Can auto-encoders help with filling missing data? In: ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations, p. 6 (2020)
Śmieja, M., Kołomycki, M., Struski, L., Juda, M., Figueiredo, M.A.T.: Iterative imputation of missing data using auto-encoder dynamics. In: International Conference on Neural Information Processing, p. 12. Springer, Cham (2020)
Śmieja, M., Struski, Ł., Tabor, J., Marzec, M.: Generalized RBF kernel for incomplete data. Knowl.-Based Syst. 173, 150–162 (2019)
Śmieja, M., Struski, Ł., Tabor, J., Zieliński, B., Spurek, P.: Processing of missing data by neural networks. In: Advances in Neural Information Processing Systems, pp. 2719–2729 (2018)
Tipping, M.E., Bishop, C.M.: Probabilistic principal component analysis. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 61(3), 611–622 (1999)
Tolstikhin, I., Bousquet, O., Gelly, S., Schölkopf, B.: Wasserstein auto-encoders. arXiv preprint arXiv:1711.01558 (2017)
Trippe, B.L., Turner, R.E.: Conditional density estimation with Bayesian normalising flows. arXiv preprint arXiv:1802.04908 (2018)
Van Buuren, S.: Flexible Imputation of Missing Data. CRC Press, Boca Raton (2018)
Williams, D., Carin, L.: Analytical kernel matrix completion with incomplete multi-view data. In: Proceedings of the International Conference on Machine Learning (ICML) Workshop on Learning with Multiple Views, pp. 80–86 (2005)
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 5505–5514 (2018)
Acknowledgements
The work of M. Śmieja was supported by the National Science Centre (Poland) grant no. 2018/31/B/ST6/00993. The work of Ł. Struski was supported by the National Science Centre (Poland) grant no. 2017/25/B/ST6/01271 as well as the Foundation for Polish Science Grant No. POIR.04.04.00-00-14DE/18-00 co-financed by the European Union under the European Regional Development Fund.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Przewięźlikowski, M., Śmieja, M., Struski, Ł. (2020). Estimating Conditional Density of Missing Values Using Deep Gaussian Mixture Model. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science(), vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_19
Print ISBN: 978-3-030-63835-1
Online ISBN: 978-3-030-63836-8