Abstract
Learning from synthetic data has many important applications in cases where sufficient labeled data are not available. Using synthetic data is challenging because feature distributions differ between synthetic and real data, a phenomenon we term the synthetic gap. In this paper, we investigate and formalize a general framework, the Stacked Multichannel Autoencoder (SMCAE), that bridges the synthetic gap and enables learning from synthetic data more efficiently. In particular, we show that SMCAE can not only transform and use synthetic data on a challenging face-sketch recognition task, but can also help simulate real images that can be used to train classifiers for recognition. Preliminary experiments validate the effectiveness of the proposed framework.














Notes
δ is a sparsity parameter and is empirically set to 0.05 in all our experiments.
Collected from the UCI Machine Learning Repository (HWDUCI) [3].
The parameters are cross-validated.
References
Alimoglu F, Alpaydin E (1997) Combining multiple representations and classifiers for handwritten digit recognition. In: ICDAR
Alnajar F, Lou Z, Alvarez J, Gevers T (2014) Expression-invariant age estimation. In: BMVC
Bache K, Lichman M (2013) UCI machine learning repository. [Online]. Available: http://archive.ics.uci.edu/ml
Bal G, Agam G, Frieder O, Frieder G (2008) Interactive degraded document enhancement and ground truth generation. In: Electronic Imaging 2008. International Society for Optics and Photonics
Baldi P (2012) Autoencoders, unsupervised learning, and deep architectures. Unsupervised Transfer Learn Challenges Mach Learn 7:43
Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan J W (2010) A theory of learning from different domains. Mach Learn
Bengio Y (2009) Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1):1–127
Bengio Y (2012) Deep learning of representations for unsupervised and transfer learning. Unsupervised Transfer Learn Challenges Mach Learn 7:19
Chen M, Xu Z, Weinberger K Q, Sha F (2012) Marginalized denoising autoencoders for domain adaptation. In: International conference on machine learning
Deng J, Zhang Z, Marchi E, Schuller B (2013) Sparse autoencoder-based feature transfer learning for speech emotion recognition. In: Affective Computing and intelligent interaction (ACII)
Glorot X, Bordes A, Bengio Y (2011) Domain adaptation for large-scale sentiment classification: a deep learning approach. In: ICML
Kan M, Shan S, Zhang H, Lao S, Chen X (2012) Multi-view discriminant analysis. In: ECCV
Klare B F, Li Z, Jain A K (2011) Matching forensic sketches to mug shot photos. IEEE Trans Pattern Anal Mach Intell 33(3):639–646
Lampert C H, Nickisch H, Harmeling S (2013) Attribute-based classification for zero-shot visual object categorization. In: IEEE TPAMI
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66
Pan S J, Yang Q (2010) A survey on transfer learning. In: IEEE TKDE
Pishchulin L, Jain A, Wojek C, Andriluka M, Thormählen T, Schiele B (2011) Learning people detection models from few training samples. In: 2011 IEEE Conference on computer vision and pattern recognition (CVPR). IEEE, pp 1473–1480
Ruiz A, Van de Weijer J, Binefa X (2014) Regularized multi-concept mil for weakly-supervised facial behavior categorization. In: BMVC
Sarinnapakorn K, Kubat M (2007) Combining subclassifiers in text categorization: a DST-based solution and a case study. In: IEEE TKDE
Srivastava N, Salakhutdinov R R (2012) Multimodal learning with deep boltzmann machines. In: Advances in neural information processing systems, pp 2222–2230
Sun B, Saenko K (2014) From virtual to reality: fast adaptation of virtual object detectors to real domains. In: Proceedings of the British machine vision conference. BMVA Press
Turk M A, Pentland A P (1991) Face recognition using eigenfaces. In: IEEE Computer Society conference on computer vision and pattern recognition. IEEE, pp 586–591
Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605
Varga T (2004) Comparing natural and synthetic training data for off-line cursive handwriting recognition. In: IWFHR-9 Ninth International workshop on frontiers in handwriting recognition, 2004. IEEE, pp 221–225
Varga T, Bunke H (2003) Effects of training set expansion in handwriting recognition using synthetic data. In: 11th Conf. of the international graphonomics society. Citeseer
Vincent P, Larochelle H, Bengio Y, Manzagol P-A (2008) Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on machine learning. ACM, pp 1096–1103
Vincent P, Larochelle H, Bengio Y, Manzagol P-A (2011) Extracting and composing robust features with denoising autoencoders. In: ICML
Wang X, Tang X (2009) Face photo-sketch synthesis and recognition. In: IEEE TPAMI
Wang W, Cui Z, Chang H, Shan S, Chen X (2014) Deeply coupled auto-encoder networks for cross-view classification. arXiv:1402.2031
Weinberger K, Dasgupta A, Langford J, Smola A, Attenberg J (2009) Feature hashing for large scale multitask learning. In: ICML
Zhang W, Wang X, Tang X (2011) Coupled information-theoretic encoding for face photo-sketch recognition. In: CVPR. IEEE, pp 513–520
Zhang X, Agam G, Chen X (2014) Alignment of 3d building models with satellite images using extended chamfer matching. In: The IEEE Conference on computer vision and pattern recognition (CVPR) workshops
Zhou Q-Y, Neumann U (2008) Fast and extensible building modeling from airborne lidar data. In: Proceedings of the 16th ACM SIGSPATIAL international conference on advances in geographic information systems. ACM, p 7
Zhu F, Shao L, Tang J (2014) Boosted cross-domain categorization. In: British machine vision conference
Acknowledgements
This work is supported by Fudan University-CIOMP Joint Fund (FC2017-006). Yanwei Fu is supported by The Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. TP2017006). Yanwei Fu is the corresponding author.
About this article
Cite this article
Zhang, X., Fu, Y., Jiang, S. et al. Stacked multichannel autoencoder – an efficient way of learning from synthetic data. Multimed Tools Appl 77, 26563–26580 (2018). https://doi.org/10.1007/s11042-018-5879-7
DOI: https://doi.org/10.1007/s11042-018-5879-7