Active constrained deep embedded clustering with dual source

Applied Intelligence

Abstract

Deep clustering uses a deep neural network (DNN) to learn a feature representation and cluster assignments simultaneously. Existing constrained deep clustering methods use prior knowledge to improve deep clustering. However, most of these methods select the prior knowledge (pairwise constraints) at random and fail to use it appropriately during the deep clustering process. The present study addresses this limitation by proposing a new scheme that integrates and improves constrained deep clustering through active learning from a dual source. The scheme consists of a DNN that initializes the nonlinear transformation of the original feature space, a clustering layer, and a constrained clustering layer that runs parallel to the clustering layer and uses prior knowledge in the form of a set of neighborhoods. In addition, active learning uses these two layers simultaneously as its sources for selecting informative and diverse data. The proposed method can simultaneously perform constrained clustering, learn the latent feature space under the guidance of the constraint set, and indirectly pull the data belonging to one neighborhood closer to its center (i.e., away from the centers of other neighborhoods). Experiments on several datasets indicate the efficiency and robustness of the proposed method.
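
The abstract does not give formulas, so the following is only a minimal sketch of the two ingredients it names, written under stated assumptions. The clustering layer is assumed to follow the deep embedded clustering (DEC) formulation, with a Student's t soft assignment and a sharpened target distribution. A set of neighborhoods is assumed to expand into pairwise constraints in the usual way: must-link inside a neighborhood, cannot-link across neighborhoods. The function names `soft_assign`, `target_distribution`, and `neighborhoods_to_constraints` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def soft_assign(z, centers, alpha=1.0):
    """Soft assignment q[i][j] of embedded point i to cluster j.

    Assumption: DEC-style Student's t kernel,
    q_ij proportional to (1 + ||z_i - mu_j||^2 / alpha) ** (-(alpha + 1) / 2).
    """
    q = []
    for zi in z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # each row sums to 1
    return q


def target_distribution(q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def neighborhoods_to_constraints(neighborhoods):
    """Expand neighborhoods (lists of point indices) into pairwise constraints.

    Assumption: points in one neighborhood share a cluster (must-link) and
    points in different neighborhoods do not (cannot-link).
    """
    must_link, cannot_link = set(), set()
    for hood in neighborhoods:
        for a, b in combinations(sorted(hood), 2):
            must_link.add((a, b))
    for hood_a, hood_b in combinations(neighborhoods, 2):
        for a in hood_a:
            for b in hood_b:
                cannot_link.add((min(a, b), max(a, b)))
    return must_link, cannot_link


# Toy embedded points: two tight groups around two centers.
z = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(z, centers)
p = target_distribution(q)
must_link, cannot_link = neighborhoods_to_constraints([[0, 1], [2, 3]])
```

In this toy run the target distribution `p` is more confident than `q` for each point, which is the self-training signal a DEC-style clustering layer minimizes a KL divergence against. How the paper's constrained clustering layer and its active query strategy use the neighborhoods is not specified in the abstract and is not modeled here.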

Notes

  1. http://yann.lecun.com/exdb/mnist

  2. https://github.com/zalandoresearch/fashion-mnist

  3. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html

  4. https://archive.ics.uci.edu/ml/datasets/reuters-21578+text+categorization+collection

  5. Home Page for 20 Newsgroups Data Set (qwone.com)

  6. http://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php

  7. http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html

  8. http://www.idiap.ch/resource/gestures/

Acknowledgments

The authors received no financial support for the research and/or authorship of this article.

Author information

Corresponding author

Correspondence to M. A. Balafar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: This appendix shows Table 3 in the form of charts in Fig. 9

Fig. 9. Comparison of the average results of the proposed method with similar methods over all datasets for different numbers of queries, measured by ACC (%). The x-axis shows the number of queries and the y-axis shows the accuracy.
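
The caption reports ACC without defining it. Assuming it denotes the usual clustering accuracy, that is, the fraction of points labeled correctly under the best one-to-one mapping between predicted clusters and true classes, a minimal sketch looks as follows. The function name `clustering_accuracy` is hypothetical, and the brute-force search over mappings is only practical for a handful of clusters; the Hungarian algorithm is the usual choice at scale.

```python
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Return ACC in percent under the best one-to-one cluster-to-class mapping.

    Assumption: ACC is the standard unsupervised clustering accuracy.
    Brute force over mappings, so keep the number of clusters small.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    true_labels = list(dict.fromkeys(y_true))  # unique labels, order preserved
    pred_labels = list(dict.fromkeys(y_pred))

    # counts[(p, t)] = number of points with predicted cluster p and true class t
    counts = {}
    for t, p in zip(y_true, y_pred):
        counts[(p, t)] = counts.get((p, t), 0) + 1

    best = 0
    if len(pred_labels) <= len(true_labels):
        # Map every predicted cluster to a distinct true class.
        for perm in permutations(true_labels, len(pred_labels)):
            matched = sum(counts.get((p, t), 0) for p, t in zip(pred_labels, perm))
            best = max(best, matched)
    else:
        # More clusters than classes: map every class to a distinct cluster.
        for perm in permutations(pred_labels, len(true_labels)):
            matched = sum(counts.get((p, t), 0) for p, t in zip(perm, true_labels))
            best = max(best, matched)
    return 100.0 * best / len(y_true)


# Cluster ids are arbitrary, so a relabeled perfect clustering scores 100.
acc_perfect = clustering_accuracy([0, 0, 1, 1], [1, 1, 0, 0])
# One of six points lands in the wrong cluster.
acc_partial = clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 0])
```

Because cluster indices carry no meaning, the mapping step is what makes scores comparable across methods and across numbers of queries.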

About this article

Cite this article

Hazratgholizadeh, R., Balafar, M.A. & Derakhshi, M.R.F. Active constrained deep embedded clustering with dual source. Appl Intell 53, 5337–5367 (2023). https://doi.org/10.1007/s10489-022-03752-5

  • DOI: https://doi.org/10.1007/s10489-022-03752-5
