{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,12]],"date-time":"2025-03-12T04:19:42Z","timestamp":1741753182594,"version":"3.38.0"},"reference-count":37,"publisher":"SAGE Publications","issue":"4","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDT"],"published-print":{"date-parts":[[2023,11,20]]},"abstract":"Text clustering is an essential process for sorting unstructured text data into an appropriate format. However, it does not by itself pave the way for extracting the information needed to facilitate document representation. Today, retrieving relevant text data has become crucial. Most of this data is in an unstructured text format, which makes it difficult to categorize. The main aim of this work is to implement a new text clustering model for unstructured data using classifier approaches. First, the unstructured data is taken from standard benchmark datasets covering both English and Telugu. The collected text data is then passed to the pre-processing stage. In feature extraction stage 1, the pre-processed data is fed to the GloVe embedding technique to extract text features. Similarly, in feature extraction stage 2, the pre-processed data is used to extract deep text features using a Text Convolutional Neural Network (Text CNN). The text features from stage 1 and the deep features from stage 2 are then combined and used for optimal feature selection with Hybrid Sea Lion Grasshopper Optimization (HSLnGO), in which the traditional SLnO is superimposed with GOA. Finally, text clustering is performed by Deep CNN-assisted hierarchical clustering, with parameters optimized by HSLnGO to improve clustering performance. 
The simulation findings show that the framework yields strong text classification performance on unstructured text data compared with other techniques across different quantitative measures.","DOI":"10.3233\/idt-220201","type":"journal-article","created":{"date-parts":[[2023,9,15]],"date-time":"2023-09-15T15:13:30Z","timestamp":1694790810000},"page":"1323-1350","source":"Crossref","is-referenced-by-count":0,"title":["Hybrid unstructured text features for meta-heuristic assisted deep CNN-based hierarchical clustering"],"prefix":"10.1177","volume":"17","author":[{"given":"Bankapalli","family":"Jyothi","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, JNTUK Kakinada, Kakinada, Andhra Pradesh, India"}]},{"given":"L.","family":"Sumalatha","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, Jawaharlal Nehru Technological University, Hyderabad, Telangana, India"}]},{"given":"Suneetha","family":"Eluri","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, JNTUK Kakinada, Kakinada, Andhra Pradesh, India"}]}],"member":"179","reference":[{"issue":"1","key":"10.3233\/IDT-220201_ref1","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1109\/TKDE.2011.205","article-title":"Clustering Sentence-Level Text Using a Novel Fuzzy Relational Clustering Algorithm","volume":"25","author":"Skabar","year":"2013","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDT-220201_ref2","doi-asserted-by":"crossref","first-page":"92037","DOI":"10.1109\/ACCESS.2019.2927345","article-title":"Discovering Topic Representative Terms for Short Text Clustering","volume":"7","author":"Yang","year":"2019","journal-title":"IEEE Access"},{"issue":"10","key":"10.3233\/IDT-220201_ref3","doi-asserted-by":"crossref","first-page":"1360","DOI":"10.1109\/TKDE.2009.174","article-title":"An Efficient Concept-Based Mining Model for Enhancing 
Text Clustering","volume":"22","author":"Shehata","year":"2010","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDT-220201_ref4","doi-asserted-by":"crossref","first-page":"57460","DOI":"10.1109\/ACCESS.2018.2873327","article-title":"Neural Feedback Text Clustering With BiLSTM-CNN-Kmeans","volume":"6","author":"Yang","year":"2018","journal-title":"IEEE Access"},{"issue":"1","key":"10.3233\/IDT-220201_ref5","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1109\/TIFS.2012.2223679","article-title":"Document Clustering for Forensic Analysis: An Approach for Improving Computer Inspection","volume":"8","author":"da Cruz Nassif","year":"2013","journal-title":"IEEE Transactions on Information Forensics and Security"},{"issue":"5","key":"10.3233\/IDT-220201_ref6","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1109\/TKDE.2007.190740","article-title":"Text Clustering with Feature Selection by Using Statistical Data","volume":"20","author":"Li","year":"2008","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"issue":"8","key":"10.3233\/IDT-220201_ref7","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1109\/TLA.2021.9475870","article-title":"Effects on Time and Quality of Short Text Clustering during Real-Time Presentations","volume":"19","author":"Fuentealba","year":"2021","journal-title":"IEEE Latin America Transactions"},{"key":"10.3233\/IDT-220201_ref8","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1007\/s00500-015-1959-z","article-title":"A hybrid spam detection method based on unstructured datasets","volume":"21","author":"Shao","year":"2017","journal-title":"Soft Comput"},{"key":"10.3233\/IDT-220201_ref9","doi-asserted-by":"crossref","first-page":"1497","DOI":"10.1007\/s11432-010-4030-9","article-title":"A tetrahedral data model for unstructured data management","volume":"53","author":"Li","year":"2010","journal-title":"Sci China Inf 
Sci"},{"key":"10.3233\/IDT-220201_ref10","doi-asserted-by":"crossref","first-page":"1083","DOI":"10.1007\/s10472-019-09687-x","article-title":"Semantic string operation for specializing AHC algorithm for text clustering","volume":"88","author":"Jo","year":"2020","journal-title":"Ann Math Artif Intell"},{"key":"10.3233\/IDT-220201_ref11","first-page":"69","article-title":"Evaluation of text document clustering approach based on particle swarm optimization","volume":"3","author":"Karol","year":"2013","journal-title":"Centr Eur J Comp Sci"},{"key":"10.3233\/IDT-220201_ref12","doi-asserted-by":"crossref","first-page":"995","DOI":"10.1007\/s00521-014-1792-9","article-title":"Text clustering using VSM with feature clusters","volume":"26","author":"Cao","year":"2015","journal-title":"Neural Comput & Applic"},{"key":"10.3233\/IDT-220201_ref13","doi-asserted-by":"crossref","first-page":"4321","DOI":"10.1007\/s00521-021-06563-w","article-title":"GOWSeqStream: an integrated sequential embedding and graph-of-words for short text stream clustering","volume":"34","author":"Vo","year":"2022","journal-title":"Neural Comput & Applic"},{"key":"10.3233\/IDT-220201_ref14","doi-asserted-by":"crossref","unstructured":"Ponnusamy M, Bedi P, Suresh T, et al. Design and analysis of text document clustering using salp swarm algorithm. J Supercomput. 2022.","DOI":"10.1007\/s11227-022-04525-0"},{"key":"10.3233\/IDT-220201_ref15","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.1134\/S000511791407011X","article-title":"Hierarchical clustering of text documents","volume":"75","author":"Lomakina","year":"2014","journal-title":"Autom Remote Control"},{"key":"10.3233\/IDT-220201_ref16","doi-asserted-by":"crossref","unstructured":"Abualigah L, Almotairi KH, et al. Efficient text document clustering approach using multi-search Arithmetic Optimization Algorithm. Knowledge-Based Systems. 
2022; 248.","DOI":"10.1016\/j.knosys.2022.108833"},{"key":"10.3233\/IDT-220201_ref17","doi-asserted-by":"crossref","unstructured":"Purushothaman R, Rajagopal SP, Dhandapani G. Hybridizing Gray Wolf Optimization (GWO) with Grasshopper Optimization Algorithm (GOA) for text feature selection and clustering. Applied Soft Computing. 2020; 96.","DOI":"10.1016\/j.asoc.2020.106651"},{"key":"10.3233\/IDT-220201_ref18","doi-asserted-by":"crossref","first-page":"10861","DOI":"10.1007\/s11042-022-12155-0","article-title":"Deep text clustering using stacked AutoEncoder","volume":"81","author":"Hosseini","year":"2022","journal-title":"Multimed Tools Appl"},{"key":"10.3233\/IDT-220201_ref19","doi-asserted-by":"crossref","first-page":"212838","DOI":"10.1109\/ACCESS.2020.3040506","article-title":"Unstructured Text Documents Summarization with Multi-Stage Clustering","volume":"8","author":"Saeed","year":"2020","journal-title":"IEEE Access"},{"key":"10.3233\/IDT-220201_ref20","doi-asserted-by":"crossref","first-page":"7581","DOI":"10.1007\/s12652-020-02487-w","article-title":"Two phase cluster validation approach towards measuring cluster quality in unstructured and structured numerical datasets","volume":"12","author":"Kumar","year":"2021","journal-title":"J Ambient Intell Human Comput"},{"key":"10.3233\/IDT-220201_ref21","doi-asserted-by":"crossref","first-page":"378","DOI":"10.1007\/s10791-016-9280-8","article-title":"Mining unstructured content for recommender systems: an ensemble approach","volume":"19","author":"Manzato","year":"2016","journal-title":"Information Retrieval Journal"},{"key":"10.3233\/IDT-220201_ref22","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1007\/s10044-018-00772-1","article-title":"Document representation based on probabilistic word clustering in customer-voice classification","volume":"22","author":"Lee","year":"2019","journal-title":"Pattern Anal 
Applic"},{"key":"10.3233\/IDT-220201_ref23","doi-asserted-by":"crossref","first-page":"115040","DOI":"10.1016\/j.eswa.2021.115040","article-title":"A hybrid approach for text document clustering using Jaya optimization algorithm","volume":"178","author":"Thirumoorthy","year":"2021","journal-title":"Expert Systems with Applications"},{"key":"10.3233\/IDT-220201_ref24","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1007\/s13369-019-04191-0","article-title":"A Novel Short Text Clustering Model Based on Grey System Theory","volume":"45","author":"Fidan","year":"2020","journal-title":"Arab J Sci Eng"},{"key":"10.3233\/IDT-220201_ref25","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1016\/j.eswa.2019.05.030","article-title":"Text document clustering using Spectral Clustering algorithm with Particle Swarm Optimization","volume":"134","author":"Jananim","year":"2019","journal-title":"Expert Systems with Applications"},{"key":"10.3233\/IDT-220201_ref26","first-page":"1","article-title":"Glove Word Embedding and DBSCAN algorithms for Semantic Document Clustering","author":"Mohammad","year":"2020","journal-title":"2020 International Conference on Advanced Science and Engineering (ICOASE)"},{"issue":"6","key":"10.3233\/IDT-220201_ref27","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1109\/TIP.2016.2547588","article-title":"Text-Attentional Convolutional Neural Network for Scene Text Detection","volume":"25","author":"He","year":"2016","journal-title":"IEEE Transactions on Image Processing"},{"key":"10.3233\/IDT-220201_ref28","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1007\/s10618-005-0361-3","article-title":"Hierarchical Clustering Algorithms for Document Datasets","volume":"10","author":"Zhao","year":"2005","journal-title":"Data Mining and Knowledge Discovery"},{"issue":"5","key":"10.3233\/IDT-220201_ref29","doi-asserted-by":"crossref","first-page":"388","DOI":"10.14569\/IJACSA.2019.0100548","article-title":"Sea Lion 
Optimization Algorithm","volume":"10","author":"Masadeh","year":"2019","journal-title":"International Journal of Advanced Computer Science and Applications"},{"key":"10.3233\/IDT-220201_ref30","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.advengsoft.2017.01.004","article-title":"Grasshopper Optimization Algorithm: Theory and application","volume":"105","author":"Saremi","year":"2017","journal-title":"Advances in Engineering Software"},{"key":"10.3233\/IDT-220201_ref31","doi-asserted-by":"crossref","unstructured":"Arora M, Kansal V. Character level embedding with deep convolutional neural network for text normalization of unstructured data for Twitter sentiment analysis. Soc Netw Anal Min. 2019; 9(12).","DOI":"10.1007\/s13278-019-0557-y"},{"key":"10.3233\/IDT-220201_ref32","doi-asserted-by":"crossref","first-page":"11543","DOI":"10.1007\/s00521-019-04641-8","article-title":"Electric fish optimization: a new heuristic algorithm inspired by electrolocation","volume":"32","author":"Yilmaz","year":"2020","journal-title":"Neural Computing and Applications"},{"key":"10.3233\/IDT-220201_ref33","doi-asserted-by":"crossref","unstructured":"Jyothi B, Sumalatha L, Eluri S. Intelligent Deep Learning-based Hierarchical Clustering for Unstructured Text Data. Communication with Concurrency and Computation: Practice and Experience. 2022.","DOI":"10.1002\/cpe.7388"},{"key":"10.3233\/IDT-220201_ref34","doi-asserted-by":"crossref","unstructured":"Apoorva KA, Sangeetha S. Deep neural network and model-based clustering technique for forensic electronic mail author attribution. SN Applied Sciences. 2021; 3(348).","DOI":"10.1007\/s42452-020-04127-6"},{"key":"10.3233\/IDT-220201_ref35","unstructured":"Santhanam S. Context based Text-generation using LSTM networks. Computer Science\u00a0\u2013 Computation and Language. 
2018."},{"key":"10.3233\/IDT-220201_ref36","first-page":"1","article-title":"Clustering based feature selection using Extreme Learning Machines for text classification","author":"Roul","year":"2015","journal-title":"2015 Annual IEEE India Conference (INDICON)"},{"key":"10.3233\/IDT-220201_ref37","doi-asserted-by":"crossref","first-page":"42689","DOI":"10.1109\/ACCESS.2020.2976744","article-title":"Document-Level Text Classification Using Single-Layer Multisize Filters Convolutional Neural Network","volume":"8","author":"Akhter","year":"2020","journal-title":"IEEE Access"}],"container-title":["Intelligent Decision Technologies"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDT-220201","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,11]],"date-time":"2025-03-11T06:00:30Z","timestamp":1741672830000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDT-220201"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,20]]},"references-count":37,"journal-issue":{"issue":"4"},"URL":"https:\/\/doi.org\/10.3233\/idt-220201","relation":{},"ISSN":["1872-4981","1875-8843"],"issn-type":[{"type":"print","value":"1872-4981"},{"type":"electronic","value":"1875-8843"}],"subject":[],"published":{"date-parts":[[2023,11,20]]}}}