{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T18:56:52Z","timestamp":1732042612765},"reference-count":29,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,3,30]],"date-time":"2023-03-30T00:00:00Z","timestamp":1680134400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"Nonnegative matrix factorization can be used to automatically detect topics within a corpus in an unsupervised fashion. The technique amounts to an approximation of a nonnegative matrix as the product of two nonnegative matrices of lower rank. In certain applications it is desirable to extract topics and use them to predict quantitative outcomes. In this paper, we show Nonnegative Matrix Factorization can be combined with regression on a continuous response variable by minimizing a penalty function that adds a weighted regression error to a matrix factorization error. We show theoretically that as the weighting increases, the regression error in training decreases weakly. We test our method on synthetic data and real data coming from Rate My Professors reviews to predict an instructor\u2019s rating from the text in their reviews. 
In practice, when used as a dimensionality reduction method (when the number of topics chosen in the model is fewer than the true number of topics), the method performs better than doing regression after topics are identified\u2014both during training and testing\u2014and it retains interpretability.<\/jats:p>","DOI":"10.3390\/a16040187","type":"journal-article","created":{"date-parts":[[2023,3,30]],"date-time":"2023-03-30T05:05:26Z","timestamp":1680152726000},"page":"187","source":"Crossref","is-referenced-by-count":2,"title":["Continuous Semi-Supervised Nonnegative Matrix Factorization"],"prefix":"10.3390","volume":"16","author":[{"given":"Michael R.","family":"Lindstrom","sequence":"first","affiliation":[{"name":"School of Mathematical and Statistical Sciences, The University of Texas Rio Grande Valley, Edinburg, TX 78539, USA"}]},{"given":"Xiaofu","family":"Ding","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA"}]},{"given":"Feng","family":"Liu","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-6781-3220","authenticated-orcid":false,"given":"Anand","family":"Somayajula","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-8058-8638","authenticated-orcid":false,"given":"Deanna","family":"Needell","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of California Los Angeles, Los Angeles, CA 90095, USA"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.knosys.2018.08.011","article-title":"Experimental explorations on short text topic mining between LDA and NMF based 
Schemes","volume":"163","author":"Chen","year":"2019","journal-title":"Knowl.-Based Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1103","DOI":"10.1109\/JBHI.2021.3113668","article-title":"Regression and Classification of Alzheimer\u2019s Disease Diagnosis Using NMF-TDNet Features From 3D Brain MR Image","volume":"26","author":"Lao","year":"2021","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Lai, Y., Hayashida, M., and Akutsu, T. (2013). Survival analysis by penalized regression and matrix factorization. Sci. World J., 2013.","DOI":"10.1155\/2013\/632030"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1137\/1035134","article-title":"On the early history of the singular value decomposition","volume":"35","author":"Stewart","year":"1993","journal-title":"SIAM Rev."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1016\/j.ipm.2004.11.005","article-title":"Document clustering using nonnegative matrix factorization","volume":"42","author":"Shahnaz","year":"2006","journal-title":"Inf. Process. Manag."},{"key":"ref_7","unstructured":"Joyce, J.M. (2011). International Encyclopedia of Statistical Science, Springer."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1007\/s00158-009-0460-7","article-title":"The weighted sum method for multi-objective optimization: New insights","volume":"41","author":"Marler","year":"2010","journal-title":"Struct. Multidiscip. 
Optim."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1111\/insr.12469","article-title":"A critical review of LASSO and its derivatives for variable selection under dependence among covariates","volume":"90","year":"2022","journal-title":"Int. Stat. Rev."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Austin, W., Anderson, D., and Ghosh, J. (2018, January 22\u201327). Fully supervised non-negative matrix factorization for feature extraction. Proceedings of the IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain.","DOI":"10.1109\/IGARSS.2018.8518592"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"38820","DOI":"10.1109\/ACCESS.2018.2854232","article-title":"Joint linear regression and nonnegative matrix factorization based on self-organized graph for image clustering and classification","volume":"6","author":"Zhu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Haddock, J., Kassab, L., Li, S., Kryshchenko, A., Grotheer, R., Sizikova, E., Wang, C., Merkh, T., Madushani, R., and Ahn, M. (November, January 31). Semi-supervised Nonnegative Matrix Factorization for Document Classification. Proceedings of the 2021 55th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA.","DOI":"10.1109\/IEEECONF53345.2021.9723109"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, P., Tseng, C., Zheng, Y., Chew, J.A., Huang, L., Jarman, B., and Needell, D. (2022). Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents. Algorithms, 15.","DOI":"10.3390\/a15050136"},{"key":"ref_14","unstructured":"(2023, February 17). Rate My Professors. Available online: https:\/\/www.ratemyprofessors.com\/."},{"key":"ref_15","unstructured":"He, J. (2023, February 21). Big Data Set from RateMyProfessor.com for Professors\u2019 Teaching Evaluation. 
Available online: https:\/\/doi.org\/10.17632\/fvtfjyvw7d.2."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1137\/07069239X","article-title":"Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method","volume":"30","author":"Kim","year":"2008","journal-title":"SIAM J. Matrix Anal. Appl."},{"key":"ref_17","unstructured":"Lee, D., and Seung, H.S. (2000). Algorithms for non-negative matrix factorization. Adv. Neural Inf. Process. Syst., 13."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1109\/TSMC.1983.6313123","article-title":"Review of pseudoinverse control for use with kinematically redundant manipulators","volume":"2","author":"Klein","year":"1983","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat. Methods"},{"key":"ref_21","unstructured":"(2023, February 17). scipy.optimize.nnls. Available online: https:\/\/docs.scipy.org\/doc\/scipy\/reference\/generated\/scipy.optimize.nnls.html."},{"key":"ref_22","first-page":"393","article-title":"A fast non-negativity-constrained least squares algorithm","volume":"11","author":"Bro","year":"1997","journal-title":"J. Chemom. J. Chemom. Soc."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2848","DOI":"10.1137\/100799083","article-title":"Efficient parallel nonnegative least squares on multicore architectures","volume":"33","author":"Luo","year":"2011","journal-title":"SIAM J. Sci. 
Comput."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.csda.2006.11.006","article-title":"Algorithms and applications for approximate nonnegative matrix factorization","volume":"52","author":"Berry","year":"2007","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_25","unstructured":"Joachims, T. (2023, February 21). A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization. Technical Report, Carnegie-Mellon Univ Pittsburgh pa Dept of Computer Science. Available online: https:\/\/apps.dtic.mil\/sti\/citations\/ADA307731."},{"key":"ref_26","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_27","first-page":"18","article-title":"Student Consensus on RateMyProfessors Com","volume":"16","author":"Fritsch","year":"2011","journal-title":"Pract. Assess. Res. Eval."},{"key":"ref_28","first-page":"151","article-title":"What ratemyprofessors. com reveals about how and why students evaluate their professors: A glimpse into the student mind-set","volume":"23","author":"Hartman","year":"2013","journal-title":"Mark. Educ. Rev."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Moon, G.E., Ellis, J.A., Sukumaran-Rajam, A., Parthasarathy, S., and Sadayappan, P. (2020, January 6\u201310). ALO-NMF: Accelerated locality-optimized non-negative matrix factorization. 
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual.","DOI":"10.1145\/3394486.3403227"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/4\/187\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,30]],"date-time":"2023-03-30T05:17:05Z","timestamp":1680153425000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/4\/187"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,30]]},"references-count":29,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,4]]}},"alternative-id":["a16040187"],"URL":"https:\/\/doi.org\/10.3390\/a16040187","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,30]]}}}