Abstract
Student performance prediction is an important research topic because it can help teachers prevent students from dropping out before final exams and identify students who need additional assistance. The objective of this study is to predict the difficulties that students will encounter in a subsequent digital design course session. We analyzed the data logged by a technology-enhanced learning (TEL) system called the digital electronics education and design suite (DEEDS) using machine learning algorithms. These algorithms included artificial neural networks (ANNs), support vector machines (SVMs), logistic regression, naïve Bayes classifiers and decision trees. The DEEDS system allows students to solve digital design exercises with different levels of difficulty while logging input data. The input variables of the current study were average time, total number of activities, average idle time, average number of keystrokes and total related activity for each exercise during individual sessions in the digital design course; the output variables were the students' grades for each session. We then trained the machine learning algorithms on the data from the previous session and tested them on the data from the upcoming session. We performed k-fold cross-validation and computed the receiver operating characteristic (ROC) and root mean square error (RMSE) metrics to evaluate the models' performance. The results show that ANNs and SVMs achieve higher accuracy than the other algorithms. ANNs and SVMs can easily be integrated into the TEL system; thus, we would expect instructors to report improved student performance during the subsequent session.
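The evaluation steps named in the abstract (k-fold splitting, the area under the ROC curve, and RMSE) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the contiguous fold layout, and the six labels and scores are hypothetical and chosen only to show how the metrics are computed.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true labels and predicted scores."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Hypothetical data: 1 = student had difficulty in the session, 0 = did not.
labels = [1, 0, 1, 0, 1, 0]
# Hypothetical model scores (estimated probability of difficulty).
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

auc = roc_auc(labels, scores)
error = rmse(labels, scores)
folds = list(k_fold_indices(len(labels), 3))
```

In a full pipeline, each (train, test) pair from `k_fold_indices` would be used to fit a classifier on the training indices and score it on the held-out indices, with the metrics averaged over the folds.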
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61572434, 91630206 and 61303097) and the National Key R&D Program of China (No. 2017YFB0701501).
Ethics declarations
Conflicts of interest
The authors have no conflicts of interest.
About this article
Cite this article
Hussain, M., Zhu, W., Zhang, W. et al. Using machine learning to predict student difficulties from learning session data. Artif Intell Rev 52, 381–407 (2019). https://doi.org/10.1007/s10462-018-9620-8
DOI: https://doi.org/10.1007/s10462-018-9620-8