TY - JOUR
T1 - Accurate, timely, and portable
T2 - Course-agnostic early prediction of student performance from LMS logs
AU - Santos, Ricardo Miguel
AU - Henriques, Roberto
N1 - Santos, R. M., & Henriques, R. (2023). Accurate, timely, and portable: Course-agnostic early prediction of student performance from LMS logs. Computers and Education: Artificial Intelligence, 5, 1-15. [100175]. https://doi.org/10.1016/j.caeai.2023.100175 --- Statements on open data and ethics: The data for this study is confidential and not available for open access. All student data was anonymized and treated in compliance with the General Data Protection Regulation (GDPR) and the institution’s ethical guidelines. Moreover, the project has approval from the university’s Ethics Committee and Institutional Review Board with Code DSCI2022-9-227363.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - In higher education, providing personalized feedback and support to students is a significant challenge. Early warning systems can help by identifying both at-risk and high-performing students, allowing for timely interventions and enhanced learning opportunities. In our study, we used a year's worth of data from an information management school to build predictive models for two binary classification problems: identifying at-risk students and high-performing students. We employed traditional machine learning classifiers and long short-term memory (LSTM) units, testing them at various stages of course completion. The best performance was achieved using all course data, with an AUC of 0.756 for at-risk students and 78.2% accuracy for high-performing students using Random Forest and Extremely Randomized Trees, respectively. We found that early prediction was possible as early as 25% course completion. Although LSTM showed inferior performance, it offered practical advantages for early prediction. Our findings suggest that static LMS logs can be reliable indicators of student success early in a course, and a course-agnostic time-dependent representation of the number of clicks can offer a worthwhile tradeoff between predictive performance and simplicity in implementation in some instances. These findings have important implications as they suggest the potential for automated early warning systems that can help educators identify students of interest and allocate resources where they are most needed. However, implementing these systems in real time requires clear protocols and responsible policies. Further research should explore the generalizability of findings across different contexts and continuously evaluate their real-world effectiveness.
KW - Learning management systems
KW - Higher education
KW - Machine learning
KW - Early prediction
KW - Student performance
UR - http://www.scopus.com/inward/record.url?scp=85174210250&partnerID=8YFLogxK
U2 - 10.1016/j.caeai.2023.100175
DO - 10.1016/j.caeai.2023.100175
M3 - Article
SN - 2666-920X
VL - 5
SP - 1
EP - 15
JO - Computers and Education: Artificial Intelligence
JF - Computers and Education: Artificial Intelligence
M1 - 100175
ER -