Using early clickstream data to identify at-risk students in higher education: an LSTM-based approach

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Abstract

Identifying students who require additional support is a challenging task for educators at the higher education level. One of the main reasons for this difficulty is that traditional assessments of student performance, such as a final exam or project, occur at the end of the course when it is often too late to implement corrective measures that could prevent a student from failing. Learning management systems (LMS) help students engage with educational content and provide a continuous flow of data that documents student-content interactions throughout the course. Although the data collected may be incomplete and noisy, it can still provide valuable insights into student behaviour and performance.

This work uses Moodle logs to create course-agnostic Long Short-Term Memory (LSTM)-based classifiers for the early identification of students at risk of failing a course. The classifiers take as input day-wise sequences of the number of clicks made by each student in various activity types. By analysing data collected up to the 50th day of each course, our LSTM-based classifiers achieved an average area under the receiver operating characteristic curve (AUC) of 0.69 while identifying close to 28% of the at-risk students. Furthermore, with minor changes to the model's hyperparameters, we created a classifier that achieved a slightly lower AUC (0.67) but identified more than 50% of the at-risk students. Models trained using only the first 25 days of click sequences achieved similar recall scores, although their overall accuracy and AUC scores were lower.
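As a rough illustration of the input representation described above, the following sketch turns raw log events into day-wise click-count sequences, one row per day and one column per activity type. It is not the authors' code: the activity type names, the `(student_id, day, activity_type)` event layout, and the function name `build_click_sequences` are all assumptions made for the example, and the abstract does not specify how Moodle log entries were mapped to activity types.

```python
from collections import defaultdict

# Hypothetical activity types; the abstract does not list the ones used.
ACTIVITY_TYPES = ["forum", "quiz", "resource"]


def build_click_sequences(events, n_days, activity_types=ACTIVITY_TYPES):
    """Return {student_id: [[clicks per activity type] for each day]}.

    events: iterable of (student_id, day, activity_type) tuples, where day
    is counted from 1 at the start of the course (an assumed layout).
    Events after n_days or with an unknown activity type are ignored.
    """
    column = {name: i for i, name in enumerate(activity_types)}
    sequences = defaultdict(
        lambda: [[0] * len(activity_types) for _ in range(n_days)]
    )
    for student_id, day, activity in events:
        if 1 <= day <= n_days and activity in column:
            sequences[student_id][day - 1][column[activity]] += 1
    return dict(sequences)


# Toy log: s1 clicks on day 1 and day 3, s2 clicks on day 2 and day 60.
events = [
    ("s1", 1, "forum"),
    ("s1", 1, "quiz"),
    ("s1", 1, "quiz"),
    ("s1", 3, "resource"),
    ("s2", 2, "forum"),
    ("s2", 60, "quiz"),  # outside the observation window, so dropped
]

sequences = build_click_sequences(events, n_days=3)
for student_id, days in sequences.items():
    print(student_id, days)
```

Setting `n_days` to 50 or 25 would correspond to the two observation windows mentioned in the abstract. Each per-student matrix has shape (days, activity types) and could be fed to a sequence model such as an LSTM. Students with no events inside the window do not appear in the result of this sketch and would need to be added as all-zero sequences.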

These results suggest that our approach can help educators identify struggling students and provide them with timely feedback to prevent avoidable failures. Future research could explore the generalisation of this approach to more courses and recognise the contribution of each activity type to the final prediction.
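For readers less familiar with the two metrics reported in the abstract, the sketch below computes AUC and recall on a toy example. It is only an illustration with made-up labels and scores, not the authors' evaluation code. The helper names `roc_auc` and `recall` are hypothetical, and the AUC is computed with the standard pairwise (rank-based) definition. The example varies the decision threshold to show that recall can change while AUC stays fixed. The abstract itself attributes its recall difference to hyperparameter changes, not to a threshold change.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall(y_true, y_pred):
    """Fraction of truly at-risk students (label 1) that were flagged."""
    positives = sum(1 for t in y_true if t == 1)
    if positives == 0:
        return 0.0
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives


# Made-up data: 1 = at risk, scores = predicted probability of being at risk.
y_true = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.2, 0.1, 0.3]

auc = roc_auc(y_true, scores)
strict = [1 if s >= 0.5 else 0 for s in scores]   # flags few students
lenient = [1 if s >= 0.3 else 0 for s in scores]  # flags more students

print("AUC:", auc)
print("recall at 0.5:", recall(y_true, strict))
print("recall at 0.3:", recall(y_true, lenient))
```

A lower threshold flags more students and raises recall, usually at the cost of more false alarms, which is the kind of trade-off an educator would weigh when choosing between classifiers like the two described above.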
Original language: English
Title of host publication: 16th International Conference on Education and New Learning Technologies 1-3 July, 2024 Palma, Spain
Editors: Luis Gómez Chova, Chelo González Martínez, Joanna Lees
Place of Publication: Valencia, Spain
Publisher: IATED Academy
Pages: 6216-6225
Number of pages: 10
ISBN (Print): 978-84-09-62938-1
Publication status: Published - Jul 2024
Event: 16th International Conference on Education and New Learning Technologies - Palma Convention Centre, Palma, Spain
Duration: 1 Jul 2024 to 3 Jul 2024
Conference number: 16
https://iated.org/edulearn/

Publication series

Name: EDULEARN24 Proceedings
Publisher: IATED Academy
Number: 2024
ISSN (Print): 2340-1117

Conference

Conference: 16th International Conference on Education and New Learning Technologies
Abbreviated title: EDULEARN24
Country/Territory: Spain
City: Palma
Period: 1/07/24 to 3/07/24

Keywords

  • Student Performance
  • Early Prediction
  • Learning Management Systems
  • Machine Learning
