TY - JOUR
T1 - A learning analytics approach to identify students at risk of dropout
T2 - A case study with a technical distance education course
AU - Queiroga, Emanuel Marques
AU - Lopes, João Ladislau
AU - Kappel, Kristofer
AU - Aguiar, Marilton
AU - Araújo, Ricardo Matsumura
AU - Munoz, Roberto
AU - Villarroel, Rodolfo
AU - Cechinel, Cristian
N1 - Publisher Copyright: © 2020 by the authors. Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/6/1
Y1 - 2020/6/1
AB - Contemporary education is a vast field that is concerned with the performance of education systems. In a formal e-learning context, student dropout is considered one of the main problems and has received much attention from the learning analytics research community, which has reported several approaches to developing models for the early prediction of at-risk students. However, maximizing the results obtained by predictions is a considerable challenge. In this work, we developed a solution that uses only students' interactions with the virtual learning environment, and features derived from them, for the early prediction of at-risk students in a Brazilian distance technical high school course that is 103 weeks in duration. To maximize results, we developed an elitist genetic algorithm, based on Darwin's theory of natural selection, for hyperparameter tuning. With the proposed technique, we predicted at-risk students with an Area Under the Receiver Operating Characteristic Curve (AUROC) above 0.75 in the initial weeks of the course. The results demonstrate the viability of using interaction counts and derived features to generate prediction models in contexts where access to demographic data is restricted. Applying a genetic algorithm to tune classifier hyperparameters can increase classifier performance in comparison with other techniques.
KW - At-risk students
KW - Educational data mining
KW - Genetic algorithm
KW - Learning analytics
UR - http://www.scopus.com/inward/record.url?scp=85086929448&partnerID=8YFLogxK
U2 - 10.3390/app10113998
DO - 10.3390/app10113998
M3 - Article
AN - SCOPUS:85086929448
SN - 2076-3417
VL - 10
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 11
M1 - 3998
ER -