Predictive model based on Naive Bayes through Supervised Machine Learning and student dropout, in public Technological Education centers in the La Libertad region

Authors

V. J. Polo Romero

DOI:

https://doi.org/10.17268/rev.cyt.2024.04.05

Keywords:

Naive Bayes algorithm, predictive model, student dropout, supervised algorithms, confusion matrix, machine learning

Abstract

This research presents a predictive model for identifying students at risk of dropping out of public technological higher education centers in the La Libertad region. The model is based on the Naive Bayes classification algorithm, a supervised machine learning technique, and its development followed the CRISP-DM methodology. The research is applied and descriptive, with a non-experimental, cross-sectional design. The initial data set was built from socioeconomic records, enrollment records and historical grades; after processing, the final data set was obtained. The implementation used Python in a Jupyter Notebook on Google Colaboratory. One part of the final data set was used to train and validate the model, and another part to evaluate its reliability. An instance of the algorithm was trained on the training portion of the final data set to obtain the predictive model. Once the model was generated, predictions were made on the test data set and compared with its expected results, which verified a degree of reliability of 93% for the model. The confusion matrix was used to visualize the number of patterns the model recognized correctly and incorrectly.
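The workflow summarized in the abstract (split the data, train a Naive Bayes classifier, predict on held-out records, tabulate a confusion matrix) can be illustrated with a minimal from-scratch sketch. Everything specific here is an assumption made for illustration: the three categorical features (`income`, `failed`, `works`), the synthetic records, the labelling rule, the 80/20 split and the Laplace smoothing constant `alpha` do not come from the article, and the article does not state which library or Naive Bayes variant was used.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies (categorical Naive Bayes)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values, alpha


def predict(model, row):
    """Return the label with the highest log-posterior for one record."""
    class_counts, value_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            k = len(feature_values[i])
            # Laplace-smoothed conditional probability of this feature value
            score += math.log((seen + alpha) / (count + alpha * k))
        scores[label] = score
    return max(scores, key=scores.get)


def confusion_matrix(actual, predicted):
    """Map each (actual, predicted) pair to the number of times it occurs."""
    return Counter(zip(actual, predicted))


def make_record():
    """Generate one hypothetical student record with a noisy label."""
    income = random.choice(["low", "mid", "high"])
    failed = random.choice(["none", "some", "many"])
    works = random.choice(["yes", "no"])
    at_risk = failed == "many" or (income == "low" and works == "yes")
    if random.random() < 0.1:  # 10% label noise
        at_risk = not at_risk
    return (income, failed, works), ("dropout" if at_risk else "stay")


random.seed(0)
records = [make_record() for _ in range(400)]
split = int(0.8 * len(records))  # assumed 80/20 train/test split
train, test = records[:split], records[split:]

model = train_naive_bayes([r for r, _ in train], [y for _, y in train])
y_true = [y for _, y in test]
y_pred = [predict(model, r) for r, _ in test]

matrix = confusion_matrix(y_true, y_pred)
accuracy = sum(a == p for a, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy on held-out records: {accuracy:.2f}")
for (a, p), n in sorted(matrix.items()):
    print(f"actual={a:8s} predicted={p:8s} count={n}")
```

The diagonal entries of the matrix (actual equal to predicted) are the correctly recognized patterns and the off-diagonal entries are the errors. The accuracy printed here describes only the synthetic data and says nothing about the 93% reported in the abstract.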

Published

2024-12-28

How to Cite

Polo Romero, V. J. (2024). Predictive model based on Naive Bayes through Supervised Machine Learning and student dropout, in public Technological Education centers in the La Libertad region. Revista CIENCIA Y TECNOLOGÍA, 20(4), 59-71. https://doi.org/10.17268/rev.cyt.2024.04.05

Issue

Vol. 20 No. 4 (2024)

Section

Original Articles