Comparison of two analytical options in academic performance prediction: classification trees versus logistic regression

Authors

Bacallao Gallestey J, Fernández Regalado R, Alba Zayas LE

Keywords:

classification and regression trees, logistic regression, academic performance, prediction studies

Abstract

Classification and regression trees and logistic regression are two statistical techniques frequently used in biomedical research and available in many statistical software packages. However, their advantages and disadvantages, and the conditions under which each should be applied, are not well known. The purpose of this article is to compare, from a didactic perspective, these two analytical resources in the context of prediction studies. The paper emphasizes the relative advantages and disadvantages of the two techniques, as well as their value and complementary uses. Technical language is kept to a minimum in favor of intuitive, practical arguments. Both options are illustrated with the results of a study on predicting the academic performance of first-year medical students at the Faculty of Medical Sciences “Victoria de Girón”, University of Medical Sciences of Havana.

Published

2026-02-21

How to Cite

Bacallao Gallestey J, Fernández Regalado R, Alba Zayas LE. Comparison of two analytical options in academic performance prediction: classification trees versus logistic regression. Rev Cubana Inv Bioméd [Internet]. 2026 Feb 21 [cited 2026 Feb 28];45:e3811. Available from: https://revibiomedica.sld.cu/index.php/ibi/article/view/3811

Section

REFLECTIVE ESSAY