Comparison of Classification and Regression Model Approaches on the Main Causes of Stroke with Symbolic Regression Feyn Qlattice

Authors

  • Purwono Purwono, Universitas Harapan Bangsa
  • Agung Budi Prasetio, Institut Teknologi Tanggerang Selatan
  • Burhanuddin bin Mohd Aboobaider, Universiti Teknikal Malaysia Melaka

DOI:

https://doi.org/10.59247/jahir.v1i2.87

Keywords:

stroke, classification, regression, qlattice, feyn

Abstract

Stroke is one of the deadliest diseases in the world. It is caused by damage to brain tissue resulting from a blockage in the cerebrovascular system, and proper treatment is essential to keep complications from worsening. The main triggering factors for stroke include hypertension, obesity, smoking, lack of physical activity, excessive alcohol consumption, diabetes, and high cholesterol levels. Advances in information technology make early disease prediction possible through AI and machine learning. The large volume of medical and health-service data available worldwide can be used to identify risk factors for various diseases, including stroke, and machine learning techniques can be applied to predict the causes of stroke. In this study, we apply the Feyn Qlattice model to stroke data and test both a classification model and a regression model. The results indicate that the classification model performs better, achieving an accuracy of 0.95. In contrast, the regression model yielded less satisfactory results, with R², MAE, and RMSE values considered inadequate. This conclusion is supported by the regression plot and the residual plot, both of which indicate suboptimal performance. Hence, further work to get the most out of the Feyn Qlattice regression model on datasets related to the causes of stroke is recommended.
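The four evaluation metrics named in the abstract (accuracy for the classification model; R², MAE, and RMSE for the regression model) can be illustrated with a minimal sketch. This is not the authors' code and does not use the Feyn library. The function names and the toy values are hypothetical and chosen only to show how each metric is computed, and the 0.5 decision threshold is an assumption.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: 1 = stroke, 0 = no stroke (not taken from the study).
labels = [0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.4, 0.3, 0.9]  # continuous model outputs

# Classification view: threshold the scores at 0.5 (assumed cut-off).
predicted = [1 if s >= 0.5 else 0 for s in scores]
acc = accuracy(labels, predicted)

# Regression view: compare the continuous outputs with the targets directly.
err_mae = mae(labels, scores)
err_rmse = rmse(labels, scores)
fit_r2 = r2(labels, scores)

print(f"accuracy={acc:.2f} MAE={err_mae:.2f} RMSE={err_rmse:.3f} R2={fit_r2:.3f}")
```

On a binary target such as stroke occurrence, the same outputs can score well on accuracy after thresholding while the regression metrics stay modest, which is one way to read the contrast the abstract reports.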

Published

2023-09-23

How to Cite

Purwono, P., Agung Budi Prasetio, & Burhanuddin bin Mohd Aboobaider. (2023). Comparison of Classification and Regression Model Approaches on the Main Causes of Stroke with Symbolic Regression Feyn Qlattice. Journal of Advanced Health Informatics Research, 1(2), 95–105. https://doi.org/10.59247/jahir.v1i2.87

Section

Articles
