Exploration of Machine Learning Methods in Medical Disease Prediction: A Systematic Literature Review

Authors

  • Ria Suci Nurhalizah, Universitas Harapan Bangsa
  • Hadi Jayusman, Universitas Harapan Bangsa
  • Purwatiningsih, Universitas Harapan Bangsa

DOI:

https://doi.org/10.59247/jahir.v1i3.174

Keywords:

Machine Learning, Medical, Diagnose, Prediction, Disease

Abstract

A systematic review of the literature on Machine Learning methods shows successful applications in disease diagnosis, disease prediction, and treatment planning. This review covers only four groups of methods: Classification, consisting of Support Vector Machine (SVM), Naïve Bayes, Nearest Neighbors, and Neural Network (NN); Regression, consisting of Decision Tree, Linear Regression, Random Forest, Ensemble Methods, and Neural Network (NN); Clustering, consisting of K-Means Clustering, Artificial Neural Network (ANN), Gaussian Mixture, and Neural Network (NN); and Dimensionality Reduction, consisting of Principal Component Analysis (PCA). In healthcare, sustainability, ethics, and data security are key factors. This research uses a Systematic Literature Review (SLR) to explore Machine Learning methods in the medical context, examining 300 papers and selecting 57 for discussion of machine learning methods in medical disease prediction. It recommends Support Vector Machine, Random Forest, and Neural Networks as effective methods. Method selection should be tailored to the dataset characteristics and disease prediction goals, while prioritizing
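
The point that method selection should follow the dataset can be illustrated with a small sketch. It compares two of the classification methods named in the abstract, Nearest Neighbors and Naïve Bayes, by k-fold cross-validation. Everything specific here is an assumption made for illustration and does not come from the reviewed papers: the synthetic two-feature dataset, the function names, k = 5, and the five folds. It uses only the Python standard library.

```python
import math
import random
from statistics import mean, pvariance

def make_toy_data(n_per_class=60, seed=0):
    """Synthetic two-feature, two-class data (illustrative only, not clinical data)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            data.append(([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label))
    rng.shuffle(data)
    return data

def knn_predict(train, x, k=5):
    """k-nearest neighbors: majority vote among the k closest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def gaussian_nb_predict(train, x):
    """Gaussian Naive Bayes: pick the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label in {lab for _, lab in train}:
        rows = [feats for feats, lab in train if lab == label]
        score = math.log(len(rows) / len(train))  # log prior
        for j, value in enumerate(x):
            column = [r[j] for r in rows]
            mu = mean(column)
            var = pvariance(column, mu) + 1e-9  # small floor avoids division by zero
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_val_accuracy(predict, data, folds=5):
    """Mean accuracy over the folds; each fold is held out once for testing."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        correct = sum(predict(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)

data = make_toy_data()
results = {
    "k-nearest neighbors": cross_val_accuracy(knn_predict, data),
    "Gaussian Naive Bayes": cross_val_accuracy(gaussian_nb_predict, data),
}
for name, acc in results.items():
    print(f"{name}: mean accuracy = {acc:.3f}")
```

On real medical data the ranking of methods may differ from what this toy data shows, which is the reason to run such a comparison per dataset rather than fix a method in advance.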

Published

2024-01-31

How to Cite

Ria Suci Nurhalizah, Hadi Jayusman, & Purwatiningsih. (2024). Exploration of Machine Learning Methods in Medical Disease Prediction: A Systematic Literature Review. Journal of Advanced Health Informatics Research, 1(3), 157–174. https://doi.org/10.59247/jahir.v1i3.174

Section

Articles