Hypertension Detection via Tree-Based Stack Ensemble with SMOTE-Tomek Data Balance and XGBoost Meta-Learner
(Christopher Chukwufunaya Odiakaose, Fidelis Obukohwo Aghware, Margaret Dumebi Okpor, Andrew Okonji Eboka, Amaka Patience Binitie, Arnold Adimabua Ojugo, De Rosal Ignatius Moses Setiadi, Ayei Egu Ibor, Rita Erhovwo Ako, Victor Ochuko Geteloma, Eferhire Valentine Ugbotu, Tabitha Chukwudi Aghaunor)
DOI : 10.62411/faith.3048-3719-43
Volume: 1, Issue: 3, Citations: 0, 01-Dec-2024
Last.31-Jul-2025
Abstract:
High blood pressure (hypertension) contributes to a range of other ailments and can mask them, making them difficult to diagnose and to manage effectively with a targeted treatment plan. While some patients with elevated blood pressure manage their condition through lifestyle adjustment, monitoring, and follow-up treatment, others remain in self-denial, which leads to unreported instances, mishandled cases, and, increasingly, death. Even with the use of machine learning in medicine, two significant issues remain: (a) the datasets used to construct models often yield imperfect scores, and (b) complex deep learning models improve accuracy but often require large datasets. To address these issues, our study explores a tree-based stacking ensemble with Decision Tree, Adaptive Boosting, and Random Forest as base learners and XGBoost as the meta-learner. On the dataset retrieved from Kaggle, our stacking ensemble yields a prediction accuracy of 1.00 and an F1-score of 1.00, correctly classifying all instances of the test dataset.
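The stacking arrangement named in this abstract (Decision Tree, Adaptive Boosting, and Random Forest as base learners feeding a boosted-tree meta-learner) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data is synthetic (`make_classification`), all hyper-parameters are assumed, the SMOTE-Tomek balancing step from the title is left out, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the sketch depends on scikit-learn only. `xgboost.XGBClassifier` exposes a scikit-learn-compatible interface and could be passed as `final_estimator` instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced stand-in data (assumption: not the Kaggle dataset).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# Base learners named in the abstract; the settings here are illustrative guesses.
base_learners = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Meta-learner: gradient-boosted trees as a stand-in for XGBoost (assumption).
meta_learner = GradientBoostingClassifier(n_estimators=50, random_state=0)

# cv=5 trains the meta-learner on out-of-fold base-learner predictions.
stack = StackingClassifier(
    estimators=base_learners, final_estimator=meta_learner, cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"accuracy={accuracy:.3f}  f1={f1:.3f}")
```

Scores on this synthetic data say nothing about the figures reported in the abstract; the sketch only shows how the pieces fit together.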
Pilot Study on Enhanced Detection of Cues over Malicious Sites Using Data Balancing on the Random Forest Ensemble
(Margaret Dumebi Okpor, Fidelis Obukohwo Aghware, Maureen Ifeanyi Akazue, Andrew Okonji Eboka, Rita Erhovwo Ako, Arnold Adimabua Ojugo, Christopher Chukwufunaya Odiakaose, Amaka Patience Binitie, Victor Ochuko Geteloma, Patrick Ogholuwarami Ejeh)
DOI : 10.62411/faith.2024-14
Volume: 1, Issue: 2, Citations: 0, 07-Sep-2024
Last.31-Jul-2025
Abstract:
The digital revolution has rippled across society, with web content shared online to promote monetization and asset exchange, and with clients constantly seeking improved alternatives at lower cost to meet their value demands. From item upgrades to replacements, businesses adopt retention strategies to curb customer attrition. Smartphones have brought mobility, ease of access, and portability, which have in turn eased their adoption while exposing user devices to vulnerabilities, as they are quite susceptible to phishing. With some users more susceptible than others due to online presence and personality traits, studies have sought to reveal the lures/cues that adversaries exploit to enhance phishing success, and to classify web content as genuine or malicious. Our study explores the tree-based Random Forest to identify phishing cues via sentiment analysis on phishing website datasets scraped from user accounts on social network sites. The dataset is scraped via Python Google Scrapper and divided into train/test subsets to classify content as genuine or malicious, with data balancing and feature selection techniques. With Random Forest as the machine learning model of choice, the results show that the ensemble yields a prediction accuracy of 97 percent and an F1-score of 98.19%, correctly classifying 2089 instances and misclassifying 85 instances of the test dataset.
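As a small worked example, accuracy can be recomputed from the two counts quoted in this abstract. The sketch assumes that the 2089 correct and 85 incorrect classifications together make up the entire test set; the abstract does not state this explicitly. The F1-score cannot be recomputed the same way, because the per-class breakdown (true/false positives and negatives) is not given.

```python
# Counts quoted in the abstract.
correct = 2089
incorrect = 85

# Assumption: these two counts cover the whole test set.
total = correct + incorrect

# Accuracy is the fraction of test instances classified correctly.
accuracy = correct / total

print(f"test instances: {total}")    # → test instances: 2174
print(f"accuracy: {accuracy:.4f}")   # → accuracy: 0.9609
```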
Effects of Data Resampling on Predicting Customer Churn via a Comparative Tree-based Random Forest and XGBoost
(Rita Erhovwo Ako, Fidelis Obukohwo Aghware, Margaret Dumebi Okpor, Maureen Ifeanyi Akazue, Rume Elizabeth Yoro, Arnold Adimabua Ojugo, De Rosal Ignatius Moses Setiadi, Chris Chukwufunaya Odiakaose, Reuben Akporube Abere, Frances Uche Emordi, Victor Ochuko Geteloma, Patrick Ogholuwarami Ejeh)
DOI : 10.62411/jcta.10562
Volume: 2, Issue: 1, Citations: 0, 27-Jun-2024
Last.31-Jul-2025
Abstract:
Customer attrition has become the focus of many businesses today, since the online market space continues to offer customers various choices and alternatives in goods, services, and products for their money. Businesses must seek to improve value, meet customers' pressing demands and needs, enhance their customer-retention strategies, and monetize better. The study compares the effects of data resampling schemes on predicting customer churn for both Random Forest (RF) and XGBoost ensembles. The data resampling schemes used are: (a) default mode, (b) random under-sampling (RUS), (c) synthetic minority oversampling technique (SMOTE), and (d) SMOTE-edited nearest neighbor (SMOTEEN). Both tree-based ensembles were constructed and trained with chi-square feature selection to assess how well they performed. The results show that RF achieved F1 0.9898, Accuracy 0.9973, Precision 0.9457, and Recall 0.9698 for the default, RUS, SMOTE, and SMOTEEN resampling, respectively. XGBoost outperformed Random Forest with F1 0.9945, Accuracy 0.9984, Precision 0.9616, and Recall 0.9890 for the default, RUS, SMOTE, and SMOTEEN, respectively. Studies support that SMOTEEN resampling outperforms the other schemes, while XGBoost's enhanced performance is attributed to hyper-parameter tuning of its decision trees. Recency-frequency-monetization retention strategies were used and found to curb churn and improve monetization policies, placing business managers ahead of the customer churn curve.
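Of the resampling schemes listed in this abstract, random under-sampling (RUS) is the simplest to illustrate: majority-class rows are dropped at random until every class is as small as the minority class. The sketch below is a NumPy-only illustration under assumed conditions; the helper name `random_under_sample`, the synthetic data, and the 900/100 class split are hypothetical and are not taken from the study.

```python
import numpy as np

def random_under_sample(X, y, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()  # size of the smallest class
    kept = []
    for c in classes:
        idx = np.flatnonzero(y == c)  # row indices belonging to class c
        kept.append(rng.choice(idx, size=n_keep, replace=False))
    kept = np.sort(np.concatenate(kept))  # preserve the original row order
    return X[kept], y[kept]

# Hypothetical imbalanced data: 900 non-churners (0) and 100 churners (1).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4))
y = np.array([0] * 900 + [1] * 100)

X_bal, y_bal = random_under_sample(X, y)
print(np.bincount(y_bal))  # → [100 100]
```

Under-sampling discards data, which is one reason oversampling schemes such as SMOTE are often compared against it, as this study does.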
Enhancing the Random Forest Model via Synthetic Minority Oversampling Technique for Credit-Card Fraud Detection
(Fidelis Obukohwo Aghware, Arnold Adimabua Ojugo, Wilfred Adigwe, Christopher Chukwufumaya Odiakaose, Emma Obiajulu Ojei, Nwanze Chukwudi Ashioba, Margareth Dumebi Okpor, Victor Ochuko Geteloma)
DOI : 10.62411/jcta.10323
Volume: 1, Issue: 4, Citations: 0, 26-Mar-2024
Last.31-Jul-2025
Abstract:
Fraudsters increasingly exploit unauthorized credit card information for financial gain, targeting unsuspecting users, especially as financial institutions expand their services to semi-urban and rural areas. This has continued to ripple across society, causing huge financial losses and lowering user trust, with implications for all cardholders. Thus, banks and other financial institutions are today poised to implement fraud detection schemes. Five algorithms were trained with and without the Synthetic Minority Over-sampling Technique (SMOTE) to assess their performance: Random Forest (RF), K-Nearest Neighbors (KNN), Naïve Bayes (NB), Support Vector Machines (SVM), and Logistic Regression (LR). The methodology was implemented and tested through an API using Flask and Streamlit in Python. Before applying SMOTE, the RF classifier outperformed the others with an accuracy of 0.9802, while the accuracies for LR, KNN, NB, and SVM were 0.9219, 0.9435, 0.9508, and 0.9008, respectively. After applying SMOTE, RF achieved a prediction accuracy of 0.9919, whereas LR, KNN, NB, and SVM attained accuracies of 0.9805, 0.9210, 0.9125, and 0.8145, respectively. These results highlight the effectiveness of combining RF with SMOTE to enhance prediction accuracy in credit card fraud detection.
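The core idea of SMOTE, as used in this abstract, is to create each synthetic minority sample by interpolating between a real minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified NumPy-only illustration of that idea, not the implementation used in the study (which would more typically rely on a library such as imbalanced-learn). The function name `smote_sketch`, the data, and all parameters are hypothetical, and the sketch assumes at least two minority samples.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances between minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]  # k nearest neighbours per row

    base = rng.integers(0, n, size=n_new)  # random seed sample for each new row
    pick = neighbours[base, rng.integers(0, k, size=n_new)]  # one random neighbour each
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)

    # Each new point lies on the segment between a sample and its chosen neighbour.
    return X_min[base] + gap * (X_min[pick] - X_min[base])

# Hypothetical minority class: 20 fraud rows with 3 features.
rng = np.random.default_rng(1)
X_fraud = rng.normal(loc=2.0, size=(20, 3))

X_synth = smote_sketch(X_fraud, n_new=80)
X_fraud_balanced = np.vstack([X_fraud, X_synth])
print(X_fraud_balanced.shape)  # → (100, 3)
```

Because every synthetic row is a convex combination of two real rows, it stays inside the region spanned by the observed minority samples. In practice, resampling of this kind is applied to the training split only, so that test scores reflect the original class distribution.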