IJITEB - International Journal of Information Technology and Business - Vol. 5 Issue. 1 (2022)

Software Defect Prediction Using AWEIG+ADACOST Bayesian Algorithm for Handling High Dimensional Data and Class Imbalance Problem

Joko Suntoro, Febrian Wahyu Christanto, Henny Indriyawati



Abstract

Software defect prediction is among the most important parts of software engineering. It is the process of predicting errors, failures, and faults in software. Researchers apply machine learning methods to predict software defects, including estimation, association, classification, clustering, and dataset analysis. The NASA Metrics Data Program (NASA MDP) datasets are among the software metric datasets that researchers use for this task. These datasets contain imbalanced classes and high dimensional data, both of which lower classification performance. In this research, class imbalance is handled with the AdaCost method and high dimensional data with the Average Weight Information Gain (AWEIG) method, with Naïve Bayes as the classification algorithm. The proposed method is named AWEIG + AdaCost Bayesian. In the experiment, the AWEIG + AdaCost Bayesian algorithm is compared with the plain Naïve Bayes algorithm. The results show that AWEIG + AdaCost Bayesian achieves a higher mean Area Under the Curve (AUC) than Naïve Bayes alone, with mean AUC values of 0.752 and 0.696, respectively.
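The abstract does not spell out how AWEIG selects features. One plausible reading, assumed here, is that each feature is weighted by its information gain and only features whose weight exceeds the average weight are kept. The sketch below illustrates that reading on already-discretized toy data. The function names, feature names, and data are hypothetical, and the AdaCost and Naïve Bayes stages are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_above_average(features, labels):
    """Assumed AWEIG-style rule: keep features whose gain exceeds the mean gain."""
    gains = {name: information_gain(col, labels)
             for name, col in features.items()}
    mean_gain = sum(gains.values()) / len(gains)
    selected = [name for name, gain in gains.items() if gain > mean_gain]
    return selected, gains


# Hypothetical toy data: six modules, two of them defective (imbalanced).
labels = ["defect", "defect", "clean", "clean", "clean", "clean"]
features = {
    "loc_bin": ["high", "high", "low", "low", "low", "low"],
    "comment_bin": ["a", "b", "a", "b", "a", "b"],
    "branch_bin": ["x", "x", "x", "x", "y", "y"],
}

selected, gains = select_above_average(features, labels)
print(selected)
print({name: round(gain, 3) for name, gain in gains.items()})
```

In a full pipeline under the same assumption, the selected columns would then be passed to a cost-sensitive boosting stage such as AdaCost with Naïve Bayes as the base learner.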







Publisher: Universitas Kristen Satya Wacana

Citations: 0

PISSN: 2655-9293

EISSN: 2655-495X

Date created (Crossref): 18-Mar-2025

Date issued: 30-Nov-2022

Date published: 30-Nov-2022

Date published online: 30-Nov-2022




License: http://creativecommons.org/licenses/by/4.0