SciRepID - Scientific Publication Search

Analisis dan Penerapan Algoritma Naïve Bayes Untuk Klasifikasi Penyakit Diabetes Melitus

M Daffa Adrian; Pareza Alam Jusia; Rudolf Sinaga; Azzahra Raihana Adriansyah; Mutammimah Mutammimah

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Diabetes mellitus is a group of metabolic diseases characterized by hyperglycemia resulting from defects in insulin secretion, insulin action, or both. Hyperglycemia, an increase in blood glucose beyond normal limits, is characteristic of several conditions, most notably diabetes mellitus, which is currently a global health threat. Classification is a data mining technique that can help predict the type of diabetes; this study uses the Naïve Bayes algorithm. Testing was carried out with five evaluation setups: RapidMiner with three options (use training set, 5-fold cross-validation, and 10-fold cross-validation), Microsoft Excel, and Python. The Excel setup gave the highest accuracy, 89.00%, and the Python setup the lowest, 86.36%. The authors conclude that Naïve Bayes is effective both in its calculations and in its final results, and that the test can serve as a basis for diabetes mellitus classification, given that accuracy exceeds 85% in every setup.
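The 5-fold and 10-fold cross-validation options named in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how the folds were formed, so the contiguous, unshuffled split and the names `k_fold_indices` and `cross_val_accuracy` are assumptions made for illustration, not the authors' procedure.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k):
    """Average test accuracy over k folds; fit and predict are caller-supplied."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

In practice the records are usually shuffled, or stratified by class, before splitting; that step is omitted here to keep the sketch short.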

https://doi.org/10.61132/prosemnasproit.v2i2.114


Analisis Kelayakan Pemberian Kredit dengan Algoritma Naïve Bayes untuk Antisipasi Risiko Kredit Bermasalah Pada BPR Ukabima Lestari Cabang Jambi

Anggi Saputra; Setiawan Assegaff; Benni Purnama

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

This study analyzes creditworthiness assessment and predicts non-performing loan (NPL) risk using the Naïve Bayes algorithm at BPR Ukabima Lestari, Jambi Branch. A quantitative data mining approach with probabilistic classification is applied. The dataset includes borrower attributes such as age, occupation, income, loan amount, tenor, collateral, and repayment history. Research stages comprise data preprocessing, model development, and performance evaluation using accuracy, precision, recall, and F1-score implemented in RapidMiner. The results indicate that the Naïve Bayes model achieves 99.58% accuracy, demonstrating strong capability to predict potential problem loans accurately and efficiently, supporting data-driven credit decisions and strengthening credit risk management in microbanking institutions.
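The four evaluation metrics listed in the abstract above have standard definitions in terms of confusion-matrix counts. The sketch below computes them for a binary case; the function name `binary_metrics` and the "bad"/"good" loan labels are hypothetical, and the study itself computed these metrics in RapidMiner.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: "bad" marks a problem loan and is treated as the positive class.
actual = ["bad", "bad", "good", "good", "good"]
predicted = ["bad", "good", "good", "good", "bad"]
print(binary_metrics(actual, predicted, positive="bad"))
```

Accuracy alone can look high when problem loans are rare, which is one reason precision, recall, and F1 are usually reported next to it.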

https://doi.org/10.61132/prosemnasproit.v2i2.96


Evaluasi Kinerja Machine Learning pada Klasifikasi Penyakit Jantung Menggunakan Teknik Penyeimbangan Data

Eni Rohaini; Gunardi Gunardi; Nurhayati Nurhayati; Jasmir Jasmir; Zahra Prisdian Tiararosa

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Imbalanced data remains a significant issue in heart disease classification using machine learning, because it tends to make models favor the majority class while ignoring minority classes with high clinical value. This can reduce accuracy and the model's ability to detect disease cases. This study therefore assesses the effectiveness of two oversampling techniques, Random Oversampling and the Synthetic Minority Oversampling Technique (SMOTE), in improving the performance of the K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF) algorithms. The dataset comes from Kaggle and consists of 918 records with 12 attributes representing patient information related to heart disease prediction. The research stages include data preprocessing, baseline model testing, and re-evaluation using the two oversampling methods. Experimental results show that oversampling can improve the performance of all three algorithms. KNN achieved its best results with SMOTE, with an accuracy of 72.98% and an F1-score of 75.39%. For Naive Bayes, both oversampling techniques produced relatively stable performance, with the highest F1-score of 73.56% using SMOTE. Random Forest performed best when combined with Random Oversampling, with an accuracy of 79.19% and an F1-score of 81.51%. These findings confirm that the success of data balancing techniques is strongly influenced by the characteristics of the classification algorithm used, and they offer practical guidance for handling imbalanced data in health research.
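Random Oversampling, one of the two balancing techniques compared in the abstract above, duplicates minority-class records until every class is as large as the largest one. The following is a minimal sketch of that idea; the function name `random_oversample` and the fixed seed are assumptions for illustration, and the abstract does not describe the authors' implementation.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Indices of the original rows that belong to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - count):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the test data. SMOTE differs in that it interpolates new points between minority neighbors instead of copying existing rows.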

https://doi.org/10.61132/prosemnasproit.v2i2.59


Analisis Machine Learning pada Data Netflix Shows untuk Mengklasifikasikan Tren Genre dan Karakteristik Film

Claudia K. Hamsi; I Wayan Sudiarsa; Vinsensia P.K Abu; Sarling C. Dhai; Maria A. Serero

Mars: Jurnal Teknik Mesin, Industri, Elektro Dan Ilmu Komputer • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The rapid development of digital streaming platforms such as Netflix has generated a large volume of content data with diverse characteristics, which calls for effective analytical methods to understand emerging patterns and trends. This study aims to classify Netflix content into two main categories, movies and television shows, and to analyze genre trends and content characteristics using a data mining approach with the Naive Bayes algorithm. The dataset is the Netflix Shows dataset, consisting of 8,809 content entries; the primary features analyzed are genre, rating, and country of production. The research process begins with data exploration and preprocessing, including data cleaning, handling missing values, and transforming categorical features. The dataset is then divided into training and testing sets to build and evaluate the Naive Bayes classification model. Model performance is evaluated using accuracy, precision, recall, and F1-score. The reported results show the Naive Bayes model classifying content into Movie and TV Show categories with accuracy, precision, recall, and F1-score all at 100%. The confusion matrix shows no misclassifications, which the authors attribute to genre, rating, and country of production providing a very clear separation between the two classes. The results further reveal distinct differences in characteristics between movies and television shows based on genre and production attributes.
Therefore, this study is expected to contribute to the development of content recommendation systems and strategic content management within the streaming industry.

https://doi.org/10.61132/mars.v3i6.1389


Analisis Sentimen Kebijakan Pemberlakuan Cukai pada Minuman Berpemanis dalam Kemasan Menggunakan Metode Multinomial Naive Bayes

Firdaus, Muhammad; Rosyidah, Ulya Anisatur; Handayani, Luluk

Router : Jurnal Teknik Informatika dan Terapan • 2025 • Asosiasi Profesi Telekomunikasi dan Informatika Indonesia

Sugar consumption in Indonesia remains high, with diabetes affecting 20.4 million people. This condition has prompted the government to introduce an excise policy on Minuman Berpemanis Dalam Kemasan (MBDK) to reduce sugar intake. Social media, particularly the X platform, serves as a medium for the public to express their opinions regarding this policy. This study aims to analyze public sentiment toward the MBDK excise policy using a lexicon-based approach for data labeling and the Multinomial Naive Bayes algorithm with unigram and bigram feature extraction. The initial results show that the highest performance was achieved using 5-Fold Cross Validation, with an average accuracy of 83%, precision of 84%, recall of 75%, and an F1-Score of 77%. After applying data balancing using Stratified Cross Validation combined with Borderline-SMOTE and limiting the features to the 700 most frequent terms, the model’s performance improved. The best results were obtained with 10-Fold Cross Validation, achieving 86% accuracy, 84% precision, 83% recall, and an F1-Score of 83%. These findings indicate that the Multinomial Naive Bayes model can effectively classify public sentiment regarding the MBDK excise policy after the data balancing process.
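Multinomial Naive Bayes over unigram and bigram features, as described in the abstract above, can be sketched in a few lines of Python. The whitespace tokenizer, the Laplace smoothing value `alpha=1.0`, the function names, and the toy English sentences are all assumptions for illustration; the abstract does not give the authors' preprocessing or parameters.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens):
    """Return the unigrams plus all adjacent-word bigrams."""
    return tokens + [a + " " + b for a, b in zip(tokens, tokens[1:])]


def train_mnb(docs, labels, alpha=1.0):
    """Count n-gram occurrences per class for a multinomial Naive Bayes model."""
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        feats = ngrams(text.lower().split())
        term_counts[label].update(feats)
        vocab.update(feats)
    return {"class_docs": Counter(labels), "term_counts": term_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(docs)}


def predict_mnb(model, text):
    """Return the class with the highest log posterior; unseen n-grams are skipped."""
    feats = [f for f in ngrams(text.lower().split()) if f in model["vocab"]]
    alpha, vocab_size = model["alpha"], len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_docs"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for f in feats:
            # Laplace-smoothed log likelihood of the n-gram given the class.
            score += math.log((counts[f] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training sentences (not data from the study).
docs = ["tax good for health", "agree tax good", "tax makes expensive", "reject tax expensive"]
labels = ["positive", "positive", "negative", "negative"]
model = train_mnb(docs, labels)
print(predict_mnb(model, "tax good"))
```

Bigrams let the model tell "tax good" from "tax expensive" even though the unigram "tax" appears equally often in both classes.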

https://doi.org/10.62951/router.v3i4.704


Sistem Pendukung Keputusan Penentuan Program Keluarga Harapan di Dinas Sosial Kabupaten Sumba Barat Menggunakan Metode Naïve Bayes

Selvinus Dakku; Vinsensius Aprila Kore Dima; Diana Reby Sabawaly

Router : Jurnal Teknik Informatika dan Terapan • 2025 • Asosiasi Profesi Telekomunikasi dan Informatika Indonesia

The Family Hope Program (Program Keluarga Harapan, PKH) is a conditional social assistance program provided by the government to improve the quality of life of underprivileged families through support in education, health, and social welfare. In practice, the process of determining prospective PKH recipients at the West Sumba Regency Social Service often runs into obstacles, especially regarding objectivity, targeting accuracy, and the management of complex data. A decision support system (DSS) is therefore needed to help the agency select prospective recipients more effectively, efficiently, and accurately. This study proposes applying the Naive Bayes method in a DSS for determining PKH recipients. Naive Bayes was chosen because it classifies data based on probability and can handle large volumes of data with good accuracy. The classification criteria include household income level, number of dependents, housing condition, children's education, and the health of family members. The research process includes needs analysis, system design, data collection, application of the Naive Bayes algorithm, and system testing. The findings show that the Naive Bayes based DSS can recommend PKH recipients with better accuracy than manual methods. The system also improves transparency, fairness, and speed in recipient selection. With this system, the distribution of PKH in West Sumba Regency is expected to become more orderly, balanced, and well targeted, in line with the goals of the government program.

https://doi.org/10.62951/router.v3i3.641


Penerapan Algoritma Multinomial Naïve Bayes dengan Penyeimbangan Data SMOTE pada Klasifikasi Sentimen Pengguna Shopee terhadap Produk Facial Wash Kahf

Farendika Rezzi

Uranus: Jurnal Ilmiah Teknik Elektro, Sains dan Informatika • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The rapid growth of e-commerce platforms has significantly transformed the way consumers share and access product feedback. One of the widely used platforms in Indonesia is Shopee, where customers actively provide reviews of various products, including local skincare brands such as Kahf facial wash. Customer reviews on e-commerce platforms contain valuable information that can be analyzed to understand consumer opinions and preferences. Sentiment analysis, as a branch of natural language processing, enables the classification of textual data into categories such as positive, negative, or neutral. This study aims to classify Shopee user sentiments regarding Kahf facial wash products by implementing the Multinomial Naïve Bayes algorithm, a well-known probabilistic classifier suitable for text categorization. The research methodology consisted of several preprocessing stages, including data cleansing, case folding, tokenizing, stopword removal, and stemming, to prepare raw review texts for further analysis. For feature representation, the Term Frequency–Inverse Document Frequency (TF-IDF) method was applied to capture the importance of words across documents. To evaluate the classification performance, K-Fold cross-validation was employed with K values of 4, 5, 6, and 10 to ensure model reliability and robustness. Considering the issue of imbalanced datasets in user-generated reviews, the Synthetic Minority Over-sampling Technique (SMOTE) was utilized to balance the distribution of sentiment classes. Based on the confusion matrix, the Multinomial Naïve Bayes algorithm demonstrated effective performance in classifying sentiments, achieving satisfactory levels of accuracy, precision, and recall across different folds. These results indicate that the algorithm is capable of handling sentiment analysis tasks for local product reviews effectively. 
The findings of this study are expected to provide meaningful insights for businesses in understanding consumer perceptions, thereby supporting decision-making processes in product development, marketing strategies, and customer engagement for local brands.
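SMOTE, which the abstract above uses to balance the sentiment classes, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below is a simplified illustration of that interpolation step on numeric feature vectors. The function name `smote_like`, the neighbor count `k=2`, and the toy points are assumptions; it is not a full SMOTE implementation and not the authors' code.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points between minority samples and their neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest other minority points by Euclidean distance.
        neighbors = sorted((p for p in minority if p is not x),
                           key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy minority-class vectors (for example, TF-IDF rows reduced to two features).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(smote_like(minority, n_new=3))
```

The sketch needs at least two minority samples, and like any oversampling step it is usually applied inside each training fold, not before the K-Fold split.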

https://doi.org/10.61132/uranus.v3i3.1022


Analisis Sentimen Ulasan pada Google Review di Sebuah Penginapan Menggunakan Algoritma Naïve Bayes: Studi Kasus: Grand Jatra Hotel Pekanbaru

Muhammad Azlan; Elvi Rahmi

Neptunus: Jurnal Ilmu Komputer Dan Teknologi Informasi • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to analyze the sentiment of customer reviews of the Grand Jatra Hotel Pekanbaru on the Google Review platform using the Naïve Bayes algorithm. Social media and online review platforms are increasingly becoming the primary source of information for potential customers in making purchasing decisions, particularly in the hospitality sector. Therefore, sentiment analysis of customer reviews is crucial for understanding consumer perceptions and providing strategic input for hotels in improving service quality. The research data was collected using web scraping techniques to obtain publicly available customer reviews. The obtained data was then processed through text preprocessing stages including case folding, tokenizing, normalization, stopword removal, and stemming. The Term Frequency-Inverse Document Frequency (TF-IDF) method was then used to weight each word, so that more relevant words have a greater influence in the classification process. The sentiment classification process was carried out into two main categories, namely positive and negative. The Naïve Bayes model was trained using training data and then tested with test data to measure the algorithm's performance in classifying sentiment. The evaluation results show that the model built is able to achieve an accuracy level of 98%, with a precision value of 97% and a recall of 100% in the positive class, and 92% in the negative class. These findings confirm that the Naïve Bayes algorithm can be effectively used in analyzing customer sentiment towards hotel services and facilities. Practically, the results of this study are expected to provide insight for the management of Grand Jatra Hotel Pekanbaru in understanding customer perceptions, identifying service strengths and weaknesses, and formulating more targeted marketing strategies. In addition, this study can also be a reference for the development of similar studies in the hotel industry and other service sectors.
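The TF-IDF weighting mentioned in the abstract above gives a word a higher weight when it is frequent in one review but rare across the collection. A minimal sketch using the plain textbook formula, tf = count / document length and idf = log(N / df), is shown below. The function name `tf_idf` and the toy tokens are assumptions, and libraries often use smoothed variants of this formula, so the exact weights in the study may differ.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for doc in tokenized_docs:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in tokenized_docs:
        term_freq = Counter(doc)
        weights.append({term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                        for term, count in term_freq.items()})
    return weights


# Toy, already-preprocessed reviews (made up for illustration).
reviews = [["room", "clean", "clean"], ["room", "noisy"]]
for row in tf_idf(reviews):
    print(row)
```

Here "room" appears in every review and so receives a weight of zero, while "clean" and "noisy" carry the discriminating weight.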

https://doi.org/10.61132/neptunus.v3i3.1003


Model Perawatan Generator di PLTU Paiton menggunakan Metode Naive Bayes

Bambang Minto Basuki

Jupiter: Publikasi Ilmu Keteknikan Industri, Teknik Elektro dan Informatika • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

The Paiton Steam Power Plant (PLTU) is one of the main sources of electrical energy in East Java, which plays a vital role in maintaining a sustainable electricity supply. The reliability of generator units is a key element in maintaining stable energy distribution. However, the high frequency of sudden generator failures poses serious challenges, such as increased downtime and increased maintenance costs. To address these challenges, this study aims to design a generator maintenance prediction model based on the Naive Bayes algorithm with a predictive maintenance approach. This study uses historical maintenance data and key sensor parameters such as temperature, oil pressure, and vibration as input. The data is analyzed through several stages, namely data preprocessing, selection of relevant features, and labeling generator conditions into three categories: Normal, Warning, and Critical. The Naive Bayes model is trained to classify the data probabilistically to generate predictions of future generator conditions. Model evaluation using accuracy metrics and a confusion matrix shows that the model successfully achieved an accuracy rate of 89% and was able to provide early warnings of potential failures up to 3 days before failure occurs. The implementation of this system is expected to support the shift in maintenance strategies from reactive and scheduled systems to data-driven predictive systems. Implementing failure predictions allows the technical team at the Paiton PLTU to conduct planned maintenance, avoid sudden disruptions, and extend equipment lifespan. Thus, this model has the potential to reduce operational downtime by up to 25%, while providing significant savings in operational and logistics costs. This research also shows that integrating machine learning technology into energy facility management can improve the efficiency and resilience of the overall electric power system.
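Classifying continuous sensor readings into Normal, Warning, and Critical, as in the abstract above, is commonly done with the Gaussian variant of Naive Bayes, which models each feature per class as a normal distribution. The abstract does not state which variant was used, so the Gaussian likelihood, the function names, and the sensor values below are assumptions chosen purely for illustration.

```python
import math
from collections import defaultdict


def fit_gnb(X, y):
    """Estimate the class prior and per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gnb(model, row):
    """Return the class with the highest Gaussian log posterior for one reading."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(*model[label]))


# Hypothetical readings: [temperature, vibration]; not data from the study.
X = [[60, 1.0], [62, 1.2], [80, 3.0], [82, 3.2], [100, 6.0], [102, 6.3]]
y = ["Normal", "Normal", "Warning", "Warning", "Critical", "Critical"]
model = fit_gnb(X, y)
print(predict_gnb(model, [81, 3.1]))
```

A prediction of Warning or Critical on fresh readings is what would trigger the kind of early maintenance alert the abstract describes.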

https://doi.org/10.61132/jupiter.v3i4.1002


Analisis Perbandingan Algoritma Random Forest dan Algoritma Naive Bayes untuk Memprediksi Penyakit Paru-Paru di Indonesia

Eka Wulansari Fidayanthie; Asep Sayfulloh; Mardiana Rafa Alzena; Nilam Kurnia Sari

Saturnus: Jurnal Teknologi dan Sistem Informasi • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Lungs are vital organs in the human respiratory system, responsible for fulfilling the body's oxygen needs. If the lungs experience health problems, it can have adverse effects on the human respiratory system. Common causes of lung diseases are usually due to inhaling air contaminated by dust, smoke, viruses, and bacteria. This study aims to compare the performance of two classification algorithms, namely Random Forest and Naive Bayes, in predicting lung diseases. The data used was obtained from the Kaggle website and processed using RapidMiner software. The attributes involved include smoking habits, pre-existing conditions, staying up late, exercise activities, age, and outcomes. Based on the test results, the Random Forest algorithm demonstrated the best performance with an accuracy of 93%, while the Naive Bayes algorithm achieved an accuracy of 87%. These findings indicate that the Random Forest algorithm outperforms the Naive Bayes algorithm in terms of lung disease prediction accuracy.

https://doi.org/10.61132/saturnus.v3i3.956


Analisis tingkat kepuasan pengguna Tiktok Shop berdasarkan UI/UX menggunakan metode Naïve Bayes

Seli, Francelia Regina; A. Ineke Pakereng, Magdalena

IT-Explore: Jurnal Penerapan Teknologi Informasi dan Komunikasi • 2025 • Fakultas Teknologi Informasi, Universitas Kristen Satya Wacana

Continuing technological advances have changed the way people carry out various activities, including online buying and selling. Various e-commerce platforms serve the Indonesian market, including TikTok, a popular social media application. This study examines the satisfaction of TikTok Shop users with respect to UI/UX using the Naïve Bayes algorithm. The study follows the CRISP-DM method, whose stages are business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Sixty test records processed in RapidMiner yielded an accuracy of 88.33% for user interface and 76.67% for user experience. This shows that user interface and user experience are factors that influence the satisfaction of TikTok Shop users.

https://doi.org/10.24246/itexplore.v4i2.2025.pp211-220


Implementasi E-Commerce Minyak Kemiri BUMDes Inegena Berbasis Web Menggunakan Algoritma Naive Bayes

Theresia Clarita Neba; Anastasia Mude; Krisantus Thomas Rada

Merkurius : Jurnal Riset Sistem Informasi dan Teknik Informatika • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This research aims to address the challenges in sales data management and limited market reach faced by the Inegena Village-Owned Enterprise (BUMDes) in North Bajawa District, Ngada Regency, East Nusa Tenggara. The BUMDes produces and sells candlenut oil, a superior local product, but currently uses a manual sales and recording system (B2B and B2C), which leads to fluctuating demand, difficulties in sales data analysis, and decision-making that lacks valid data. To address these issues, a web-based e-commerce system was implemented. The system was developed using Agile methods, involving planning, implementation, software testing (black box testing), documentation, deployment, and maintenance. In addition, the Naïve Bayes algorithm was applied to visualize sales data and support better decision-making by classifying best-selling products, popular payment methods, and sales levels. The results of this research are expected to help Inegena BUMDes improve sales efficiency and expand the market reach of its candlenut oil products nationally. The system uses supporting software such as XAMPP, PHP, and MySQL.

https://doi.org/10.61132/merkurius.v4i2.1591


Classification of Neighborhood Unit Cadres’ Satisfaction Levels with the Carik App Using the Naïve Bayes Method in Semper Barat Subdistrict

Frencis Matheos Sarimole; Sugiyono Sugiyono; Aditya Zakaria Hidayat; Wida Lestari

International Journal of Information Engineering and Science • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to classify the level of satisfaction of Dasawisma cadres with the Carik application in West Semper Village using the Naive Bayes method. Data were obtained through questionnaires built around three main aspects: ease of use, speed of access, and the usefulness of the application in supporting cadre tasks. After collection, the data were pre-processed and labeled, with respondents' satisfaction categorized into two classes, "satisfied" and "dissatisfied". The Naive Bayes algorithm was then applied to predict the satisfaction class from the questionnaire answers. The results show that the Naive Bayes method can perform the classification with sufficient accuracy, so it can be used as an evaluation tool and decision support in the development of the Carik application. The method can also help management understand user perceptions and improve the system based on objective, regularly collected data, in line with the needs of field cadres.

https://doi.org/10.62951/ijies.v2i1.8


ANALISIS SENTIMEN MEDIA SOSIAL TERHADAP PENGGUNAAN INSTAGRAM OLEH ANAK USIA DINI PENGEMBANGAN SISTEM INFORMASI EDUKATIF BERBASIS NAÏVE BAYES

Yuma Akbar; Sugiyono Sugiyono; Dedi Gunawan; Salsabila Putri W

International Journal of Information Engineering and Science • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia


https://doi.org/10.62951/ijies.v2i1.341
