SciRepID - Scientific Publication Search

IMPLEMENTASI ALGORITMA K-NEAREST NEIGHBOR UNTUK MENGKLASIFIKASIKAN FAKTOR PERNIKAHAN DINI

Siti Muntari; Febriansyah, Febriansyah

JURNAL ILMIAH KOMPUTER GRAFIS • 2026 • UNIVERSITAS STEKOM

Early marriage in Pagar Alam City remains quite common, yet the Pagar Alam Religious Court relies only on a yearly recap of case counts to draw conclusions from its early marriage data. This method is limited in its ability to classify the factors behind early marriage, so the Religious Court has difficulty monitoring and controlling its occurrence in Pagar Alam City. The purpose of this thesis research is to produce a classification system for early marriage factors using the K-Nearest Neighbor algorithm, in order to identify which factors influence the occurrence of early marriage. The method used is the Cross Industry Standard Process for Data Mining (CRISP-DM), which has six stages: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. The testing stage uses a confusion matrix and black-box testing. The final results indicate that the system can classify early marriage factors. The classification model achieved an accuracy of 94.12%, a precision of 85.71%, and a recall of 100.00%, while alpha black-box testing obtained a feasibility value of 83.2%, so the system is considered very suitable for use.

https://doi.org/10.51903/pixel.v19i1.3879
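
As an illustration of how the accuracy, precision, and recall figures in the abstract above relate to a confusion matrix, here is a minimal Python sketch. The four counts are hypothetical: the abstract does not report them, and they were chosen only because they happen to reproduce the stated percentages. The function name `classification_metrics` is likewise an assumption for this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (NOT taken from the paper): 17 test cases in total.
acc, prec, rec = classification_metrics(tp=6, fp=1, fn=0, tn=10)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%}")
```

With these assumed counts, a recall of 100% simply means no positive case was missed (fn = 0), while the single false positive is what pulls precision below accuracy.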

Klasifikasi Penyakit Daun Tomat Menggunakan MobileNetV3

Russel Wijaya; Nur Rachmat

JURNAL PENELITIAN TEKNOLOGI INFORMASI DAN SAINS (JPTIS) • 2026 • Institut Teknologi dan Bisnis (ITB) Semarang

Tomato (Solanum lycopersicum) is a high-value horticultural commodity in Indonesia, yet its cultivation is frequently disrupted by leaf diseases that are difficult to distinguish visually. Diseases such as Bacterial Spot, Early Blight, and Tomato Yellow Leaf Curl Virus often present overlapping visual symptoms, making early and accurate diagnosis a significant challenge for farmers. The manual identification methods currently in use are inefficient and error-prone, ultimately leading to reduced crop yield and quality. The general objective of this study is to develop software capable of automatically classifying tomato leaf diseases. Specifically, this research aims to implement the MobileNetV3 Small architecture based on Convolutional Neural Network (CNN) with ImageNet pre-trained weights to classify 10 types of tomato leaf diseases. The research methodology encompasses dataset collection from Kaggle comprising 10,000 images (1,000 per class), image pre-processing through resizing to 224x224 pixels and normalization, as well as hyperparameter optimization (optimizer, learning rate, epoch, batch size) via scheduler. Model performance is evaluated using a confusion matrix encompassing accuracy, precision, recall, and F1-score.

https://doi.org/10.54066/jptis.v4i2.4230

Implementasi Algoritma Naïve Bayes untuk Deteksi Dini Risiko Hipertensi Berdasarkan Distribusi Usia pada Layanan Kesehatan

Julio Warmansyah; Safrial Safrial; Alam Supriatna; Wiwit Thoyyibah

JURNAL PENELITIAN TEKNOLOGI INFORMASI DAN SAINS (JPTIS) • 2026 • Institut Teknologi dan Bisnis (ITB) Semarang

Hypertension is one of the leading non-communicable diseases contributing significantly to cardiovascular morbidity and mortality worldwide. Despite the availability of extensive electronic medical record data in healthcare institutions, these data are often utilized only for administrative reporting rather than predictive analysis. Consequently, opportunities to identify age groups with a higher probability of developing hypertension remain underutilized. This study aims to implement the Naïve Bayes classification algorithm to analyze age distribution and classify the risk of hypertension among patients using healthcare data. The research adopted the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, including business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Patient medical record data consisting of demographic and clinical attributes, including age, systolic blood pressure, diastolic blood pressure, body weight, gender, and hypertension status, were processed using the Naïve Bayes algorithm. Model performance was evaluated using a confusion matrix by measuring accuracy, precision, recall, specificity, and balanced accuracy. The implementation demonstrates that the Naïve Bayes algorithm is capable of classifying hypertension risk efficiently while providing probabilistic information regarding age groups with a higher tendency to experience hypertension. The resulting classification model offers an effective decision-support tool for healthcare providers in conducting targeted screening, preventive interventions, and evidence-based health planning. The findings also indicate that data mining techniques can transform routinely collected medical records into valuable clinical knowledge for early hypertension prevention and healthcare decision-making.

https://doi.org/10.54066/jptis.v4i2.4415

Implementasi model Naive Bayes multikategori untuk analisis sentimen produk Wardah di E-commerce Shoppe

Mesra Betty Yel; Sopan Adrianto; Rasiban Rasiban; Eva Widiyanti

International Journal of Information Engineering and Science • 2026 • Asosiasi Riset Teknik Elektro dan Infomatika Indonesia

The growth of information technology has driven changes in consumer behavior, one of which is through e-commerce platforms such as Shopee. This phenomenon has generated a large number of customer reviews, including those for local cosmetic products such as Wardah. These reviews serve as an important source of information for understanding customer perceptions and satisfaction levels. However, manual analysis of large and linguistically diverse datasets is inefficient and potentially subjective. This study aims to implement the multi-category Naive Bayes algorithm to classify the sentiment of Wardah product reviews on Shopee into three categories: positive, negative, and neutral. The data were collected using a web scraping technique and processed through a series of preprocessing stages including case folding, tokenization, stopword removal, stemming, and text cleaning. Subsequently, term weighting was performed using the TF-IDF method prior to classification. Model performance was evaluated using a confusion matrix as well as accuracy, precision, and recall metrics. The results indicate that the multi-category Naive Bayes algorithm achieved an accuracy of 86.00%, a precision of 86.63%, and a recall of 98.24%. This approach can assist business practitioners in objectively understanding customer opinions and support decision-making in business strategy and product development.

https://doi.org/10.62951/ijies.v2i2.6
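
The TF-IDF term weighting mentioned in the abstract above can be sketched in a few lines of Python. This is only one common formulation (relative term frequency multiplied by the natural log of inverse document frequency); the abstract does not say which variant the authors used. The tokenized reviews and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical, already preprocessed reviews (not data from the paper).
reviews = [["produk", "bagus"], ["produk", "jelek"], ["produk", "bagus", "sekali"]]
for row in tf_idf(reviews):
    print(row)
```

A term that occurs in every review (here "produk") receives a weight of zero under this variant, which is why TF-IDF is often applied before a classifier: it downweights words that carry no discriminating information.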

Deteksi Tingkat Konsentrasi Belajar Melalui Perilaku Siswa di Kelas Menggunakan Convolutional Neural Networks

Yuma Akbar; Sopan Adrianto; Rasiban Rasiban; Nadya Khairunnisa

International Journal of Applied Mathematics and Computing • 2026 • Asosiasi Riset Ilmu Matematika dan Sains Indonesia

This study discusses a student concentration detection system using Convolutional Neural Network (CNN) with the MobileNetV2 architecture. The dataset was adapted from Classroom Student Behaviors and mapped into four concentration categories: highly focused, focused, less focused, and unfocused. The system was tested with a 720p webcam and produced real-time detection data. The evaluation results show an overall accuracy of 75.85%, with the highest precision achieved in the focused class (0.9859) and the highest recall in the highly focused (0.9739) and unfocused (0.9811) classes. The confusion matrix indicates that the focused class was detected most consistently, while highly focused and unfocused classes were often misclassified as focused, resulting in lower precision. In real-time testing, the system operated at an average of 7 FPS and worked optimally when students faced the camera directly with sufficient lighting, but its performance decreased significantly at face angles greater than 45°. User evaluation shows that 75% of students rated the detection results as accurate/very accurate with an average satisfaction score of 3.6 out of 5, and 75% felt assisted in recognizing their concentration level. From the teachers’ perspective, most stated that the results were consistent with classroom observations, and all expressed willingness to reuse the system.

https://doi.org/10.62951/ijamc.v2i1.74

Analisis Sentimen Tren 'Kabur Aja Dulu' pada Sosial Media X sebagai Dasar Perancangan Sistem Pemantauan Sentimen Publik Menggunakan Naive Bayes dan SVM

Sutisna Sutisna; Tri Wahyudi; Dwi Swasono Rachmad; Fachrur Rozi

International Journal of Information Engineering and Science • 2026 • Asosiasi Riset Teknik Elektro dan Infomatika Indonesia

Social media X (Twitter) has become the main platform for the Indonesian public to express opinions, including on the trend of 'kabur aja dulu' (let's just run away for a bit). This research aims to classify the sentiments of the public using the Naïve Bayes and Support Vector Machine (SVM) methods, and to compare the accuracy of both in sentiment analysis. Data was collected via the Twitter API with the hashtag #kaburajadulu, resulting in 2,067 tweets, which, after the cleansing process and manual labeling, left 385 data points. The analysis process followed the CRISP-DM stages, which include business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Model evaluation was conducted using a confusion matrix with accuracy, precision, and recall metrics. The classification results show that 82% of tweets have a positive sentiment and 18% negative. The Naïve Bayes algorithm achieved an accuracy of 86.49%, slightly lower than SVM, which reached 88.05%. In conclusion, Support Vector Machine is more effective in sentiment classification on public opinion data. This research contributes to the digital mapping of public opinion and recommends the development of automatic labeling methods as well as the exploration of advanced algorithms in the future.

https://doi.org/10.62951/ijies.v2i3.79

Implementasi Algoritma Damerau-Levenshtein Untuk Pemeriksaan Dan Koreksi Kesalahan Ejaan Bahasia Indonesia

Nugroho, Okvi; Ahmad Rahmatika; Tri Andre Anu; Maulidya Rahmah

JURNAL ILMIAH SAINS TEKNOLOGI DAN INFORMASI (JITI) • 2026 • CV. ALIM'SPUBLISHING

This study implements the Damerau-Levenshtein algorithm for an Indonesian spelling checking and correction system based on an edit-distance approach. The main objective of this study is to develop a system capable of automatically detecting and correcting spelling errors at the character level through a matching process against the KBBI dictionary and an Indonesian corpus. The methods used include data collection, text pre-processing, system design, and implementation of the Damerau-Levenshtein algorithm, which covers insertion, deletion, substitution, and transposition operations. Testing was conducted on 25 test items consisting of standard words and words modified to contain typographical errors. The results show that the system handled all test data with an accuracy of 100% on a limited dataset. In addition, the average Damerau-Levenshtein distance of 0.84 indicates that most errors fall in the light category. Evaluation using a confusion matrix produces precision, recall, and F1-score values of 100% each. These results indicate that the Damerau-Levenshtein algorithm is effective in handling character-based spelling errors. However, the system still has limitations in handling complex semantic contexts and language variations. Therefore, further research is recommended to integrate language-model-based approaches to improve the system's accuracy and generalization on real-world data.

https://doi.org/10.59024/jiti.v4i2.1943
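
To make the four edit operations named in the abstract above concrete, here is a minimal Python sketch of the restricted Damerau-Levenshtein distance (often called optimal string alignment), followed by a nearest-word lookup. The abstract does not state which variant of the algorithm the authors implemented, so this is an assumption; the three-word dictionary and the names `osa_distance` and `suggest` are hypothetical stand-ins, not the KBBI-based system.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word, dictionary):
    """Return the dictionary entry with the smallest distance to word."""
    return min(dictionary, key=lambda entry: osa_distance(word, entry))


# Hypothetical miniature dictionary; a real checker would load a full word list.
words = ["makan", "makam", "malam"]
print(osa_distance("maakn", "makan"), suggest("maakn", words))
```

Counting a swap of adjacent letters as a single edit is what distinguishes this family from plain Levenshtein distance, where the same typo would cost two substitutions.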

Perbandingan Algoritma K-Nearest Neighbors dan Naïve Bayes dalam Penentuan Penerima Bantuan di Desa Banyuputih Kidul

Thoriq Wahyu Hidayatullah; Ulya Anisatur Rosyidah; Nur Qodariyah Fitriyah

JURNAL PENELITIAN TEKNOLOGI INFORMASI DAN SAINS (JPTIS) • 2026 • Institut Teknologi dan Bisnis (ITB) Semarang

The distribution of social assistance represents a key government strategy to enhance the welfare of low-income communities. Nevertheless, its implementation frequently faces challenges related to inaccurate targeting, often caused by uneven data collection and subjective decision-making processes in identifying eligible beneficiaries. This study aims to compare the performance of the K-Nearest Neighbors (KNN) and Naïve Bayes algorithms in determining eligibility for social assistance recipients in Banyuputih Kidul Village. Both models were evaluated using a confusion matrix with performance indicators including accuracy, precision, recall, and F1-score. The findings reveal that the KNN algorithm outperformed Naïve Bayes in identifying recipients of the PKH social assistance program, achieving an evaluation score of 99%, compared to 86% for Naïve Bayes. These results indicate that KNN provides higher predictive reliability for eligibility classification. This research is expected to support the development of an objective, data-driven decision support system that can assist village governments in distributing social assistance more accurately and transparently.

https://doi.org/10.54066/jptis.v4i1.3733

Klasifikasi Gaya Hidup Siswa Menggunakan Metode K-Nearest Neighbors

Syahrina Indah Harahap; Ilka Zufria; Abdul Halim Hasugian

JURNAL ILMIAH SAINS TEKNOLOGI DAN INFORMASI (JITI) • 2026 • CV. ALIM'SPUBLISHING

This research aims to classify students’ lifestyles using the K-Nearest Neighbors (KNN) algorithm. The dataset consists of 392 high school students obtained from Kaggle, with key attributes including study hours, social media usage, Netflix viewing duration, attendance, sleep quality, internet quality, mental health, and extracurricular activities. KNN was chosen for its simplicity in distance-based classification, measured using Euclidean Distance. The data was divided into training and testing sets, then evaluated using accuracy and a confusion matrix. The results show that KNN effectively classifies students’ lifestyles into four categories: healthy, less active, at risk, and highly at risk. This classification is expected to assist educational institutions, parents, and students in understanding lifestyle patterns and their impact on academic performance and mental well-being. Furthermore, this study emphasizes the relevance of applying machine learning in education, aligned with Islamic values concerning health, discipline, and the optimal use of time.

https://doi.org/10.59024/jiti.v4i2.1785
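
The distance-based classification described in the abstract above can be sketched as follows: compute the Euclidean distance from a query point to every training point, keep the k nearest, and take a majority vote. The two features, the labels, the value of k, and all data points are hypothetical and only loosely echo the attributes the abstract lists; the authors' actual features and preprocessing are not shown in the source.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (study hours, social media hours) per day.
train_x = [(6.0, 1.0), (5.5, 1.5), (5.0, 2.0), (1.0, 6.0), (1.5, 5.5), (2.0, 5.0)]
train_y = ["healthy", "healthy", "healthy", "at risk", "at risk", "at risk"]
print(knn_predict(train_x, train_y, (5.2, 1.8)))
```

Because Euclidean distance is sensitive to scale, features measured in different units are usually normalized before KNN is applied; this toy example skips that step since both features are in hours.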

An Attention-Enhanced CNN–RBF Framework for Network Intrusion Detection in Imbalanced Traffic

Kabura, Fabrice; Nsabimana, Thierry

Journal of Computing Theories and Applications • 2026 • Universitas Dian Nuswantoro

The increasing complexity and scale of modern network traffic driven by IoT and cloud-based infrastructures have made accurate intrusion detection a critical challenge. Conventional network intrusion detection systems (NIDS) and many deep learning–based approaches struggle to reliably detect minority and stealthy attacks due to severe class imbalance and limited discrimination of subtle traffic patterns. To address these limitations, this study proposes a hybrid CNN–RBF–Attention framework for network intrusion detection. The proposed model integrates three complementary components: (i) a convolutional neural network for hierarchical feature extraction from network flow data, (ii) a radial basis function (RBF) network for localized nonlinear classification using prototype-based decision regions, and (iii) an attention mechanism that adaptively weights RBF activations to emphasize discriminative traffic patterns. SMOTE is applied exclusively to the training data to mitigate class imbalance. The framework is evaluated on the widely used CICIDS2017 and CICIDS2018 benchmark datasets in both binary and multiclass settings, using recall, precision, F1-score, confusion matrices, and ROC analysis. Experimental results demonstrate that the proposed hybrid model consistently outperforms standalone CNN and RBF baselines, particularly in terms of recall and F1-score. On the CICIDS2018 dataset, the model achieves 99.81% accuracy and 99.81% F1-score in binary classification, and 99.54% accuracy and 99.54% F1-score in multiclass classification. On CICIDS2017, it achieves 98.12% accuracy and 98.12% F1-score in binary classification, and 98.92% accuracy and 98.92% F1-score in multiclass classification. Confusion matrix and ROC analyses further show strong class separability and reliable performance in low–false-positive-rate regions, which is critical for real-world IDS deployment. These results confirm that combining deep hierarchical feature learning, localized prototype-based classification, and attention-guided refinement yields a robust, operationally reliable intrusion detection framework for highly imbalanced network environments.

https://doi.org/10.62411/jcta.15419

Analisis Klasifikasi Pengaruh Kegagalan dan Keterbatasan Metode Pembayaran Digital terhadap Churn Pelanggan Menggunakan Decision Tree

Dewa Ayu Putu Angelina Dewi; I Wayan Sudiarsa; Ni Made Dwi Junita Sariyani; Yuvensia Armelia Sumu; Gusti Ngurah Abhimanyu

Jurnal Bisnis Inovatif dan Digital • 2026 • Asosiasi Riset Ilmu Manajemen Kewirausahaan dan Bisnis Indonesia

The rapid development of digital technology has led to an increased adoption of digital payment methods in online transaction-based businesses. However, in practice, failures and limitations in the implementation of digital payment systems still occur, potentially disrupting transaction processes and reducing customer convenience. Payment-related obstacles may result in transaction cancellations and increase the risk of customer churn. This study aims to analyze the impact of failures and limitations in digital payment methods on customer churn using a classification-based approach. The data used in this research are secondary e-commerce customer data obtained from the Kaggle platform, including transaction information, payment methods, customer behavior, and historical transaction records. The research methodology consists of data preprocessing, time-based feature engineering, and classification modeling using logistic regression, decision tree, and random forest algorithms. Model performance is evaluated using accuracy, precision, recall, F1-score, and a confusion matrix. The results indicate that the decision tree model demonstrates superior capability in identifying churned customers compared to the other models, although it does not always achieve the highest accuracy. In addition to digital payment methods, other factors such as purchase value, transaction frequency, purchase timing patterns, and product return rates also influence customer churn. The findings highlight the importance of optimizing digital payment systems as part of customer experience enhancement strategies and customer retention efforts in online transaction–based businesses.

https://doi.org/10.61132/jubid.v3i1.1232

Optimisation of Renal Cyst Detection in Ct Urography Images Using Neo-ZasAI Based on the YOLO Algorithm

Zarkasyi Azri Sardar; Sudiyono Sudiyono; Rini Indrati; Aisyah Widayani

Journal of Health Sciences, Nursing and Nutrition • 2026 • International Forum of Researchers and Lecturers

Background: Accurate detection of renal cysts on CT urography requires high diagnostic precision, while manual interpretation by radiologists is susceptible to inter-observer variability and potential delays in clinical decision-making. These challenges underscore the need for a reliable automated detection system to support radiological assessment. Objective: This study aims to develop and evaluate the performance of the Neo-ZasAI application based on the YOLOv8 algorithm for the automatic identification of renal cysts. Methods: Employing a Research and Development design using the ADDIE model, the study encompassed needs analysis, model design, software development, system implementation using 200 CT urography images, and diagnostic performance evaluation. Classification results generated by Neo-ZasAI were compared with radiologist readings through confusion matrix analysis and ROC–AUC assessment. Results: The findings indicate that Neo-ZasAI achieved an accuracy of 97.5%, sensitivity of 96%, specificity of 99%, positive predictive value of 98.9%, and negative predictive value of 96.1%. The ROC analysis yielded an AUC of 0.988 (p < 0.001), demonstrating excellent discriminative capability and high concordance with radiologist interpretations as the diagnostic gold standard. Conclusion: These results suggest that Neo-ZasAI is capable of performing rapid, consistent, and accurate renal cyst detection and is thus feasible for implementation as a clinical decision support system in radiology, with potential integration into PACS workflows and further development to enhance model generalizability.

https://doi.org/10.70062/greenhealth.v3i1.268

Analisis Prediksi Customer Churn pada Sektor E-Commerce Berdasarkan Perilaku Transaksi Menggunakan Pendekatan Machine Learning

Nadeerah Hani’ Fauziyyah; I Wayan Sudiarsa; Ida Ayu Eka Sastradewi; Kadek Agustine Yueyin Parisya; Sartika Sartika

Jurnal Manajemen Bisnis Digital Terkini • 2026 • Asosiasi Riset Ilmu Manajemen Kewirausahaan dan Bisnis Indonesia

Because it directly impacts revenue, customer loyalty, and long-term business sustainability, customer churn is a critical issue for the e-commerce industry. High churn rates indicate that a business is unable to retain existing customers, which forces it to bear the higher cost of acquiring new ones. Therefore, a precise analytical approach is needed to identify the behavior patterns of customers who are likely to churn. Using machine learning methods, this study analyzes and predicts customer churn. The study uses the E-Commerce Customer Churn 2025 dataset obtained from Kaggle. This dataset consists of 10,000 customer records and contains fifteen variables covering transaction behavior, customer characteristics, and churn status. The research comprised data preprocessing, descriptive analysis, exploratory data analysis (EDA), and classification model development using Logistic Regression and Random Forest algorithms. Model evaluation was conducted using a confusion matrix and Receiver Operating Characteristic (ROC) curve to assess the model's accuracy and ability to distinguish between churned and non-churned customers. The results showed that the Random Forest model performed better than Logistic Regression, with an ROC-AUC of 1.00. Furthermore, feature importance analysis revealed that the days_since_last_purchase variable was the most dominant factor in predicting customer churn. These findings are expected to help e-commerce companies design more effective, data-driven customer retention strategies.

https://doi.org/10.61132/jumbidter.v3i1.1228

Prediksi Aritmia pada Lansia mengunakan Linear Regression berdasarkan Data Ekg

Sinaga, Willy; Prabowop, Agung; Siahaan, Yonathan Christian; Govandy, Govandy

Dinamik • 2026 • Universitas Stikubank

This study aims to develop a predictive model using linear regression to identify potential arrhythmias in the elderly based on electrocardiogram (ECG) data. Data were collected through observations at healthcare facilities from elderly patients with indications of arrhythmia and then preprocessed through cleaning, normalization, feature selection, and outlier checking. The features used include PR interval, QRS duration, QT interval, and heart rate. The dataset was divided into training data (80%) and test data (20%) to build and evaluate the model. The training results showed that the model was able to predict the risk of arrhythmia with a Mean Squared Error (MSE) of 0.15 and a coefficient of determination (R²) close to 1. Evaluation using a confusion matrix showed an accuracy of 76.19%, precision of 82.80%, recall of 76.19%, and F1-score of 72.70%. These results prove that linear regression can be used as an initial approach to the non-invasive early detection of arrhythmias in the elderly. This study provides a foundation for the development of ECG data-based clinical decision support systems and suggests future exploration of more complex models and integration with real-time monitoring technologies.

https://doi.org/10.35315/dinamik.v31i1.10328

Identifikasi Aritmia pada Lansia menggunakan Algoritma K-Nearest Neighbor berdasarkan Data ElektroKardiogram

Siahaan, Maherni; Panjaitan, Sabina; Purba, Agnes Alvionita; Cahya, Mutiara; Simarmata, Allwin M.

Dinamik • 2026 • Universitas Stikubank

Arrhythmia is a heart rhythm disorder that is common in the elderly and can pose serious health risks if not detected early. This study aims to identify arrhythmia in the elderly using the K-Nearest Neighbor (KNN) algorithm based on electrocardiogram (ECG) data. The data comprise 105 elderly ECG records obtained in CSV format. The initial process involves cleaning and normalizing the data using the StandardScaler method, as well as initial labeling using the K-Means Clustering algorithm to group the data into two classes: Normal and Highly Likely Arrhythmia. The data were then split into 70% training data and 30% test data using a stratified split to preserve the label proportions. The KNN model was trained with k = 3 and evaluated using a confusion matrix and a classification report. The test results show a model accuracy of 97%, with high precision and recall for both classes. These results indicate that the KNN algorithm is effective in classifying arrhythmia conditions in the elderly and has the potential to be applied in ECG-based diagnostic support systems.

https://doi.org/10.35315/dinamik.v31i1.10336

Perbandingan Algoritma Naïve Bayes Classifier (NBC) dengan Random Forest Untuk Klasifikasi Penyakit Ginjal Kronis (PGK)

Caterina Paras Dewi; Jasmir Jasmir; Willy Riyadi; Alya Rafina

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Chronic Kidney Disease (CKD) is a heterogeneous disorder that gradually affects the structure and function of the kidneys, is difficult to reverse, and leaves the body unable to maintain metabolism and fluid and electrolyte balance, leading to increased urea levels. The chronic kidney disease data were obtained from Kaggle. This study compares two classification algorithms, Naïve Bayes Classifier (NBC) and Random Forest, because it is not yet known which algorithm is best at classifying CKD. Both algorithms are evaluated using accuracy, precision, recall, and a confusion matrix. The evaluation showed that on a dataset of 400 samples, NBC obtained an accuracy of 94%, while Random Forest obtained 93%. On the small dataset (158 records), Random Forest achieved a better accuracy of 87%, compared with 78% for NBC. Based on these results, Random Forest performs more stably on small datasets, while NBC provides higher performance on larger datasets in the context of chronic kidney disease classification.

https://doi.org/10.61132/prosemnasproit.v2i2.72

Perbandingan Akurasi BERT dan RNN pada Analisis Sentimen Komentar Hotel

Mahruzar, Mahruzar; Setiawan Assegaff; Jasmir Jasmir; Yosefina Venus

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

The increasing volume of online hotel reviews provides valuable insights into customer perceptions but poses challenges for manual analysis due to its unstructured nature. This study aims to compare the performance of Recurrent Neural Network (RNN) and Bidirectional Encoder Representations from Transformers (BERT) in hotel review sentiment analysis. A total of 20,491 TripAdvisor hotel reviews were classified into three sentiment categories: negative, neutral, and positive. The research methodology includes text preprocessing, stratified data splitting, class imbalance handling using Random Over-Sampling, tokenization, and supervised model training. Model performance was evaluated using a confusion matrix and classification metrics. The results indicate that BERT outperforms RNN, achieving an accuracy of 80.54%, while RNN reached 62.21%. BERT demonstrated superior capability in capturing contextual and semantic information in hotel reviews. These findings suggest that transformer-based models are more effective for sentiment analysis of complex textual data in the hospitality domain and can support data-driven service improvement strategies.

https://doi.org/10.61132/prosemnasproit.v2i2.184

Analisis Sentimen Publik Terhadap Kebijakan Efisiensi Anggaran Menggunakan Naive Bayes, dan SVM

Elin Tamaya; Sharipuddin Sharipuddin; Nurhadi Nurhadi

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Budget efficiency is an important issue in state financial management because it is directly related to government spending priorities and their impact on public service programs. Discussions about budget efficiency policies are widespread on social media platform X, generating diverse public responses and thus necessitating an automated approach to understand public opinion trends more quickly and objectively. This research aims to analyze the sentiment of Indonesian people toward budget efficiency policies and compare the performance of the Naïve Bayes and Support Vector Machine (SVM) algorithms in classifying sentiment. The research data comprise 10,909 Indonesian-language tweets sourced from a public dataset, which were processed through preprocessing stages including cleaning, case folding, normalization, tokenization, stopword removal, and stemming. Sentiment labeling was performed automatically using the Indonesian Sentiment Lexicon (InSet) approach to categorize data into positive, negative, and neutral sentiments. Feature extraction was performed using Term Frequency–Inverse Document Frequency (TF-IDF), and then the data was divided into training and testing sets with an 80:20 ratio. Model performance evaluation was conducted using a confusion matrix and the metrics of accuracy, precision, recall, and F1-score. The research results show that sentiment distribution is dominated by negative sentiment at 56.78%, followed by positive sentiment at 37.40%, and neutral sentiment at 5.83%. In the classification stage, SVM performed best with an accuracy of 86%, while Naïve Bayes achieved an accuracy of 74%. These findings indicate that SVM is better suited to sentiment classification on social media text data and can be used to support more effective analysis of public response to budget efficiency policies.

https://doi.org/10.61132/prosemnasproit.v2i2.170
