SciRepID - Scientific Publication Search

Publication Search

18,135 articles from 385 journals · 1,447 citations tracked

Putri Maria Theresia Kehi; I Wayan Sudiarsa; Maria Oktaviani Suryati; Yosefina Dehadi; Maria Karlinda

Saturnus: Jurnal Teknologi dan Sistem Informasi 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to analyze consumer purchasing behavior on e-commerce platforms using the Decision Tree algorithm as an easily interpretable classification method. The dataset used consists of 12,330 transaction records with 18 attributes representing visitor characteristics and user activities during interactions with the e-commerce platform. The research stages include data exploration to identify initial patterns, data preprocessing to handle missing values and class imbalance, splitting the data into training and testing sets, training the Decision Tree model, evaluating model performance, and visualizing the tree structure to analyze decision rules. The test results show that the Decision Tree model with a maximum depth of 3 achieves fairly good performance, with an average accuracy of 89.78%, precision of 69.82%, recall of 59.95%, and an F1-score of 64.51% for the buyer class. The visualization of the decision tree provides a clear interpretation of the main attributes influencing purchasing decisions, thereby facilitating understanding for non-technical decision makers. Overall, this study demonstrates that the Decision Tree method is effective in modeling consumer purchasing behavior in e-commerce and can be utilized as a basis for data-driven business decision making, particularly in marketing strategies and improving sales conversion rates.
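
The F1-score quoted in this abstract is the harmonic mean of the reported precision and recall, so the three figures can be checked against each other. A minimal sketch, assuming the percentages are used exactly as printed; the function name is illustrative and not taken from the paper:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Buyer-class figures as reported in the abstract.
precision = 0.6982
recall = 0.5995

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # prints F1 = 64.51%
```

The result agrees with the reported 64.51%, so the three buyer-class metrics are internally consistent.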

Marjelin Putri Ndaparoka; Stefanus D.I. Mau; Sihang Gregorius Bali Mema

Modem : Jurnal Informatika dan Sains Teknologi 2026 Asosiasi Profesi Telekomunikasi Dan Informatika Indonesia

Savings and Loan Cooperatives (KSP) play a vital role in expanding community access to capital, especially within the informal sector. Nevertheless, non-performing loans remain a persistent challenge that can threaten liquidity and long-term institutional sustainability. KSP CU Mera Ndi Ate faces similar issues, which are assumed to stem not only from administrative weaknesses but also from members’ perceptions and behavioral factors. This research aims to examine the potential causes of non-performing loans through text-based sentiment analysis using an unsupervised learning approach. A quantitative method with a data mining framework was applied. Data were gathered through interviews, observations, documentation, and 200 customer opinion texts processed using the Orange Data Mining application. The analytical stages included preprocessing, corpus development, feature extraction, sentiment clustering, and visualization. Because the dataset lacked predefined labels, unsupervised learning was used to identify naturally emerging sentiment patterns. Findings reveal a predominance of critical sentiments related to credit assessment procedures and service quality. The highest sentiment score (75) concerned insufficient creditworthiness evaluation, followed by concerns about service efficiency (66.6667). These insights suggest that improving assessment accuracy and service quality may help reduce non-performing loans.

Ayyub Hamdanu Budi Nurmana MS; Andik Prakasa Hadi; Rudjiono Rudjiono

Digital Multimedia and Visualization Technology 2026 Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

This study explores the role of visual analytics in enhancing decision-making processes within creative industries, focusing on its application to large-scale multimedia datasets. Visual analytics integrates interactive visualization techniques with computational algorithms, enabling users to explore complex datasets intuitively and derive actionable insights. The research centers on the design and implementation of interactive dashboards tailored to the creative sector, particularly film, music, and advertising industries, to facilitate real-time data exploration. The study also investigates the usability of these tools through expert-based evaluations, aiming to assess their effectiveness in supporting informed and timely decision-making. The findings reveal that interactive visualizations significantly improve insight discovery and pattern recognition, enabling decision-makers to uncover hidden trends in large multimedia datasets. However, challenges related to scalability, user acceptance, and real-time processing were encountered during the implementation phase. The research highlights the practical benefits of integrating visual analytics into industry workflows, which include enhanced content creation, audience engagement, and strategic planning. Furthermore, the study identifies key visual analytics techniques such as dynamic dashboards, pattern recognition, data mining, and clustering, which are essential for analyzing multimedia data. The study concludes by emphasizing the potential for wider applications of visual analytics in other sectors, suggesting future research directions to improve tool performance, scalability, and user accessibility, as well as exploring the integration of emerging technologies like artificial intelligence and virtual reality.

Aditya Abdulloh Masykur; Rino Raihan Gumilang; Harun Al Rosyid

Jurnal Elektronika dan Komputer 2026 STEKOM PRESS

The performance of the Indonesian National Team (Timnas) in the 2026 World Cup qualifications has triggered massive and diverse responses on social media, particularly on platform X. This study aims to identify and classify public sentiment regarding Timnas Indonesia's performance into positive, negative, and neutral categories using a data mining approach. Text data was processed through pre-processing stages, term weighting using TF-IDF, and the application of the Synthetic Minority Over-sampling Technique (SMOTE) to address significant class distribution imbalance. The classification algorithm employed was Multinomial Naïve Bayes. Model performance evaluation was conducted by comparing two training-testing data split scenarios: 90:10 and 80:20 ratios. The results indicate that public opinion is dominated by negative sentiment at 73.2%, reflecting public disappointment. In terms of model performance, the 90:10 ratio scenario yielded the best accuracy of 80%, outperforming the 80:20 ratio which recorded an accuracy of 75%. These findings demonstrate that combining Multinomial Naïve Bayes with the SMOTE technique is effective in handling imbalanced text data and is capable of accurately mapping public perception.
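
The TF-IDF weighting step mentioned in this abstract can be sketched as follows. This is an illustration only: it assumes raw term frequency and an unsmoothed `idf = ln(N / df)`, whereas the paper does not state which variant it used, and the three tokenized posts are invented placeholders, not data from the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    weight = (count / document length) * ln(N / document frequency)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already pre-processed posts (tokens only).
posts = [
    ["timnas", "menang", "bagus"],
    ["timnas", "kalah", "kecewa"],
    ["timnas", "kalah"],
]
weights = tf_idf(posts)
print(weights[0])
```

A term that occurs in every post ("timnas" here) gets weight zero, which is why TF-IDF pushes the classifier toward the words that actually distinguish one post from another.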

Claudia K. Hamsi; I Wayan Sudiarsa; Vinsensia P.K Abu; Sarling C. Dhai; Maria A. Serero

Mars: Jurnal Teknik Mesin, Industri, Elektro Dan Ilmu Komputer 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The rapid development of digital streaming platforms such as Netflix has generated a large volume of content data with diverse characteristics, thereby requiring effective analytical methods to understand emerging patterns and trends. This study aims to classify Netflix content into two main categories, namely movies and television shows, and to analyze genre trends and content characteristics using a data mining approach with the Naive Bayes algorithm. The dataset used in this study is the Netflix Shows dataset, consisting of 8,809 content entries, with the primary features analyzed including genre, rating, and country of production. The research process begins with data exploration and preprocessing stages, including data cleaning, handling missing values, and transforming categorical features to enable effective model construction. Subsequently, the dataset is divided into training and testing sets to objectively and systematically build and evaluate the Naive Bayes classification model. Model performance is evaluated using accuracy, precision, recall, and F1-score metrics to assess the model’s ability to accurately distinguish between Netflix content types. The experimental results demonstrate that the Naive Bayes algorithm is able to classify Netflix content into Movie and TV Show categories with accuracy, precision, recall, and F1-score values of 100% each. The confusion matrix indicates that no misclassification occurred, suggesting that genre, rating, and country of production features provide a very clear separation between content classes. These findings indicate that the Naive Bayes algorithm can achieve exceptionally high classification performance with optimal evaluation results. The results further reveal distinct differences in characteristics between movies and television shows based on genre and production attributes. Therefore, this study is expected to contribute to the development of content recommendation systems and strategic content management within the streaming industry.

Hendra Jatnika; Mia Kusmiati

International Journal of Management Science and Entrepreneurship 2025 International Forum of Researchers and Lecturers

Purpose – This study explores strategic approaches to developing Management Information Systems (Sistem Informasi Manajemen, SIM) as an integral part of supporting the digital transformation of modern organizations. It emphasizes the importance of integrating information technology, managing data effectively, and improving the digital competence of the human resources who operate the system. Design/methodology/approach – This conceptual article uses a literature review, analyzing relevant academic works and technical manuals, particularly those related to SIM implementation in the public and private sectors. The study refers to the works of Jatnika et al. (2022–2024), including the use of Microsoft Office applications as a basic supporting skill for organizational digital literacy. Findings – The findings show that SIM development is not merely a technical effort but a strategic necessity in supporting digital transformation. Key strategies include modular system design, data mining integration, user-based training programs, and periodic system evaluation. These components allow organizations to build a responsive and adaptive SIM ecosystem. Practical implications – Organizations that want to undertake digital transformation need to invest in developing the digital capabilities of their human resources and to ensure that the SIM is used effectively; a strategically developed SIM can become a main driving force for increasing efficiency, accuracy, and decision-making capability across work units. Originality/value – This study offers a structured conceptual model of SIM development in the context of digital transformation, based on the applied literature and real-world organizational needs. The article provides practical insights for policy makers, IT managers, and HR developers.

Senna Hendrian; V.H Valentino; Wisdariah, Wisdariah; Riezca Talita Trista; Dudi Parulian

Neptunus: Jurnal Ilmu Komputer Dan Teknologi Informasi 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Selecting a faculty that aligns with students’ interests and talents is a strategic step in determining the success of higher education and future career paths. However, most vocational high school (SMK) students still face difficulties in identifying the most suitable faculty due to the lack of data-driven analysis. This study implements the C4.5 classification algorithm within data mining techniques to build an automatic and measurable faculty recommendation system. The dataset consists of attributes such as SMK major, interest level, aptitude test results, academic grade average, and gender, with the output being the recommended faculty. The C4.5 algorithm was chosen for its ability to generate a transparent and interpretable decision tree, which helps both guidance counselors and students understand the rationale behind the recommendations. The experimental results show that the constructed classification model achieved an accuracy rate of 88%, based on cross-validation testing using data from 12th-grade students. The implementation of this system is expected to serve as an objective tool in the faculty selection process and to promote a data-driven decision-making approach in secondary education environments.
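
C4.5 chooses which attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below covers only that scoring step for categorical attributes; the handling of continuous attributes and the pruning that C4.5 also performs are omitted. The four student records are invented for illustration and are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """C4.5 gain ratio of one categorical attribute against the class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)

    # Expected entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes attributes with many distinct values
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical records: SMK major, interest level, recommended faculty.
major = ["TKJ", "TKJ", "AK", "AK"]
interest = ["high", "low", "high", "low"]
faculty = ["Engineering", "Engineering", "Economics", "Economics"]

print(gain_ratio(major, faculty))     # major separates the faculties perfectly
print(gain_ratio(interest, faculty))  # interest carries no information here
```

In this toy data the tree would split on the SMK major first, because its gain ratio is the higher of the two.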

Ricardus Mba Dala Pati; Eka Kusuma Pratama; Tuslaela Tuslaela

Repeater : Publikasi Teknik Informatika dan Jaringan 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

JakLingko is a digital-based public transportation integration system developed to facilitate access to various transportation modes in Jakarta. Along with the increasing number of users, reviews on the JakLingko application reflect user experiences and perceptions. This study aims to analyze the sentiment of user reviews on the Google Play Store using the Naïve Bayes method. Data collection was conducted through web scraping, resulting in 3,260 reviews. The data were preprocessed, sentiment-labeled, and classified using Orange Data Mining. The research applied a quantitative experimental approach with a machine learning framework. The classification results showed that neutral sentiment dominated user reviews, followed by negative and positive sentiments. The Naïve Bayes model achieved 100% accuracy based on the confusion matrix and other evaluation metrics such as precision, recall, and F1-score. The findings highlight that Naïve Bayes can be a reliable approach for analyzing public opinion and serve as a reference for evaluating and improving digital service applications.

Kikunda, Philippe Boribo; Kasongo, Issa Tasho; Nsabimana, Thierry; Ndikumagenge, Jérémie; Ndayisaba, Longin +2 more

Journal of Computing Theories and Applications 2025 Universitas Dian Nuswantoro

This study examines the application of Educational Data Mining (EDM) to predict the academic performance of first-year students at the Catholic University of Bukavu and the Higher Institute of Education (ISP) in the Democratic Republic of Congo. The primary objective is to develop a model that can identify at-risk students early, providing the university with a tool to enhance student support and academic guidance. To address the challenges posed by data imbalance (where successful cases outnumber failures), the study adopts a hybrid methodological approach. First, the SMOTE algorithm was applied to balance the dataset. Then, a stacking classification model was developed to combine the predictive power of multiple algorithms. The variables used for prediction include the National Exam score (PEx), the secondary school track (Humanities), and the type of prior institution (public, private, or religious-affiliated schools), as well as age and sex. The results demonstrate that this approach is highly effective. The model is not only capable of predicting success or failure but also of forecasting students' performance levels (e.g., honors or distinctions). Moreover, the use of the Apriori association rule mining algorithm allowed the identification of faculty-specific success profiles, transforming prediction into an interpretable decision-support tool. This research makes several significant contributions. Practically, it provides the University of Bukavu with a tool for student orientation and early risk detection. Methodologically, it illustrates the effectiveness of a combined approach to EDM in an African context. However, the study acknowledges certain limitations, including the non-public nature of the data and the geographical specificity of the sample. It therefore proposes avenues for future research, such as the integration of Explainable AI (XAI) techniques for more refined and transparent analysis of the results.
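
The core idea of SMOTE, as used here to balance the classes, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The following is a simplified sketch of that idea, not the study's pipeline: the function name, the choice of `k`, and the (exam score, age) points are assumptions made for illustration.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining points by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class students: (national exam score, age).
minority = [(50.0, 18.0), (52.0, 19.0), (48.0, 20.0), (55.0, 18.0)]
synthetic = smote_sample(minority)
for point in synthetic:
    print(point)
```

Because every synthetic point sits between two real minority points, the new samples stay inside the region the minority class already occupies. Oversampling should be applied to the training split only, so that synthetic points never reach the test data.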

Harninda Br Keliat; Novriyenni Novriyenni; Tio Ria Pasaribu

Repeater : Publikasi Teknik Informatika dan Jaringan 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The Computer-Based National Assessment (ANBK) is an essential instrument designed to comprehensively measure student competence, including literacy, numeracy, and character aspects. However, in practice, many students still face various challenges during preparation, such as cognitive limitations, psychological readiness, and technical barriers, which affect their overall readiness to participate in ANBK. This study aims to analyze the readiness level of students at SMP Negeri 2 Kuala by employing the Rough Set method. The variables examined include digital literacy, subject matter understanding, psychological readiness, and school facility support. Data were collected from 250 ninth-grade students through structured questionnaires and subsequently processed using the Rosetta software to perform attribute reduction and generate decision rules. The findings indicate that digital literacy, subject matter understanding, and psychological readiness are the most influential variables in determining student readiness, while facility support serves only as a complementary factor. The extraction process generated seven decision rules with an accuracy level of 100%, which effectively classified students into three readiness categories: highly ready, ready, and less ready. These results confirm that the Rough Set method is highly effective for identifying dominant factors and producing decision rules that can guide schools in developing targeted strategies to enhance student readiness for ANBK.

Muhammad Akmal Ar Rasid; Catur Pranomo; Elkin Rilvani

Bridge : Jurnal Publikasi Sistem Informasi dan Telekomunikasi 2025 Asosiasi Profesi Telekomunikasi Dan Informatika Indonesia

This study aims to utilize data mining techniques, specifically the K-Nearest Neighbors (KNN) algorithm, to classify leaf diseases in sugarcane (Saccharum officinarum). Early and accurate detection of leaf disease types is a crucial step in prevention and control strategies, thereby reducing potential crop losses caused by pathogen attacks. Leaf diseases in sugarcane, such as leaf scald, rust, and mosaic virus, are known to affect photosynthesis, inhibit growth, and reduce the quality and quantity of sugarcane produced. The classification process in this study was carried out through image analysis of infected sugarcane leaves, where features such as color, texture, and shape were extracted using digital image processing techniques. The KNN algorithm was chosen because of its non-parametric nature, ease of implementation, and its ability to provide accurate classification results even with limited data size. The working principle of KNN is to determine the class of a new sample based on the majority class of its k nearest neighbors in the feature space, making it very suitable for the case of leaf disease image classification. In addition to building a classification model, this study also examines disease prevention strategies based on the identification results. These strategies include the use of disease-resistant sugarcane varieties, the implementation of appropriate planting patterns, land moisture management, regular plantation sanitation, and the measured and environmentally friendly use of pesticides or fungicides. Model performance evaluation was conducted using accuracy, precision, recall, and F1-score metrics to assess model effectiveness across various data scenarios. The results of this study are expected to not only contribute to the development of decision support systems for farmers and related parties but also support the application of artificial intelligence-based technology in the agricultural sector.
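
The KNN working principle described in this abstract (assign a new sample the majority class of its k nearest neighbors in feature space) can be sketched in a few lines. The two-dimensional feature vectors below stand in for extracted color and texture features; their values and the choice of `k = 3` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples closest to query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (mean hue, texture contrast) features for labeled leaf images.
train = [
    ((0.30, 0.10), "rust"),
    ((0.32, 0.12), "rust"),
    ((0.28, 0.11), "rust"),
    ((0.70, 0.50), "mosaic"),
    ((0.72, 0.55), "mosaic"),
    ((0.68, 0.52), "mosaic"),
]

print(knn_predict(train, (0.31, 0.12)))
print(knn_predict(train, (0.69, 0.50)))
```

An odd `k` avoids ties in a two-class setting. With real image features, scaling each feature to a comparable range first keeps one feature from dominating the distance.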

Angdresey, Apriandy; Sitanayah, Lanny; Rumpesak, Zefanya Marieke Philia; Ooi, Jing-Quan

Journal of Computing Theories and Applications 2025 Universitas Dian Nuswantoro

Electricity has emerged as an essential requirement in modern life. As demand escalates, electricity costs rise, making wastefulness a drain on financial resources. Consequently, forecasting electricity usage can enhance our management of consumption. This study presents an IoT-based monitoring and forecasting system for electricity consumption. The system comprises two NodeMCU micro-controllers, a PZEM-004T sensor for collecting real-time power data, and three relays that regulate the current flow to three distinct electrical appliances. The data gathered is transmitted to a web application utilizing the k-Nearest Neighbor (k-NN) algorithm to forecast future electricity usage based on historical patterns. We evaluated the system's performance using four weeks of electricity consumption data. The results indicated that predictions were most accurate when the user’s daily consumption pattern remained stable, achieving a Mean Absolute Error (MAE) of approximately 1 watt and a Mean Absolute Percentage Error (MAPE) ranging from 1% to 1.7%. Additionally, predictions were notably precise during the early morning hours (3:00 AM to 8:00 AM) when k=6 was employed. This study demonstrates the effectiveness of integrating IoT-based systems with machine learning for real-time energy monitoring and forecasting. Furthermore, it emphasizes the application of data mining techniques within embedded IoT environments, providing valuable insights into the implementation of lightweight machine learning for smart energy systems.
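
The two error measures reported in this abstract are defined as the mean absolute difference between actual and predicted values (MAE) and the mean absolute difference relative to the actual value, in percent (MAPE). A minimal sketch with invented power readings in watts; these are not the study's measurements.

```python
def mae(actual, predicted):
    """Mean Absolute Error, in the same unit as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error; actual values must be non-zero."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly readings and forecasts, in watts.
actual = [100.0, 80.0, 50.0]
predicted = [99.0, 81.0, 51.0]

error_w = mae(actual, predicted)
error_pct = mape(actual, predicted)
print(f"MAE = {error_w:.2f} W, MAPE = {error_pct:.2f}%")
```

The same 1 W error weighs more heavily in MAPE when the actual load is small, which is why the two measures are usually reported together.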

Ame Ananda Br Ginting; Novriyenni Novriyenni; Tio Ria Pasaribu

Repeater : Publikasi Teknik Informatika dan Jaringan 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to analyze the correlation between learning models and student achievement at SMA Negeri 1 Kuala by applying the Apriori algorithm in data mining, using Rapid Miner software as the primary tool for analysis. The research is motivated by the shift in educational approaches from conventional teacher-centered methods toward more innovative strategies such as project-based learning and cooperative learning, which are expected to foster higher levels of student engagement and improve academic outcomes. In many schools, particularly at the secondary level, the choice of learning model, availability of facilities, and attendance rates are crucial factors that shape learning effectiveness and student performance. The data collected in this study include student grades, the types of learning models implemented, school facility conditions, and attendance rates for the 2023/2024 academic year, covering a total of 680 students. The Apriori algorithm was employed to discover hidden patterns and associations among these variables, enabling the identification of relationships between learning factors and academic achievement. By applying Rapid Miner software, the research systematically generated association rules that reflect meaningful correlations in the dataset. The results indicated that the use of the Indonesian language subject in combination with a cooperative learning model, adequate and complete school facilities, and good student attendance was strongly associated with the attainment of an A grade. This finding was supported by a support level of 53.33% and a confidence level of 100%, suggesting a robust and reliable relationship between these factors. The implementation of data mining techniques through Rapid Miner not only allowed for efficient data processing but also provided practical recommendations for educators and school administrators in designing effective instructional strategies.
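
Support and confidence, the two measures quoted for the rule in this abstract, can be computed directly from a list of records. The sketch below covers only this rule-scoring step, not Apriori's candidate generation and pruning. The five student records are invented so that the rule reaches 100% confidence; they do not reproduce the study's 53.33% support.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical student records, one set of attribute values per student.
records = [
    {"cooperative", "facilities_complete", "attendance_good", "grade_A"},
    {"cooperative", "facilities_complete", "attendance_good", "grade_A"},
    {"cooperative", "facilities_complete", "attendance_good", "grade_A"},
    {"lecture", "facilities_complete", "attendance_good", "grade_B"},
    {"lecture", "facilities_partial", "attendance_poor", "grade_C"},
]
rule_if = {"cooperative", "facilities_complete", "attendance_good"}
rule_then = {"grade_A"}

rule_support = support(records, rule_if | rule_then)
rule_confidence = confidence(records, rule_if, rule_then)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

A confidence of 100% means that every record matching the left-hand side also has the right-hand side. It says nothing about records outside the dataset, so such a rule describes an association rather than a cause.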

Dina Amalia Putri; Naza Sefti Prianita; Elkin Rilvani

Jupiter: Publikasi Ilmu Keteknikan Industri, Teknik Elektro dan Informatika 2025 Asosiasi Riset Ilmu Teknik Indonesia

On-time graduation is an important indicator of the quality and effectiveness of higher education at universities. The on-time graduation rate not only affects institutional accreditation but is also a concern for campus management in designing learning strategies and academic guidance. This study applies and compares two data mining classification algorithms, C4.5 and K-Nearest Neighbor (KNN), for predicting whether students graduate on time. Predictions are based on academic attributes used as input variables: Grade Point Average (GPA), number of credits earned, and Semester Grade Point Average (IPS). The method follows Knowledge Discovery in Databases (KDD), which includes data selection, preprocessing, transformation, data mining, and evaluation of results. The study was conducted with the RapidMiner tool on a dataset of 279 Informatics Study Program students from the 2015 to 2019 intakes. The data were classified into two categories: "graduated on time" and "not graduated on time". The test results showed that KNN performed better than C4.5. KNN produced an accuracy of 76.08%, with a precision of 73.11% and a recall of 41.92%, while C4.5 produced an accuracy of 73.49%, with a precision of 64.62% and a recall of 41.89%. This difference in accuracy indicates that KNN captured patterns in the data more effectively and gave more accurate predictions in this context. KNN can therefore be considered the better-suited method for helping universities predict on-time graduation, enabling early intervention for students at risk of graduating late. This research also contributes to the development of data mining-based academic decision support systems in higher education.

Feronika, Fadia; Ariesanto Ramdhan, Nur; Mohamad Herdian Bhakti, Raden

Jurnal Elektronika dan Komputer 2025 STEKOM PRESS

Diabetes mellitus is a chronic disease whose number of sufferers continues to grow every year, including in the area served by the Brebes community health center (Puskesmas Brebes). The large number of patients with varied clinical conditions calls for a method to group patients by severity. This study aims to apply the K-Means algorithm to group diabetes mellitus patients using several clinical parameters: fasting blood glucose (GDP), HbA1c level, total cholesterol (CHOL), and systolic and diastolic blood pressure. The approach is descriptive quantitative, with a data mining method based on the K-Means algorithm. The data were obtained from the medical records of Puskesmas Brebes. The clustering process produced three groups: low, medium, and high risk. The results show that the K-Means algorithm was able to group patient data accurately according to severity level. The results were then visualized through a web-based system intended to make it easier for health center staff to analyze patient conditions and to support more effective medical decision-making.

Agung Permana, Tegar; Saeful Bachri, Otong; Herdian Bhakti, RM

Jurnal Elektronika dan Komputer 2025 STEKOM PRESS

Traffic accidents in Brebes Regency are a critical problem because of the high frequency of incidents in the area. This study aims to determine accident-prone areas using the K-Means clustering algorithm, which supports data-driven decision-making. The main issue explored is how the K-Means algorithm can be implemented to group accident-prone zones and raise public awareness of road safety. The methodology comprises data collection through literature review, direct observation, and interviews, followed by use of the K-Means algorithm to group accident data by number of incidents, fatalities, and injuries. The findings show that the K-Means algorithm effectively groups accident-prone locations into three distinct risk levels: high, medium, and low. This classified information can therefore help the relevant authorities improve traffic safety measures and educate the public about high-risk areas. The results are expected to contribute to the development of better-informed and more strategic traffic safety policy in Brebes Regency.
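
The grouping step in this abstract can be illustrated with plain K-Means (Lloyd's algorithm): assign each location to its nearest centroid, recompute the centroids, and repeat until they stop moving. The sketch assumes three features per location (incidents, fatalities, injuries), as the abstract lists. The six locations and the starting centroids are invented, and fixed starting centroids are used only to keep the example reproducible.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm; centroids holds the starting centers."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical locations: (incidents, fatalities, injuries) per year.
locations = [(2, 0, 1), (3, 0, 2), (10, 1, 8), (12, 2, 9), (30, 5, 25), (28, 6, 27)]
start = [(2, 0, 1), (10, 1, 8), (30, 5, 25)]

final_centroids, clusters = k_means(locations, start)
for level, center in zip(("low", "medium", "high"), final_centroids):
    print(level, center)
```

K-Means itself returns unlabeled groups. Calling them low, medium, and high risk is an interpretation made afterwards by ordering the centroids.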

Abdah Syakiroh Gustian; Fathoni Mahardika

Jupiter: Publikasi Ilmu Keteknikan Industri, Teknik Elektro dan Informatika 2025 Asosiasi Riset Ilmu Teknik Indonesia

This study aims to develop an accurate predictive model for identifying students at risk of academic dropout using Decision Tree and Random Forest algorithms. The research utilizes a publicly available dataset sourced from Kaggle, which includes academic and demographic features such as GPA, attendance, credit load, financial aid status, and exam scores. The methodology involves several stages: data collection, preprocessing (handling missing values, encoding categorical variables, and feature scaling), model training, and evaluation using performance metrics such as Accuracy, Precision, Recall, F1-Score, and Confusion Matrix. Results show that the Random Forest algorithm outperforms Decision Tree in terms of accuracy and robustness, with notable feature importance on math, reading, and writing scores. The findings highlight the potential of machine learning in early detection of dropout risks and provide actionable insights for academic institutions to design timely interventions. This research contributes to the growing field of educational data mining and supports data-driven decision-making processes in higher education management.

Eka Wulansari Fidayanthie; Asep Sayfulloh; Mardiana Rafa Alzena; Nilam Kurnia Sari

Saturnus: Jurnal Teknologi dan Sistem Informasi 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Lungs are vital organs in the human respiratory system, responsible for fulfilling the body's oxygen needs. If the lungs experience health problems, it can have adverse effects on the human respiratory system. Common causes of lung diseases are usually due to inhaling air contaminated by dust, smoke, viruses, and bacteria. This study aims to compare the performance of two classification algorithms, namely Random Forest and Naive Bayes, in predicting lung diseases. The data used was obtained from the Kaggle website and processed using RapidMiner software. The attributes involved include smoking habits, pre-existing conditions, staying up late, exercise activities, age, and outcomes. Based on the test results, the Random Forest algorithm demonstrated the best performance with an accuracy of 93%, while the Naive Bayes algorithm achieved an accuracy of 87%. These findings indicate that the Random Forest algorithm outperforms the Naive Bayes algorithm in terms of lung disease prediction accuracy.

Herdina Putri Ahmadi; Magdalena Simanjuntak; Muammar Khadapi

Saturnus: Jurnal Teknologi dan Sistem Informasi 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Crime is a social issue that continues to evolve alongside increasing community activity and regional development. This study aims to cluster crime data in Binjai City based on the location of incidents using the K-Means algorithm and the Cross Industry Standard Process for Data Mining (CRISP-DM) approach. The data were obtained from the Binjai Police Department, with attributes including the type of crime, time of occurrence, and location, categorized by district. A comprehensive data preprocessing stage was carried out, involving the extraction of information from raw data, normalization of crime type labels, and conversion of categorical data into numerical form using label encoding. The optimal number of clusters was determined using the Silhouette score method, which yielded the best result at K = 10. The clustering results were further evaluated using the Davies-Bouldin Index (DBI) to ensure cluster quality. The analysis revealed that Binjai Utara District has the highest number of crimes, particularly aggravated theft (curat), which frequently occurs from early morning to late morning. This clustering is expected to provide valuable insights for authorities in formulating more targeted and data-driven regional security strategies.
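
The Davies-Bouldin Index used here to check cluster quality averages, over all clusters, the worst-case ratio of within-cluster scatter to between-centroid distance; lower values indicate tighter, better separated clusters. A minimal sketch using the standard definition with Euclidean distance, on invented two-dimensional points rather than the encoded crime data:

```python
import math


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of points)."""
    centroids = [tuple(sum(col) / len(pts) for col in zip(*pts)) for pts in clusters]
    # Scatter: mean distance from each point to its own centroid.
    scatter = [
        sum(math.dist(p, c) for p in pts) / len(pts)
        for pts, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
    return total / k


# Two hypothetical clusterings: well separated versus close together.
separated = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
close = [[(0, 0), (0, 2)], [(3, 0), (3, 2)]]

dbi_separated = davies_bouldin(separated)
dbi_close = davies_bouldin(close)
print(dbi_separated, dbi_close)
```

The separated clustering scores lower, as expected. The index needs at least two clusters with distinct centroids, otherwise the ratio is undefined.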

Fathoni Dwi Atmoko

Uranus: Jurnal Ilmiah Teknik Elektro, Sains dan Informatika 2025 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Public transportation, with Transjakarta as its main pillar, requires a deep understanding of customer behavior to improve service quality and maintain loyalty. This study aims to segment Transjakarta customers using data mining techniques, specifically the K-Means Clustering algorithm, based on the RFM (Recency, Frequency, Monetary/Value) behavioral model. 37,900 rows of raw transaction data were processed into a clean database, resulting in 1,917 unique customers for analysis. The RFM metrics were then normalized using Min-Max Scaler. The optimal number of clusters was evaluated using the Elbow Curve and Silhouette Score Methods, which led to the determination of k = 4 clusters. The segmentation results identified four customer groups requiring specific strategies: Cluster 3 (Champions) with high R, F, and V (requiring rewards and retention); Cluster 0 (Active, Low Value) with high R and F but low V (requiring upsells and cross-sells); Cluster 1 (Potential/At-Risk); and Cluster 2 (Dormant/Lost). Preliminary analysis (EDA) showed that nearly half of customers (49.3%) used Bank DKI cards, dominated by the productive age group (25–45 years old), with the Rusun Kapuk Muara–Penjaringan route being the busiest. The main managerial recommendation is to strengthen the partnership with Bank DKI and optimize services in this busy corridor.
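
Min-Max scaling, which this abstract applies to the RFM metrics before clustering, maps each column to the range [0, 1] so that no single metric dominates the distance computation in K-Means. A minimal sketch with three invented customers; the column values are placeholders and not the Transjakarta data.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical RFM table, one entry per customer in each column.
rfm = {
    "recency": [1, 10, 30],               # days since last trip
    "frequency": [20, 5, 1],              # trips in the period
    "monetary": [70000, 17500, 3500],     # total fare paid
}

scaled = {metric: min_max_scale(column) for metric, column in rfm.items()}
for metric, column in scaled.items():
    print(metric, [round(v, 3) for v in column])
```

Without scaling, the monetary column (tens of thousands) would swamp recency and frequency in any Euclidean distance, and the clusters would effectively be formed on spending alone.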