SciRepID - Scientific Publication Search

Implementasi model Naive Bayes multikategori untuk analisis sentimen produk Wardah di E-commerce Shoppe

Mesra Betty Yel; Sopan Adrianto; Rasiban Rasiban; Eva Widiyanti

International Journal of Information Engineering and Science • 2026 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The growth of information technology has driven changes in consumer behavior, one of which is through e-commerce platforms such as Shopee. This phenomenon has generated a large number of customer reviews, including those for local cosmetic products such as Wardah. These reviews serve as an important source of information for understanding customer perceptions and satisfaction levels. However, manual analysis of large and linguistically diverse datasets is inefficient and potentially subjective. This study aims to implement the multi-category Naive Bayes algorithm to classify the sentiment of Wardah product reviews on Shopee into three categories: positive, negative, and neutral. The data were collected using a web scraping technique and processed through a series of preprocessing stages including case folding, tokenization, stopword removal, stemming, and text cleaning. Subsequently, term weighting was performed using the TF-IDF method prior to classification. Model performance was evaluated using a confusion matrix as well as accuracy, precision, and recall metrics. The results indicate that the multi-category Naive Bayes algorithm achieved an accuracy of 86.00%, a precision of 86.63%, and a recall of 98.24%. This approach can assist business practitioners in objectively understanding customer opinions and support decision-making in business strategy and product development.
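The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stopword list and the six sample reviews are hypothetical, stemming is omitted, and the classifier works on raw term counts with Laplace smoothing rather than on TF-IDF weights.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real pipeline would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}

def preprocess(text):
    """Case folding, text cleaning, tokenization, and stopword removal."""
    text = text.lower()                      # case folding
    text = re.sub(r"[^a-z\s]", " ", text)    # cleaning: keep letters and spaces
    tokens = text.split()                    # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

class MultinomialNB:
    """Multinomial Naive Bayes over term counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.n_docs = len(labels)
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n / self.n_docs)          # log prior
            for t in tokens:
                if t in self.vocab:                    # ignore unseen terms
                    count = self.term_counts[label][t]
                    score += math.log((count + self.alpha)
                                      / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labelled reviews, for illustration only.
reviews = [
    ("Produk bagus dan wangi", "positive"),
    ("Bagus, pengiriman cepat", "positive"),
    ("Kemasan rusak, kecewa", "negative"),
    ("Produk palsu dan kecewa", "negative"),
    ("Barang sudah sampai", "neutral"),
    ("Pesanan sudah diterima", "neutral"),
]
model = MultinomialNB().fit([preprocess(text) for text, _ in reviews],
                            [label for _, label in reviews])
print(model.predict(preprocess("Produk bagus sekali")))
```

Working in log space avoids numeric underflow when many term probabilities are multiplied together.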

https://doi.org/10.62951/ijies.v2i2.6

ANALISIS SENTIMEN TRENDING TOPIK #INDONESIAGELAP DI MEDIA SOSIAL X MENGGUNAKAN ALGORITMA NAIVE BAYES BERBASIS PARTICLE SWARM OPTIMIZATION

Untung Surapati; Veri Arinal; Tri Wahyudi; Ahmad Fauzan

International Journal of Applied Mathematics and Computing • 2026 • Asosiasi Riset Ilmu Matematika dan Sains Indonesia

The rise of social media has created a digital public sphere that enables users to express their opinions on social and political issues openly and in real-time. One of the most discussed topics on social media platform X is the trending hashtag #IndonesiaGelap, which reflects public concern and criticism regarding various governmental and societal conditions. This study aims to conduct sentiment analysis on tweets containing the hashtag to determine the overall sentiment trend among users. The method employed in this research is the Naive Bayes classification algorithm, known for its simplicity and effectiveness in text classification. To enhance the model’s performance, Particle Swarm Optimization (PSO) is applied to optimize feature selection and parameter tuning. The dataset consists of public tweets collected via the Twitter API, followed by preprocessing, feature extraction using TF-IDF, and sentiment classification into three categories: positive, negative, and neutral. The results indicate that the integration of PSO significantly improves the classification accuracy of the Naive Bayes model compared to the baseline. The majority of tweets related to #IndonesiaGelap exhibit a negative sentiment, indicating widespread public dissatisfaction and criticism. This research is expected to contribute to a better understanding of public perception and serve as valuable input for stakeholders in addressing social issues in the digital age.
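The abstract does not say how PSO was configured, so the following is only a generic sketch of the particle update rule, applied to tuning a single continuous parameter. The function `validation_error` is a hypothetical stand-in for whatever objective the authors used, and the inertia and acceleration values are common textbook defaults, not values taken from the paper.

```python
import random

def pso_minimize(objective, lower, upper, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a one-dimensional objective with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                                  # personal best positions
    pbest_val = [objective(x) for x in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]       # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity: inertia + pull toward personal best + pull toward global best.
            vel[i] = (inertia * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            # Move the particle and clamp it to the search interval.
            pos[i] = min(max(pos[i] + vel[i], lower), upper)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val

def validation_error(alpha):
    # Hypothetical objective: pretends the error is lowest at alpha = 0.3.
    return (alpha - 0.3) ** 2 + 0.1

best_alpha, best_error = pso_minimize(validation_error, 0.0, 1.0)
print(best_alpha, best_error)
```

For feature selection, a binary PSO variant would be needed instead, in which each particle encodes a keep-or-drop mask over the TF-IDF features.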

https://doi.org/10.62951/ijamc.v2i2.127

Komparasi Algoritma SVM dan Random Forest Dalam Sentimen Analisis Review Shopee di Google Play Store Dengan Anova

Eko Susanto; Sharipuddin Sharipuddin; Benni Purnama

Prosiding Seminar Nasional Ilmu Teknik • 2026 • Asosiasi Riset Ilmu Teknik Indonesia

The rapid growth of e-commerce in Indonesia, particularly the Shopee platform, has generated a large volume of user reviews on the Google Play Store, which can be analyzed to understand consumer sentiment. This study aims to compare the performance of the Support Vector Machine (SVM) and Random Forest (RF) algorithms in binary sentiment classification (positive and negative) on Shopee reviews, as well as to statistically test the significance of their differences using One-Way ANOVA. A total of 400,498 reviews were collected via web scraping, preprocessed through text normalization, tokenization, and Indonesian language stemming, and then feature-extracted using TF-IDF and Count Vectorizer. Evaluation results show that SVM achieved an accuracy of 91.77%, precision of 91.49%, recall of 91.77%, and F1-Score of 91.56%, while RF achieved an accuracy of 90.07%, precision of 91.68%, recall of 90.07%, and F1-Score of 90.55%. ANOVA confirmed that the performance difference between the two algorithms is statistically significant (p-value = 0.0007) with a large effect size (η² = 0.1815). Therefore, SVM is recommended as a more optimal and consistent algorithm for automated sentiment analysis of Indonesian e-commerce reviews, while also providing a replicable methodological framework for similar future research.
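To make the ANOVA step concrete, here is a minimal sketch of how a one-way F statistic and the effect size η² could be computed from per-fold accuracy scores. The two score lists are hypothetical, since the abstract does not publish the per-run values, and the p-value is left out because it requires the F distribution (available in, for example, SciPy).

```python
def one_way_anova(groups):
    """Return (F statistic, eta squared) for a list of score groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Variation between group means and variation inside each group.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, means))
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, means) for x in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared

# Hypothetical per-fold accuracies in percent, for illustration only.
svm_scores = [92.0, 91.0, 93.0, 92.0]
rf_scores = [90.0, 89.0, 91.0, 90.0]
f_stat, eta_squared = one_way_anova([svm_scores, rf_scores])
print(f_stat, eta_squared)
```

With only two groups, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic, so both tests lead to the same conclusion.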

https://doi.org/10.61132/prosemnasproit.v2i2.177

Klasifikasi Berita Hoaks Menggunakan Algoritma Support Vector Machine

Putri Ramadani; Nur Aisyah Pandia; Salsabila Putri Hati Siregar

Prosiding Seminar Nasional Ilmu Teknik • 2026 • Asosiasi Riset Ilmu Teknik Indonesia

The spread of hoax news in digital media is a serious problem because it can affect public opinion and social stability. This study aims to classify hoax news using the Support Vector Machine (SVM) algorithm. The dataset is a hoax clarification dataset from the Ministry of Communication and Digital Affairs (Komdigi) of the Republic of Indonesia, totaling 1,872 records. The research process includes data collection, text preprocessing, feature extraction using TF-IDF, and classification using the SVM algorithm. The implementation was carried out in Google Colaboratory (Google Colab). Test results show that the SVM algorithm classifies hoax news by topic with satisfactory accuracy, precision, recall, and F1-score values.

https://doi.org/10.61132/prosemnasproit.v2i2.201

Analisis Sentimen Publik pada TikTok terhadap Rencana Penerapan Sistem Balik Nama Ponsel Bekas menggunakan Naive Bayes dan Support Vector Machine

Afif Lustyo Muji; Aziz Musthofa; Dihin Muriyatmoko

Prosiding Seminar Nasional Ilmu Teknik • 2026 • Asosiasi Riset Ilmu Teknik Indonesia

Since the announcement of the policy plan for a name transfer system in the sale of used mobile phones, the issue has attracted widespread public attention and discussion. People have expressed their opinions on social media platforms, particularly TikTok. This study aims to classify the sentiment of TikTok users using the Naive Bayes and Support Vector Machine (SVM) algorithms. The data were collected by scraping comments on related content. The research stages include text preprocessing, sentiment labeling into positive, negative, and neutral categories, and feature extraction using TF-IDF. Both classifiers are then evaluated on accuracy, precision, recall, and F1-score. The results indicate that both methods classify sentiment effectively, but SVM outperforms Naive Bayes, with an accuracy of 99.57% compared to 94.30%. This study is expected to help the government understand public responses to the planned name transfer policy for used mobile phones.
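The evaluation metrics named above can all be derived from a confusion matrix. The sketch below shows one way to do this for a three-class problem; the matrix values are hypothetical and are not results from the study.

```python
def classification_report(confusion, labels):
    """Compute accuracy and per-class (precision, recall, F1).

    confusion[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total

    per_class = {}
    for i, label in enumerate(labels):
        true_pos = confusion[i][i]
        predicted = sum(confusion[r][i] for r in range(n))   # column sum
        actual = sum(confusion[i])                           # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f1)
    return accuracy, per_class

# Hypothetical confusion matrix, for illustration only.
labels = ["positive", "negative", "neutral"]
confusion = [
    [8, 1, 1],   # true positive-class reviews
    [2, 6, 2],   # true negative-class reviews
    [0, 1, 9],   # true neutral-class reviews
]
accuracy, per_class = classification_report(confusion, labels)
print(accuracy, per_class)
```

Reporting per-class values alongside accuracy matters when classes are imbalanced, because a high overall accuracy can hide poor recall on a small class.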

https://doi.org/10.61132/prosemnasproit.v2i2.198

Analisis Sistem Temu Balik Sertifikat Pendidik Menggunakan Metode Cosine Similarity

Rangga Wahyu Dealova; Deo Pradana; Ali Akbar Ramadhan; Safrizal Safrizal

Jurnal Kendali Teknik dan Sains • 2026 • International Forum of Researchers and Lecturers

Educator certificates are official documents that play a crucial role for teachers, as they serve as legal proof of professional competence and are required for various administrative purposes, such as professional allowance applications, promotion, transfer, and institutional accreditation. Along with the increasing number of educators in Indonesia, the volume of educator certificate data managed by educational institutions has also grown significantly. However, certificate management is still largely conducted in a conventional manner, functioning merely as digital or physical archives without an effective search mechanism, resulting in inefficiencies and difficulties in retrieving relevant documents. Therefore, an information retrieval approach is needed to support fast and accurate document searching. This study aims to analyze and implement an information retrieval system for educator certificates using the Cosine Similarity method. The research data consist of educator certificate documents, including professional educator certificates, training certificates, and competency certificates. The retrieval process involves text preprocessing, term weighting using TF-IDF, and similarity measurement using Cosine Similarity. The results show that document d1 (Professional Mathematics Educator Certificate) has the highest similarity value to the query “educator certificate,” as it contains all query terms with relatively high TF-IDF weights. Document d3 ranks second due to partial term similarity, while document d2 has the lowest similarity value because it shares only one common term with the query. These findings indicate that the Cosine Similarity method is effective in ranking educator certificate documents based on their content relevance in an objective and measurable manner. The proposed system can improve the efficiency and accuracy of educator certificate document management and retrieval in educational institutions.
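The retrieval process described here (TF-IDF weighting followed by cosine similarity) can be sketched as below. The three document texts are hypothetical stand-ins chosen only to illustrate the ranking behaviour; they are not the study's data. The smoothed IDF formula is one common variant and is an assumption, since the abstract does not state which formula was used.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) and the IDF table for a corpus."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    # Smoothed IDF, so terms found in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = {name: {t: count * idf[t] for t, count in Counter(tokens).items()}
               for name, tokens in tokenized.items()}
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    """Return (document name, score) pairs, most similar first."""
    vectors, idf = tfidf_vectors(docs)
    query_counts = Counter(query.lower().split())
    # Query terms that never occur in the corpus are ignored.
    query_vec = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scores = {name: cosine(query_vec, vec) for name, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical document texts, for illustration only.
docs = {
    "d1": "professional mathematics educator certificate",
    "d2": "teacher training workshop attendance certificate",
    "d3": "educator competency",
}
for name, score in rank("educator certificate", docs):
    print(name, round(score, 3))
```

In this toy corpus, the document containing both query terms ranks first, and among documents sharing a single term the shorter one scores higher because cosine similarity normalizes by vector length.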

https://doi.org/10.59581/jkts-widyakarya.v4i1.5877

Sistem Temu Balik Informasi Data Penerima Bantuan PIP pada Sekolah Menengah Pertama Menggunakan Metode TF-IDF

Achmad Faris Fadhlulah; Dika Arif Sihombing; Muhammad Fahri Rinanda; Rizki Riandi; Sotar Ferdinand Hutabarat

Jurnal Kendali Teknik dan Sains • 2026 • International Forum of Researchers and Lecturers

The Indonesia Smart Program (Program Indonesia Pintar/PIP) is a government initiative aimed at ensuring equal access to education for students from underprivileged families, including those at the junior high school (SMP) level. However, at the school level, the management of PIP recipient data still faces several challenges, particularly in data searching and utilization, due to the increasing volume of data and the use of simple or manual search methods. These conditions can lead to delays in obtaining information and reduce the accuracy of decision-making. Therefore, an effective information retrieval system is needed to manage and search PIP recipient data efficiently. This study aims to design and develop an Information Retrieval System for PIP recipient data at the junior high school level using the Term Frequency–Inverse Document Frequency (TF-IDF) method. The TF-IDF method is applied to assign weights to terms in each document, enabling the system to identify and rank documents based on their relevance to user queries. The test results show that the system is able to measure document relevance accurately, where documents D3 and D4 obtain the highest similarity value of 0.099586089 and are classified as highly relevant, while other documents show lower similarity values down to zero. These results are also supported by graphical visualization, which helps users compare relevance levels more clearly. Thus, the implementation of the TF-IDF method has proven to be effective in supporting accurate, efficient, and systematic searching and management of PIP recipient data at the junior high school level.
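For reference, one common formulation of the TF-IDF weight is given below. The abstract does not state which variant the system uses, so the symbols here are assumptions: tf(t, d) is the number of times term t occurs in document d, N is the number of documents, and df(t) is the number of documents containing t.

```latex
w_{t,d} = \mathrm{tf}(t,d) \cdot \log\frac{N}{\mathrm{df}(t)}
```

Under this formulation, a document that contains none of the query terms has a weight of zero for every query term, so any similarity score built from those weights is also zero. This is consistent with the reported similarity values falling to zero for non-matching documents.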

https://doi.org/10.59581/jkts-widyakarya.v4i1.5875

Analisis Sentimen X: Kegagalan Timnas ke Piala Dunia 2026 dengan Naive Bayes

Aditya Abdulloh Masykur; Rino Raihan Gumilang; Harun Al Rosyid

Jurnal Elektronika dan Komputer • 2026 • STEKOM PRESS

The performance of the Indonesian National Team (Timnas) in the 2026 World Cup qualifications has triggered massive and diverse responses on social media, particularly on platform X. This study aims to identify and classify public sentiment regarding Timnas Indonesia's performance into positive, negative, and neutral categories using a data mining approach. Text data was processed through pre-processing stages, term weighting using TF-IDF, and the application of the Synthetic Minority Over-sampling Technique (SMOTE) to address significant class distribution imbalance. The classification algorithm employed was Multinomial Naïve Bayes. Model performance evaluation was conducted by comparing two training-testing data split scenarios: 90:10 and 80:20 ratios. The results indicate that public opinion is dominated by negative sentiment at 73.2%, reflecting public disappointment. In terms of model performance, the 90:10 ratio scenario yielded the best accuracy of 80%, outperforming the 80:20 ratio which recorded an accuracy of 75%. These findings demonstrate that combining Multinomial Naïve Bayes with the SMOTE technique is effective in handling imbalanced text data and is capable of accurately mapping public perception.
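The core idea of SMOTE, as used here to rebalance the classes, is to create synthetic minority samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea on hypothetical two-dimensional feature vectors; it is a simplified version, and the parameter names are illustrative rather than taken from the study.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic samples by interpolating between minority neighbours.

    minority must contain at least two feature vectors (lists of floats).
    """
    rng = random.Random(seed)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()   # random position along the connecting segment
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class feature vectors, for illustration only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_synthetic=6)
print(new_samples)
```

Oversampling is usually applied to the training portion only, after the train/test split, so that synthetic points derived from test samples do not leak into training.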

https://doi.org/10.51903/elkom.v18i2.3371
