SciRepID - Scientific Publication Search

Analisis Sentimen Publik Terhadap Kebijakan Efisiensi Anggaran Menggunakan Naive Bayes, dan SVM

Elin Tamaya; Sharipuddin Sharipuddin; Nurhadi Nurhadi

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Budget efficiency is an important issue in state financial management because it is directly related to government spending priorities and their impact on public service programs. Discussions about budget efficiency policies are widespread on social media platform X and generate diverse public responses, so an automated approach is needed to understand public opinion trends more quickly and objectively. This research aims to analyze the sentiment of Indonesian people toward budget efficiency policies and compare the performance of the Naïve Bayes and Support Vector Machine (SVM) algorithms in classifying sentiment. The data comprise 10,909 Indonesian-language tweets sourced from a public dataset, which were processed through preprocessing stages including cleaning, case folding, normalization, tokenization, stopword removal, and stemming. Sentiment labeling was performed automatically using the Indonesian Sentiment Lexicon (InSet) approach to categorize data into positive, negative, and neutral sentiments. Feature extraction was performed using Term Frequency–Inverse Document Frequency (TF-IDF), and the data were then divided into training and testing sets with an 80:20 ratio. Model performance was evaluated using a confusion matrix and the metrics of accuracy, precision, recall, and F1-score. The results show that the sentiment distribution is dominated by negative sentiment at 56.78%, followed by positive sentiment at 37.40% and neutral sentiment at 5.83%. In the classification stage, SVM performed best with an accuracy of 86%, while Naïve Bayes achieved an accuracy of 74%. These findings indicate that SVM is better suited to sentiment classification on social media text data and can be used to support more effective analysis of public response to budget efficiency policies.

https://doi.org/10.61132/prosemnasproit.v2i2.170
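
Illustrative note: the lexicon-based labeling step mentioned in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The lexicon entries, their weights, the function name `label_sentiment`, and the "sum of weights, then sign" rule are all hypothetical stand-ins; the abstract does not say how InSet scores were mapped to the three labels.

```python
# Toy lexicon: hypothetical words and weights, NOT the real InSet entries.
TOY_LEXICON = {"bagus": 3, "hemat": 2, "buruk": -3, "boros": -2}


def label_sentiment(tokens, lexicon=TOY_LEXICON):
    """Sum the lexicon weights of the tokens and map the total to a label."""
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # Assumed rule: a total of exactly zero (or no lexicon hits) is neutral.
    return "neutral"
```

Under this assumed rule, a tweet with no lexicon hits is labeled neutral.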

Klasifikasi Sentimen Ulasan Produk Olahraga di Tokopedia Menggunakan Metode Machine Learning dengan Pendekatan TF-IDF

Fransiskus Dapot Sihaloho; Jasmir Jasmir; Gunardi Gunardi

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

The rapid growth of e-commerce platforms in Indonesia, particularly Tokopedia, has resulted in a large volume of consumer reviews containing valuable information regarding customer perceptions and satisfaction. However, manual analysis of such reviews is inefficient and prone to subjectivity, necessitating an automated approach based on machine learning. This study aims to classify the sentiment of sports product reviews on Tokopedia into positive, negative, and neutral categories by applying Logistic Regression, Support Vector Machine (SVM), and Random Forest using the Term Frequency–Inverse Document Frequency (TF-IDF) approach. The data were collected through web scraping of Indonesian-language sports product reviews and processed through several preprocessing stages, including data cleaning, case folding, tokenization, stopword removal, and stemming. Feature representation was performed using TF-IDF to transform textual data into numerical vectors, after which the dataset was divided into training and testing sets with an 80:20 ratio. Model performance was evaluated using accuracy, precision, recall, and F1-score metrics. The results indicate that the application of TF-IDF significantly improves the performance of all models, with SVM consistently achieving the best performance compared to Logistic Regression and Random Forest. These findings demonstrate that classical machine learning algorithms combined with TF-IDF remain highly effective for sentiment analysis of Indonesian-language text. The study is expected to help sellers understand customer opinions, support consumers in making informed purchasing decisions, and serve as a foundation for the development of sentiment analysis and recommendation systems on e-commerce platforms.

https://doi.org/10.61132/prosemnasproit.v2i2.130
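
Illustrative note: the TF-IDF weighting named in the abstract above turns each tokenized review into term weights. The sketch below uses the textbook form, term frequency times log(N / document frequency). That choice is an assumption: the abstract does not state which TF-IDF variant was used, and common libraries apply smoothed or normalized versions. The function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times unsmoothed inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

In this unsmoothed form, a term that appears in every document gets weight zero, which is why words shared by all reviews contribute nothing to the classifier.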

Penerapan NLP Menggunakan Algoritma Naive Bayes, C4.5, XGBoost untuk Analisis Sentimen Ulasan Produk Kecantikan di Tokopedia dan Shopee

Srikandi Alifya; Jasmir Jasmir; Elvi Yanti

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

The growth of e-commerce in Indonesia has led to an increase in product reviews, including for beauty products on Tokopedia and Shopee. These reviews are an important source of information for assessing consumer satisfaction; however, manually analyzing thousands of reviews daily is impractical. This study applies Natural Language Processing (NLP) with the Naive Bayes, C4.5, and XGBoost algorithms to classify sentiment in Indonesian-language reviews. The dataset consists of 76,256 reviews labeled as positive, negative, or neutral. The research stages include text preprocessing, feature representation using Bag-of-Words (BoW) and TF-IDF, data balancing through SMOTE, and model performance evaluation based on accuracy, precision, and recall. Differences in results among the algorithms were analyzed using ANOVA. The results show that Naive Bayes achieved the highest accuracy at 67.71%, followed by XGBoost at 65.91% and C4.5 at 58.39%, with Naive Bayes performing best in identifying positive and negative sentiments, while XGBoost and C4.5 handled more complex data patterns effectively. These findings provide guidance for sentiment analysis in Indonesian and support businesses in obtaining automated insights from customer reviews to improve product quality and services.

https://doi.org/10.61132/prosemnasproit.v2i2.71
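
Illustrative note: SMOTE, the balancing step named in the abstract above, creates synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows only that core idea. It picks base points at random rather than iterating over every minority sample, so it is a simplification and not the full algorithm or the authors' setup. The function name `smote_samples` and the parameters `k` and `seed` are hypothetical, and it assumes at least two minority samples.

```python
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies on a segment between two real minority samples, oversampling this way never leaves the region already spanned by the minority class.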

Analisis Sentimen Ulasan Penggunaan Aplikasi Maxim Pada Google Play Store Menggunakan Algoritma Naive Bayes, SVM, CatBoost Berbasis NLP

Nanda Mediya Sari; Jasmir Jasmir; Elvi Yanti

Prosiding Seminar Nasional Ilmu Teknik • 2025 • Asosiasi Riset Ilmu Teknik Indonesia

Sentiment analysis is a technique in Natural Language Processing (NLP) used to identify user opinion tendencies based on textual reviews. This study analyzes user reviews of the Maxim application on the Google Play Store and compares three machine learning algorithms for sentiment classification: Naïve Bayes, Support Vector Machine (SVM), and CatBoost. The research stages include data collection, text preprocessing, feature extraction using TF-IDF and Chi-Square, class balancing using SMOTE, and performance evaluation through accuracy, precision, recall, and F1-score. ANOVA is used to examine the influence of feature selection on model performance. The results show that each model exhibits different performance levels across the tested feature combinations. CatBoost achieved the highest accuracy, 99.26%, and demonstrated the most stable performance. Meanwhile, the Naïve Bayes and SVM models experienced performance decreases in the experiments, especially after applying SMOTE. These findings indicate that the choice of algorithm, feature extraction method, and class balancing technique significantly affects classification outcomes. Overall, CatBoost is identified as the best-performing model, providing more consistent classification results in accordance with the characteristics of the user reviews.

https://doi.org/10.61132/prosemnasproit.v2i2.65
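
Illustrative note: the Chi-Square step named in the abstract above scores how strongly a term is associated with a sentiment class. The sketch below computes the standard chi-square statistic on a 2x2 table of term presence against class membership. Using presence or absence, rather than TF-IDF values, is an assumption, as are the function name `chi_square` and the sample tokens; the abstract does not describe the exact procedure.

```python
def chi_square(docs, labels, term, target):
    """Chi-square statistic for the 2x2 table (term present?) x (label == target?)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, target class
        elif has_term:
            b += 1  # term present, other class
        elif in_class:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A term that occurs in every document (or in none) carries no class signal.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom
```

Terms with the highest scores would be kept as features; a term that appears in all reviews scores zero.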

Mapping Public Sentiment on Generative AI via Twitter NLP and Topic Modeling

Noronha, Marcelino Caetano; Dwiasnati, Saruni; Helena P Panjaitan, Cherlina

Journal of Information Technology and Computer Science • 2025 • International Forum of Researchers and Lecturers

The rapid diffusion of Generative Artificial Intelligence (AI) has intensified public debate regarding its benefits, risks, and societal implications. This study investigates public sentiment and thematic structures surrounding Generative AI by analyzing Twitter discourse as a representation of large-scale, real-time public perception. The research addresses two main problems: how public sentiment toward Generative AI is distributed and what dominant themes shape this perception. Accordingly, the objective is to map both emotional polarity and thematic narratives embedded in social media conversations. A computational mixed-methods approach was employed using a dataset of 12,470 tweets collected on 17 December 2024. Sentiment classification was conducted using a transformer-based DistilBERT model, while semantic representations were generated with Sentence-BERT. Topic modeling was performed using BERTopic, integrating HDBSCAN clustering and class-based TF-IDF to extract coherent and interpretable topics. Human-in-the-loop validation supported the interpretive robustness of topic labeling. The findings reveal that public sentiment toward Generative AI is predominantly positive (41.8%), particularly in relation to productivity enhancement, education, and creative applications. Neutral sentiment (31.4%) reflects informational discourse, while negative sentiment (26.8%) centers on ethical concerns, privacy risks, misinformation, and AI hallucinations. Seven dominant topics were identified, with clear topic–sentiment alignment showing optimism in utility-driven themes and skepticism in ethics- and risk-related discussions. In conclusion, public perception of Generative AI is dualistic, characterized by strong enthusiasm alongside persistent caution. These results provide empirical insights for AI governance, responsible innovation, and future research on socio-technical impacts of Generative AI.

https://doi.org/10.70062/globalscience.v1i4.183

Analisis Sentimen Ulasan E-Commerce Menggunakan Metode SVM

Ryzal Nur Alvandy; Arita Witianti

Jurnal Elektronika dan Komputer • 2025 • STEKOM PRESS

The rapid expansion of e-commerce in Indonesia has resulted in a significant rise in the number of customer reviews, which serve as a valuable source of insight for understanding consumer satisfaction. This study aims to classify the sentiment of product reviews on the Tokopedia platform into three categories using the Support Vector Machine (SVM) algorithm. The data were ethically collected through web scraping and include review text, ratings, and the number of “likes.” The preprocessing stage involved several NLP techniques. Data representation was generated using the Term Frequency–Inverse Document Frequency (TF-IDF) method, while the issue of class imbalance was addressed using the Synthetic Minority Over-sampling Technique (SMOTE). Based on the test results, the SVM model achieved an accuracy of 79.48% on the test data using a linear kernel, showing the best performance in classifying positive sentiments. However, the classification of neutral and negative sentiments still requires improvement. This study demonstrates that the combination of the TF-IDF method, additional numerical features, and data balancing techniques can produce an efficient sentiment analysis model within the e-commerce domain.

https://doi.org/10.51903/elkom.v18i2.3253

Analisis sentimen ulasan tamu terhadap layanan hotel menggunakan pendekatan machine learning

Gunawan, Ricardho; Hendry, Hendry

IT-Explore: Jurnal Penerapan Teknologi Informasi dan Komunikasi • 2025 • Fakultas Teknologi Informasi, Universitas Kristen Satya Wacana

Sentiment analysis of guest reviews is a crucial aspect in improving the quality of hotel services. This study aims to analyze the sentiment of guest reviews regarding the services of Grand Diamond Hotel Yogyakarta using a machine learning approach with the Support Vector Machine (SVM) algorithm. SVM was chosen because it can handle high-dimensional data such as text and is capable of forming an optimal separating hyperplane between sentiment classes. The research data was obtained through web scraping from Traveloka, yielding 1,119 reviews, which were processed through preprocessing, translation, and sentiment labeling using the TextBlob library. After TF-IDF weighting, the data was divided into 80% for training and 20% for testing. The linear kernel SVM model achieved 80% accuracy in classifying the reviews into positive, negative, and neutral categories. The results of this study were implemented in a web-based application equipped with data visualization and model evaluation features, allowing hotel management to efficiently monitor and analyze guest sentiment and support data-driven service quality improvement.

https://doi.org/10.24246/itexplore.v4i3.2025.pp295-306

Analisis Sentimen Pengguna Aplikasi Gopay di X Menggunakan Algoritma Naïve Bayes Classifier

Devi Daniyanti; Belsana Butar Butar

Jurnal Sistem Informasi dan Ilmu Komputer • 2025 • International Forum of Researchers and Lecturers

This research aims to analyze GoPay user sentiment on the X social media platform (formerly known as Twitter) using the Naive Bayes Classifier algorithm. Sentiment analysis was conducted to understand user perceptions of and satisfaction with GoPay digital payment services based on shared comments and reviews. Data were collected by crawling tweets containing the keyword "GoPay" within a specific period. The research stages included data preprocessing (case folding, tokenizing, filtering, and stemming), sentiment labeling (positive, negative), word weighting using TF-IDF, and classification using the Naive Bayes algorithm. The results showed that, of a total of 1,431 analyzed tweets, 797 contained positive sentiment and 643 contained negative sentiment, with a classification accuracy of 82.94%. The most frequently praised factors were ease of use and promotional offers, while the main complaints related to technical issues and customer service. This research provides insights for GoPay developers to improve services according to user feedback.

https://doi.org/10.59581/jusiik-widyakarya.v3i1.4739
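
Illustrative note: a Naive Bayes text classifier like the one named in the abstract above can be sketched in a few lines. This is a minimal multinomial sketch on raw term counts with Laplace smoothing, written under stated assumptions; the study itself weights words with TF-IDF, and its exact model variant is not given. The function names `train_nb` and `predict_nb`, the `alpha` parameter, and the sample tokens are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit log-priors and Laplace-smoothed log-likelihoods per class."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(tokens)
    vocab = {t for counts in term_counts.values() for t in counts}
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        model[label] = (
            math.log(n_docs / len(labels)),
            {t: math.log((term_counts[label][t] + alpha) / total) for t in vocab},
        )
    return model


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior; unseen terms are skipped."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(log_likelihood[t] for t in tokens if t in log_likelihood)
    return max(model, key=score)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied.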

Penerapan Algoritma Multinomial Naïve Bayes dengan Penyeimbangan Data SMOTE pada Klasifikasi Sentimen Pengguna Shopee terhadap Produk Facial Wash Kahf

Farendika Rezzi

Uranus: Jurnal Ilmiah Teknik Elektro, Sains dan Informatika • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The rapid growth of e-commerce platforms has significantly transformed the way consumers share and access product feedback. One of the widely used platforms in Indonesia is Shopee, where customers actively provide reviews of various products, including local skincare brands such as Kahf facial wash. Customer reviews on e-commerce platforms contain valuable information that can be analyzed to understand consumer opinions and preferences. Sentiment analysis, as a branch of natural language processing, enables the classification of textual data into categories such as positive, negative, or neutral. This study aims to classify Shopee user sentiments regarding Kahf facial wash products by implementing the Multinomial Naïve Bayes algorithm, a well-known probabilistic classifier suitable for text categorization. The research methodology consisted of several preprocessing stages, including data cleansing, case folding, tokenizing, stopword removal, and stemming, to prepare raw review texts for further analysis. For feature representation, the Term Frequency–Inverse Document Frequency (TF-IDF) method was applied to capture the importance of words across documents. To evaluate the classification performance, K-Fold cross-validation was employed with K values of 4, 5, 6, and 10 to ensure model reliability and robustness. Considering the issue of imbalanced datasets in user-generated reviews, the Synthetic Minority Over-sampling Technique (SMOTE) was utilized to balance the distribution of sentiment classes. Based on the confusion matrix, the Multinomial Naïve Bayes algorithm demonstrated effective performance in classifying sentiments, achieving satisfactory levels of accuracy, precision, and recall across different folds. These results indicate that the algorithm is capable of handling sentiment analysis tasks for local product reviews effectively. The findings of this study are expected to provide meaningful insights for businesses in understanding consumer perceptions, thereby supporting decision-making processes in product development, marketing strategies, and customer engagement for local brands.

https://doi.org/10.61132/uranus.v3i3.1022
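
Illustrative note: the K-Fold cross-validation described in the abstract above splits the data into K parts and tests on each part once. The sketch below generates the index splits only. It does not shuffle or stratify, which is an assumption made for brevity (in practice the data would usually be shuffled first), and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size
```

Calling it with K values of 4, 5, 6, and 10, as in the abstract, gives test folds that together cover every sample exactly once.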

Analisis Sentimen Ulasan pada Google Review di Sebuah Penginapan Menggunakan Algoritma Naïve Bayes: Studi Kasus: Grand Jatra Hotel Pekanbaru

Muhammad Azlan; Elvi Rahmi

Neptunus: Jurnal Ilmu Komputer Dan Teknologi Informasi • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to analyze the sentiment of customer reviews of the Grand Jatra Hotel Pekanbaru on the Google Review platform using the Naïve Bayes algorithm. Social media and online review platforms are increasingly becoming the primary source of information for potential customers in making purchasing decisions, particularly in the hospitality sector. Therefore, sentiment analysis of customer reviews is crucial for understanding consumer perceptions and providing strategic input for hotels in improving service quality. The research data was collected using web scraping techniques to obtain publicly available customer reviews. The obtained data was then processed through text preprocessing stages including case folding, tokenizing, normalization, stopword removal, and stemming. The Term Frequency-Inverse Document Frequency (TF-IDF) method was then used to weight each word, so that more relevant words have a greater influence in the classification process. The sentiment classification process was carried out into two main categories, namely positive and negative. The Naïve Bayes model was trained using training data and then tested with test data to measure the algorithm's performance in classifying sentiment. The evaluation results show that the model built is able to achieve an accuracy level of 98%, with a precision value of 97% and a recall of 100% in the positive class, and 92% in the negative class. These findings confirm that the Naïve Bayes algorithm can be effectively used in analyzing customer sentiment towards hotel services and facilities. Practically, the results of this study are expected to provide insight for the management of Grand Jatra Hotel Pekanbaru in understanding customer perceptions, identifying service strengths and weaknesses, and formulating more targeted marketing strategies. In addition, this study can also be a reference for the development of similar studies in the hotel industry and other service sectors.

https://doi.org/10.61132/neptunus.v3i3.1003
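
Illustrative note: accuracy, precision, and recall figures such as those reported in the abstract above are derived from confusion-matrix counts. The sketch below shows the standard formulas for one class treated as "positive". The function name `binary_metrics` is hypothetical, and the counts in the usage line are made up for illustration; they are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from true/false positive and negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=90, fp=10, fn=0, tn=20)
```

A recall of 100% for a class means no false negatives for that class, which is why precision needs to be read alongside it.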

Analisis Sentimen Aplikasi Liputan6.Com pada Ulasan Pengguna di Google Playstore dengan Menggunakan Algoritma Support Vector Machine (SVM) dan Naïve Bayes

Yayang Tika Robiatush Sholiha; Lubna Asjad Muhda Nabilah; Imron Imron

Saturnus: Jurnal Teknologi dan Sistem Informasi • 2025 • Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to evaluate user sentiment toward the Liputan6.com application available on the Google Play Store. In the digital era, user reviews serve as a significant indicator in assessing the quality of an application. However, the inconsistency between rating scores and review content renders manual analysis less objective. To address this issue, a machine learning approach was adopted by comparing two algorithms, namely Support Vector Machine (SVM) and Naïve Bayes (NB). A total of 2,500 reviews were collected through a web scraping process and automatically labeled based on the rating (positive if ≥ 3, negative if < 3). The data preprocessing stages included cleaning, case folding, tokenizing, stopword removal, and token filtering. Subsequently, word weighting was carried out using the TF-IDF method, followed by classification using 10-Fold Cross Validation in RapidMiner. The evaluation results indicate that, in the positive class, NB demonstrated superior precision (89.47%), whereas SVM achieved higher recall (98.94%) and F1-score (90.96%). In the negative class, SVM performed better in terms of precision (66.15%), while NB attained higher recall (65.65%) and F1-score (36.34%). Further evaluation based on AUC and accuracy positioned SVM in the good category (AUC 0.842; accuracy 83.82%), while NB was categorized as fail (AUC 0.505; accuracy 60.87%). Overall, SVM is considered to be more effective than NB.

https://doi.org/10.61132/saturnus.v3i3.867
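
Illustrative note: the rating-based labeling rule stated in the abstract above (positive if the rating is 3 or higher, negative otherwise) is simple enough to write directly. The function name `label_from_rating` is hypothetical; the threshold comes from the abstract. The study itself used RapidMiner, so this is only an equivalent sketch.

```python
def label_from_rating(rating, threshold=3):
    """Label a review from its star rating: positive if rating >= threshold."""
    return "positive" if rating >= threshold else "negative"
```

Note that this rule assigns 3-star reviews to the positive class, which may contribute to the mismatch between ratings and review text that the abstract mentions.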

Analisis Sentimen Ulasan Aplikasi HeyJapan di Google Play Store Menggunakan Algoritma NLP

Jasmine Aulia Mumtaz; Kinaya Khairunnisa Komariansyah; Wildan Holik; Muhammad Galuh Gumelar; Reza Pratama +1 more

Jurnal Rumpun Ilmu Bahasa dan Pendidikan • 2025 • Asosiasi Periset Bahasa Sastra Indonesia

Digital learning applications like HeyJapan are increasingly popular. User reviews on platforms such as the Google Play Store contain valuable information on user perceptions and experiences. To process this information systematically, this study employs a Natural Language Processing (NLP) approach to analyze sentiment toward the HeyJapan application. Data were collected using web scraping with Python and the google-play-scraper library, yielding the 1,000 most recent user reviews. The analysis included data collection, preprocessing, sentiment labeling using TextBlob, visualization, modeling with Logistic Regression, and evaluation. After preprocessing, 923 valid reviews were classified by polarity into three sentiment categories: positive, neutral, and negative. The results showed that 71.4% of reviews were positive, 26.1% neutral, and 2.5% negative. Visualizations in pie charts and word clouds provided an overview of user perceptions. Modeling with TF-IDF and Logistic Regression achieved 88% accuracy, with the highest F1-score in the positive sentiment category. The evaluation indicates that the model is fairly reliable in classifying sentiment, especially for the positive and neutral categories, though negative sentiment classification needs improvement. This study shows that the NLP approach can evaluate user perceptions of educational applications based on reviews and serve as a basis for improving the quality of foreign language learning apps.

https://doi.org/10.61132/pragmatik.v3i3.1801

Analisis Sentimen Pengguna Aplikasi OVO pada Media Sosial X menggunakan Metode Naive Bayes dan Support Vector Machine (SVM)

Annisa Qomariah; Rizaldy Khair

Jurnal Sistem Informasi dan Ilmu Komputer • 2025 • International Forum of Researchers and Lecturers

The rapid development of financial technology (fintech), particularly digital wallet applications like OVO, has significantly transformed transaction patterns in society. However, issues such as server instability and unsatisfactory user experiences frequently emerge on social media platforms. This study aims to analyze user sentiment toward OVO on platform X (formerly Twitter) by comparing the performance of two machine learning algorithms: Naïve Bayes and Support Vector Machine (SVM). Data were collected through web scraping of 1,000 Indonesian-language tweets containing the keyword "OVO." The research methodology included text preprocessing (data cleaning, tokenization, stopword removal), feature extraction using TF-IDF, and sentiment classification (positive, negative, neutral). Evaluation results showed that SVM achieved the highest accuracy, 85.2%, while Naïve Bayes reached 78.5%. SVM also outperformed Naïve Bayes in precision (87%) and recall (83%), which the authors attribute to its ability to handle non-linear data. These findings provide actionable recommendations for OVO developers to enhance server stability and features based on user feedback. Additionally, this study serves as a reference for future sentiment analysis research employing algorithmic comparisons.

https://doi.org/10.59581/jusiik-widyakarya.v3i2.5148
