SciRepID - Scientific Publication Search

Publication Search

22,072 articles from 385 journals · 1,447 citations tracked


Rasiban Rasiban; Dadang Iskandar Mulyana; Muhammad Joko Umbaran Kharis Bahrudin; Nicola Marthy

International Journal of Information Engineering and Science 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Social media, especially Twitter, has become one of the main means for people to express opinions and criticism on various issues, including the performance of law in Indonesia. This study aims to analyze public sentiment towards the performance of law based on Twitter user comments using the Naïve Bayes algorithm. The research data consists of 1004 comments collected from several videos related to legal topics. The analysis process includes the stages of data crawling, pre-processing (text cleaning, normalization, and tokenization), labeling sentiment into positive, negative, and neutral, and testing the Naïve Bayes model. The results show that the Naïve Bayes algorithm is able to classify sentiment with an accuracy level of 93.73%. The distribution of sentiment from 1004 comments shows that the majority of public opinion is (negative/positive/neutral), which indicates that public perception of the performance of law is still (critical/positive). These findings are expected to be input for related parties to understand public opinion and improve the quality of legal performance in

Sutisna Sutisna; Tri Wahyudi; Dwi Swasono Rachmad; Fachrur Rozi

International Journal of Information Engineering and Science 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Social media X (Twitter) has become the main platform for the Indonesian public to express opinions, including on the trend of 'kabur aja dulu' (let's just run away for a bit). This research aims to classify the sentiments of the public using the Naïve Bayes and Support Vector Machine (SVM) methods, and to compare the accuracy of both in sentiment analysis. Data was collected via the Twitter API with the hashtag #kaburajadulu, resulting in 2,067 tweets, which, after the cleansing process and manual labeling, left 385 data points. The analysis process followed the CRISP-DM stages, which include business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Model evaluation was conducted using a confusion matrix with accuracy, precision, and recall metrics. The classification results show that 82% of tweets have a positive sentiment and 18% negative. The Naïve Bayes algorithm achieved an accuracy of 86.49%, slightly lower than SVM, which reached 88.05%. In conclusion, Support Vector Machine is more effective in sentiment classification on public opinion data. This research contributes to the digital mapping of public opinion and recommends the development of automatic labeling methods as well as the exploration of advanced algorithms in the future.
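The confusion-matrix evaluation mentioned in this abstract can be sketched as follows. This is a minimal illustration for a binary (positive/negative) case; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model actually found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for illustration only; not the study's results.
acc, prec, rec = binary_metrics(tp=60, fp=5, fn=4, tn=8)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

With a skewed class split such as the 82% / 18% distribution reported here, precision and recall on the smaller class are usually more informative than accuracy alone.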

Veri Arinal; Satria Wira Yudha; Muhammad Joko Umbaran Kharis Bahrudin; Dessyanti Ryantina

International Journal of Information Engineering and Science 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

QRIS (Quick Response Code Indonesian Standard) has become a widely used national digital payment standard. User satisfaction with this service needs to be monitored continuously to ensure its sustainability. This study aims to predict the level of QRIS user satisfaction based on their experiences and perceptions expressed organically on the Twitter social media platform. The method used is sentiment analysis with the Naive Bayes classification algorithm implemented using RapidMiner software. The research data was obtained from Twitter user comments collected through web scraping techniques. The text data then went through a preprocessing stage that included cleansing, stopword filtering, stemming, and tokenizing to be prepared as features ready to be processed by the model. The data was divided into training (80%) and testing (20%) subsets for model training and validation. The results showed that the Naive Bayes model was able to predict user satisfaction sentiment with an accuracy of 80.99%. These findings indicate that the model is highly accurate in identifying satisfied comments and sufficiently sensitive in detecting dissatisfaction. This study concludes that sentiment analysis of Twitter UGC data using Naive Bayes is an effective and efficient approach for predicting QRIS user satisfaction in real time. The practical implication of this study is to provide an automatic feedback system for service providers to monitor public sentiment and take targeted corrective actions.

Mesra Betty Yel; Sopan Adrianto; Rasiban Rasiban; Eva Widiyanti

International Journal of Information Engineering and Science 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The growth of information technology has driven changes in consumer behavior, one of which is through e-commerce platforms such as Shopee. This phenomenon has generated a large number of customer reviews, including those for local cosmetic products such as Wardah. These reviews serve as an important source of information for understanding customer perceptions and satisfaction levels. However, manual analysis of large and linguistically diverse datasets is inefficient and potentially subjective. This study aims to implement the multi-category Naive Bayes algorithm to classify the sentiment of Wardah product reviews on Shopee into three categories: positive, negative, and neutral. The data were collected using a web scraping technique and processed through a series of preprocessing stages including case folding, tokenization, stopword removal, stemming, and text cleaning. Subsequently, term weighting was performed using the TF-IDF method prior to classification. Model performance was evaluated using a confusion matrix as well as accuracy, precision, and recall metrics. The results indicate that the multi-category Naive Bayes algorithm achieved an accuracy of 86.00%, a precision of 86.63%, and a recall of 98.24%. This approach can assist business practitioners in objectively understanding customer opinions and support decision-making in business strategy and product development.
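A rough sketch of the TF-IDF weighting followed by multi-class Naive Bayes classification described above, written in plain Python. The smoothed IDF formula, the smoothing constant `alpha`, the function names, and the toy English reviews are all assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def tfidf(docs):
    """Turn tokenized documents into TF-IDF weight dicts (smoothed IDF, an assumption)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def train_nb(vectors, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on weighted term vectors with additive smoothing."""
    vocab = {t for v in vectors for t in v}
    weight = defaultdict(Counter)
    for v, y in zip(vectors, labels):
        weight[y].update(v)  # accumulate TF-IDF mass per class
    prior = {y: math.log(c / len(labels)) for y, c in Counter(labels).items()}
    loglik = {}
    for y in prior:
        total = sum(weight[y].values()) + alpha * len(vocab)
        loglik[y] = {t: math.log((weight[y][t] + alpha) / total) for t in vocab}
    return prior, loglik


def predict(doc, idf, prior, loglik):
    """Return the class with the highest log-posterior; unseen terms are ignored."""
    vec = {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}
    scores = {y: prior[y] + sum(w * loglik[y][t] for t, w in vec.items()) for y in prior}
    return max(scores, key=scores.get)


# Toy, hypothetical reviews (already tokenized); not data from the study.
docs = [["good", "quality", "love"], ["bad", "broken", "refund"], ["okay", "average"],
        ["love", "good", "fast"], ["bad", "slow", "late"], ["average", "standard"]]
labels = ["positive", "negative", "neutral", "positive", "negative", "neutral"]

vectors, idf = tfidf(docs)
prior, loglik = train_nb(vectors, labels)
print(predict(["good", "love"], idf, prior, loglik))
```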

Aura Rahayu Aksa Radiana; Fathoni Mahardika; Dani Indra Junaedi

Merkurius : Jurnal Riset Sistem Informasi dan Teknik Informatika 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study aims to develop a sentiment classification method for YouTube user comments related to the game Love and Deepspace using the Naïve Bayes algorithm, focusing on improving the text data processing and understanding user perceptions. Comment data were collected through scraping from YouTube videos, followed by preprocessing including text cleaning, normalization, stopword removal, stemming, and translation into English. Initial labeling was conducted using TextBlob, then the data were randomly sampled for training the Naïve Bayes model. Evaluation involved comparing sentiment distributions and visualization using Word Cloud and bar charts. The Naïve Bayes model achieved an accuracy of 77.36% in sentiment classification. The sentiment distribution shows differences between TextBlob (positive: 1,011, neutral: 1,312, negative: 575) and Naïve Bayes (positive: 901, neutral: 1,627, negative: 370), with Naïve Bayes being more conservative. The Word Cloud visualization identifies dominant words such as "bang," "game," and "main," while the bar chart shows the largest proportion of neutral sentiment. Naïve Bayes is effective for sentiment classification on informal comment data, with significant differences from rule-based methods like TextBlob. This research contributes to the development of text data processing techniques and user perception analysis, as well as opening up optimization opportunities with other algorithms like SVM for better accuracy.

Diajeng Febriana; Suci Suci; Darmawati Darmawati

Jurnal Penelitian Komunikasi dan Sosialisasi 2026 Asosiasi Peneliti dan Pengajar Ilmu Sosial Indonesia

This research critically investigates the circulation of disinformation concerning the instability of fuel prices on the digital platform X and its subsequent implications for the polarization of modern society. In an era where unverified economic news frequently dictates public reaction, fake news often acts as a potent catalyst for mass anxiety. By implementing a quantitative framework driven by lexicon-based computational sentiment analysis, this study effectively processed a dataset of 500 public opinion samples extracted via Google Colab spanning from April 2024 to April 2026. To ensure computational accuracy and eliminate textual noise, the data underwent a rigorous preprocessing phase encompassing case folding, alongside the systematic removal of URLs, account mentions, numbers, hashtags, and punctuation marks. The statistical outcomes revealed a highly disproportionate emotional landscape, overwhelmingly dominated by 451 negative reviews. In stark contrast, neutral observations and positive affirmations were nearly absent, recording only 40 and 9 instances, respectively. The data compellingly illustrates that the relentless influx of pessimistic narratives regarding economic instability directly induces financial panic, undermines rational discourse, and severely fragments cyberspace into deeply polarized factions.
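The preprocessing steps listed in this abstract (case folding and removal of URLs, account mentions, numbers, hashtags, and punctuation) followed by lexicon-based labeling might look roughly like the sketch below. The regular expressions, the decision to drop whole hashtags rather than only the `#` symbol, and the tiny English `LEXICON` are assumptions for illustration; the study's actual lexicon and rules are not given.

```python
import re
import string


def preprocess(text):
    """Case-fold, then strip URLs, mentions, hashtags, numbers, and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)                # mentions and hashtags
    text = re.sub(r"\d+", " ", text)                    # numbers
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


# Hypothetical mini-lexicon: word -> polarity score.
LEXICON = {"expensive": -1, "panic": -1, "crisis": -1, "stable": 1, "cheap": 1}


def lexicon_label(tokens, lexicon=LEXICON):
    """Sum the polarity of known words and map the total to a sentiment label."""
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = preprocess("Fuel prices UP 20%!! @user #BBM https://t.co/abc panic")
print(tokens, lexicon_label(tokens))
```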

Ayu Astuti Siregar; Al-Khowarizmi

Merkurius : Jurnal Riset Sistem Informasi dan Teknik Informatika 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

Social media has evolved into a significant platform where consumers freely express their opinions, experiences, and levels of satisfaction regarding various products, including those offered by Micro, Small, and Medium Enterprises (MSMEs). The comments and reviews shared by customers on these platforms contain diverse sentiments that can serve as valuable indicators of how consumers perceive product quality. Understanding these sentiments is crucial for MSME owners, as it allows them to evaluate their products and adapt to market expectations more effectively. This study aims to analyze customer sentiment toward MSME products on social media by utilizing the Naïve Bayes algorithm, a widely used classification method in text mining. The data used in this research consist of customer comments collected from various social media platforms. The research process involves several stages, including data collection, manual labeling of sentiments, text preprocessing (such as tokenization, case folding, and stopword removal), and splitting the dataset into training and testing subsets. Subsequently, the classification process is carried out using the Naïve Bayes algorithm to categorize sentiments into positive, negative, and neutral classes. The results of this study demonstrate that the Naïve Bayes method is effective in classifying customer sentiments with a satisfactory level of accuracy. These findings provide a comprehensive overview of consumer perceptions regarding the quality of MSME products. Furthermore, this research is expected to assist MSME business owners in understanding customer feedback more systematically and using it as a basis for improving product quality and enhancing customer satisfaction in a competitive digital marketplace.

Susanto, Eko; Sharipuddin; Purnama, Benni

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

The rapid growth of e-commerce in Indonesia, particularly the Shopee platform, has generated a large volume of user reviews on the Google Play Store, which can be analyzed to understand consumer sentiment. This study aims to compare the performance of the Support Vector Machine (SVM) and Random Forest (RF) algorithms in binary sentiment classification (positive and negative) on Shopee reviews, as well as to statistically test the significance of their differences using One-Way ANOVA. A total of 400,498 reviews were collected via web scraping, preprocessed through text normalization, tokenization, and Indonesian language stemming, and then feature-extracted using TF-IDF and Count Vectorizer. Evaluation results show that SVM achieved an accuracy of 91.77%, precision of 91.49%, recall of 91.77%, and F1-Score of 91.56%, while RF achieved an accuracy of 90.07%, precision of 91.68%, recall of 90.07%, and F1-Score of 90.55%. ANOVA confirmed that the performance difference between the two algorithms is statistically significant (p-value = 0.0007) with a large effect size (η² = 0.1815). Therefore, SVM is recommended as a more optimal and consistent algorithm for automated sentiment analysis of Indonesian e-commerce reviews, while also providing a replicable methodological framework for similar future research.
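The significance test mentioned above can be illustrated with a plain-Python sketch of the one-way ANOVA F statistic and the η² effect size (between-group sum of squares divided by total sum of squares). The per-fold accuracy lists are hypothetical, and the p-value is left out because it requires the F distribution, which the standard library does not provide.

```python
def one_way_anova(*groups):
    """Return (F statistic, eta squared) for two or more groups of scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group (must be non-zero for F to be defined).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Hypothetical per-fold accuracies for two classifiers; not the study's data.
svm_scores = [0.92, 0.91, 0.93]
rf_scores = [0.90, 0.89, 0.91]
f_stat, eta_squared = one_way_anova(svm_scores, rf_scores)
print(f"F={f_stat:.3f} eta^2={eta_squared:.3f}")
```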

Afif Lustyo Muji; Aziz Musthofa; Dihin Muriyatmoko

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

Since the announcement of the policy plan for a name transfer system in the sale of used mobile phones, the issue has attracted widespread public attention and discussion. People have expressed their opinions on social media platforms, particularly TikTok. This study aims to classify the sentiment of TikTok users using Naive Bayes and Support Vector Machine (SVM) algorithms. The data were collected through a comment scraping technique on related content. The research stages include text preprocessing, sentiment labeling into positive, negative, and neutral categories, and feature extraction using TF-IDF. The classification process employs Naive Bayes and Support Vector Machine algorithms, which are then evaluated based on accuracy, precision, recall, and F1-score. The results of this study indicate that both methods are capable of classifying sentiment effectively. However, the Support Vector Machine method is superior to the Naive Bayes method with an accuracy rate of 99.57% compared to 94.30%. This study is expected to help the government understand public responses to the planned policy of the used mobile phone name transfer system.

Dihin Muriyatmoko; Aziz Musthafa; Yusuf Al Banna

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

Sentiment analysis on social media is widely used to represent public perceptions of sports performance, particularly in international competitions. This study aims to analyze the sentiment of YouTube user comments regarding the performance of the Indonesian National Football Team during the FIFA World Cup 2026 Asian Qualifiers. The data were collected from user comments on videos related to the matches and analyzed using a machine learning–based sentiment analysis approach. Sentiment classification was performed using the Naive Bayes algorithm. The results indicate that the proposed approach is able to effectively identify public sentiment toward the national team’s performance during the qualification matches. The findings of this study are expected to provide insights into public perceptions and contribute to sentiment analysis research in the field of sports.

Marjelin Putri Ndaparoka; Stefanus D.I. Mau; Sihang Gregorius Bali Mema

Modem : Jurnal Informatika dan Sains Teknologi 2026 Asosiasi Profesi Telekomunikasi Dan Informatika Indonesia

Savings and Loan Cooperatives (KSP) play a vital role in expanding community access to capital, especially within the informal sector. Nevertheless, non-performing loans remain a persistent challenge that can threaten liquidity and long-term institutional sustainability. KSP CU Mera Ndi Ate faces similar issues, which are assumed to stem not only from administrative weaknesses but also from members’ perceptions and behavioral factors. This research aims to examine the potential causes of non-performing loans through text-based sentiment analysis using an unsupervised learning approach. A quantitative method with a data mining framework was applied. Data were gathered through interviews, observations, documentation, and 200 customer opinion texts processed using the Orange Data Mining application. The analytical stages included preprocessing, corpus development, feature extraction, sentiment clustering, and visualization. Because the dataset lacked predefined labels, unsupervised learning was used to identify naturally emerging sentiment patterns. Findings reveal a predominance of critical sentiments related to credit assessment procedures and service quality. The highest sentiment score (75) concerned insufficient creditworthiness evaluation, followed by concerns about service efficiency (66.6667). These insights suggest that improving assessment accuracy and service quality may help reduce non-performing loans.

Duvalio Adnan Zordi; Mohammad Syahrul Ihsan; Muhamad Aprian Nazarudin; Tria Patrianti

Jurnal Ilmu Komunikasi, Administrasi Publik dan Kebijakan Negara 2026 Asosiasi Peneliti Dan Pengajar Ilmu Sosial Indonesia

The 21st century is marked by a profound transformation in digital communication. Social media has become a new public space, enabling people to interact, disseminate information, and shape public opinion rapidly and massively. This article analyzes the role of social media in shaping public opinion and its influence on the dynamics of government policy in Indonesia. Through a literature review and case analysis of policies influenced by viral issues on social media, this study finds that social media increases citizen participation and accelerates government responses to public issues. However, the pattern of 'viral-based policy' also carries risks, such as reactive policies, a lack of evidence-based policies, and inequality in representation. To manage this phenomenon, the government needs to develop an inclusive digital communication strategy, establish an early detection system for public sentiment, and uphold the principles of good governance and evidence-based policy. These findings are relevant for academics and policymakers seeking to understand the interaction between social media, public opinion, and government policy in the digital era.

Arrasyifah Leby; Saeful Mujab; Abellia Nathany; Syafaat Ariski

Jurnal Ilmu Komunikasi, Administrasi Publik dan Kebijakan Negara 2026 Asosiasi Peneliti Dan Pengajar Ilmu Sosial Indonesia

This study examines Terra Drone Indonesia's implementation of management dialogue in addressing crisis communication following a fire at the company's office building. The incident sparked a wave of negative sentiment on Instagram, marked by increased public comments assessing occupational safety, data security, and the company's transparency in conveying information related to the legal process. The study used a qualitative approach with a case study method to understand how the company developed a crisis communication strategy through official statements published on social media. Data were analyzed based on dialogic elements of communication, particularly empathy for victims, humanitarian commitment, and the company's position and normative and defensive stance in affirming legal handling and compliance measures. The results show that the company attempted to balance an empathetic narrative to mitigate public pressure with a defensive strategy to maintain institutional legitimacy. However, the dynamics of public opinion on Instagram indicate that the company's response has not fully met the expectations of two-way communication. This is evident in the dominance of one-way communication patterns and the lack of technical clarifications needed by the public, thus creating a productive economic outlook. Overall, dialogic management has been implemented responsively, but has not been optimal in building a space for dialogue and public trust as a whole.

Muhimatul Ifadah; Bambang Irawan

Jurnal Elektronika dan Komputer 2026 STEKOM PRESS

User reviews on the Shopee e-commerce platform represent an important source of information for understanding consumer perceptions of products and services. Sentiment analysis is commonly applied to classify user opinions into positive, neutral, and negative sentiment categories based on textual data. This study aims to analyze the performance of the Long Short-Term Memory (LSTM) method in sentiment classification of Shopee user reviews. The dataset used in this study consists of Indonesian-language user reviews that have undergone preprocessing stages, including case folding, text cleaning, tokenization, and stopword removal. The LSTM model was trained using preprocessed text represented as word sequences. Model performance was evaluated using overall accuracy and class-wise classification results. The experimental results indicate that the LSTM method achieved an overall accuracy of 87.62%. In addition, the classification performance for the positive sentiment class reached 95.27%, the neutral class achieved 4.96%, and the negative class reached 74.26%. These results demonstrate that the LSTM method performs well in classifying sentiment in Shopee user reviews, particularly for positive sentiment. This study is expected to provide insights and references for the application of deep learning methods in sentiment analysis of Indonesian e-commerce review data.
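As a small worked check on the figures above, the unweighted (macro) mean of the three class-wise scores can be compared with the reported overall accuracy. This assumes the class-wise percentages are per-class accuracy or recall, which the abstract does not state explicitly.

```python
# Class-wise figures as reported in the abstract (percent).
class_scores = {"positive": 95.27, "neutral": 4.96, "negative": 74.26}
overall_accuracy = 87.62

# Macro average: every class counts equally, regardless of how many reviews it has.
macro_average = sum(class_scores.values()) / len(class_scores)
print(f"macro average = {macro_average:.2f}%, overall accuracy = {overall_accuracy:.2f}%")
```

Under that assumption the macro average is about 58.16%, well below the 87.62% overall accuracy. A gap of this size is typical of an imbalanced dataset in which the positive class dominates and the neutral class is rarely recognized.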

Feli Samudra; Muhamad Sopyan

Jurnal Penelitian Komunikasi dan Sosialisasi 2026 Asosiasi Peneliti dan Pengajar Ilmu Sosial Indonesia

This study aims to analyze the influence of the @starbucksindonesia account's use of Instagram as a communication medium on brand image following the pro-Israel boycott. The boycott arose from Starbucks' alleged involvement in the Israeli-Palestinian conflict, which triggered a decline in its reputation and negative sentiment among Indonesians. In this situation, Instagram, as a visual-based social media platform, was utilized as a primary means of shaping public opinion and responding to the crisis. The study employed a quantitative approach using an online questionnaire survey. Respondents were 100 followers of the @starbucksindonesia Instagram account, aged 18–35, and former Starbucks customers. Data analysis was conducted using validity and reliability tests, simple linear regression, t-tests, and coefficients of determination using SPSS version 27. The results showed that all research instruments were valid and reliable. The main finding demonstrated that Instagram use had a statistically significant effect on Starbucks' brand image. The coefficient of determination value indicated a strong relationship, indicating that the majority of changes in respondents' perceptions were influenced by communication via Instagram. This research supports the Uses and Effects theory, which states that social media not only serves as an information provider but also has the ability to shape consumer perceptions and attitudes. Therefore, Instagram plays a strategic role in digital communications for crisis management and brand image restoration.

Aditya Abdulloh Masykur; Rino Raihan Gumilang; Harun Al Rosyid

Jurnal Elektronika dan Komputer 2026 STEKOM PRESS

The performance of the Indonesian National Team (Timnas) in the 2026 World Cup qualifications has triggered massive and diverse responses on social media, particularly on platform X. This study aims to identify and classify public sentiment regarding Timnas Indonesia's performance into positive, negative, and neutral categories using a data mining approach. Text data was processed through pre-processing stages, term weighting using TF-IDF, and the application of the Synthetic Minority Over-sampling Technique (SMOTE) to address significant class distribution imbalance. The classification algorithm employed was Multinomial Naïve Bayes. Model performance evaluation was conducted by comparing two training-testing data split scenarios: 90:10 and 80:20 ratios. The results indicate that public opinion is dominated by negative sentiment at 73.2%, reflecting public disappointment. In terms of model performance, the 90:10 ratio scenario yielded the best accuracy of 80%, outperforming the 80:20 ratio which recorded an accuracy of 75%. These findings demonstrate that combining Multinomial Naïve Bayes with the SMOTE technique is effective in handling imbalanced text data and is capable of accurately mapping public perception.
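The core idea of SMOTE referenced above, creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors, can be sketched in plain Python. The function name, the choice of `k`, the fixed seed, and the 2-D toy points are assumptions for illustration; in a setting like the one described, the inputs would be TF-IDF vectors, and oversampling is normally applied to the training split only.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (squared Euclidean distance).
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Toy 2-D minority-class points; hypothetical, for illustration only.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority_points, n_new=5)
print(new_points)
```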

Noronha, Marcelino Caetano; Dwiasnati, Saruni; Helena P Panjaitan, Cherlina

Journal of Information Technology and Computer Science 2025 International Forum of Researchers and Lecturers

The rapid diffusion of Generative Artificial Intelligence (AI) has intensified public debate regarding its benefits, risks, and societal implications. This study investigates public sentiment and thematic structures surrounding Generative AI by analyzing Twitter discourse as a representation of large-scale, real-time public perception. The research addresses two main problems: how public sentiment toward Generative AI is distributed and what dominant themes shape this perception. Accordingly, the objective is to map both emotional polarity and thematic narratives embedded in social media conversations. A computational mixed-methods approach was employed using a dataset of 12,470 tweets collected on 17 December 2024. Sentiment classification was conducted using a transformer-based DistilBERT model, while semantic representations were generated with Sentence-BERT. Topic modeling was performed using BERTopic, integrating HDBSCAN clustering and class-based TF-IDF to extract coherent and interpretable topics. Human-in-the-loop validation supported the interpretive robustness of topic labeling. The findings reveal that public sentiment toward Generative AI is predominantly positive (41.8%), particularly in relation to productivity enhancement, education, and creative applications. Neutral sentiment (31.4%) reflects informational discourse, while negative sentiment (26.8%) centers on ethical concerns, privacy risks, misinformation, and AI hallucinations. Seven dominant topics were identified, with clear topic–sentiment alignment showing optimism in utility-driven themes and skepticism in ethics- and risk-related discussions. In conclusion, public perception of Generative AI is dualistic, characterized by strong enthusiasm alongside persistent caution. These results provide empirical insights for AI governance, responsible innovation, and future research on socio-technical impacts of Generative AI.

Windi Astuti; Bambang Irawan; Nur Ariesanto Ramdhan

Jurnal Elektronika dan Komputer 2025 STEKOM PRESS

The development of social media platforms like TikTok has created new spaces for digital economic activities, including the practice of thrifting, which has now become a trend among the public. However, government policies that block these activities have sparked various public reactions. This study aims to analyze public sentiment regarding the issue of thrifting bans on the TikTok platform using the Bidirectional Long Short-Term Memory (Bi-LSTM) method. This method was chosen because it can understand text context from both directions, allowing it to capture deeper semantic meaning. The dataset consists of 4,000 TikTok user comments collected through a crawling process. The research stages include data preprocessing, sentiment labeling, splitting training and test data, training the Bi-LSTM model, and evaluating performance using accuracy, precision, recall, and F1-score metrics. The research results show that the Bi-LSTM model achieved an accuracy of 86.15%, with stable classification performance and minimal error rate. These findings indicate that Bi-LSTM is effective for sentiment analysis of public opinions on Indonesian-language social media, particularly on context-specific policy issues. Further development can be carried out by adding pre-trained embeddings or attention mechanisms to improve the model’s performance.

Maulita, Erika; Nyale, M Hendri Yan

Jurnal Ilmiah Komputerisasi Akuntansi 2025 Universitas Sains dan Teknologi Komputer

In the investment world, stock returns are the leading indicator of a company’s performance and the basis for investor decision-making in the capital market. Fluctuations in stock returns reflect market expectations of the company’s prospects. The retail sector in Indonesia is facing significant pressure from post-pandemic shifts in consumer behavior and increased competition. This study aims to analyze the effect of financial distress, company size, liquidity, operating cash flow, and accounting profit on stock returns in retail sub-sector companies listed on the Indonesia Stock Exchange (IDX) during the period 2021 to 2023. This is causal-associative research with a quantitative approach. The data used are secondary, in the form of financial statements from retail companies. The sampling technique used was purposive, yielding a total of 39 data points from 13 retail companies. Data testing was carried out using SPSS version 24. The results showed that, partially, the variables of financial distress, company size, liquidity, and accounting profit had no significant effect on stock returns. Meanwhile, operating cash flow positively impacts stock returns. These findings indicate that fundamental indicators are not always the main determinants of stock returns. Therefore, investors are also advised to consider external factors such as market sentiment, macroeconomic conditions, and government policies that may have a greater influence on stock performance in the capital market.

Firdaus, Muhammad; Rosyidah, Ulya Anisatur; Handayani, Luluk

Router : Jurnal Teknik Informatika dan Terapan 2025 Asosiasi Profesi Telekomunikasi dan Informatika Indonesia

Sugar consumption in Indonesia remains high, with diabetes affecting 20.4 million people. This condition has prompted the government to introduce an excise policy on Minuman Berpemanis Dalam Kemasan (MBDK) to reduce sugar intake. Social media, particularly the X platform, serves as a medium for the public to express their opinions regarding this policy. This study aims to analyze public sentiment toward the MBDK excise policy using a lexicon-based approach for data labeling and the Multinomial Naive Bayes algorithm with unigram and bigram feature extraction. The initial results show that the highest performance was achieved using 5-Fold Cross Validation, with an average accuracy of 83%, precision of 84%, recall of 75%, and an F1-Score of 77%. After applying data balancing using Stratified Cross Validation combined with Borderline-SMOTE and limiting the features to the 700 most frequent terms, the model’s performance improved. The best results were obtained with 10-Fold Cross Validation, achieving 86% accuracy, 84% precision, 83% recall, and an F1-Score of 83%. These findings indicate that the Multinomial Naive Bayes model can effectively classify public sentiment regarding the MBDK excise policy after the data balancing process.
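The unigram and bigram feature extraction with a cap on the most frequent terms, as described above, could be sketched like this. The helper names and the toy tokens are hypothetical; only the default limit of 700 comes from the abstract.

```python
from collections import Counter


def ngrams(tokens, n_values=(1, 2)):
    """Return all unigrams and bigrams of a token list as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in n_values
        for i in range(len(tokens) - n + 1)
    ]


def top_features(docs, limit=700):
    """Keep only the `limit` most frequent n-grams across all documents."""
    counts = Counter(gram for doc in docs for gram in ngrams(doc))
    return [term for term, _ in counts.most_common(limit)]


# Toy tokenized posts; hypothetical, for illustration only.
posts = [["sugar", "tax", "policy"], ["sugar", "tax", "good"]]
print(ngrams(posts[0]))
print(top_features(posts, limit=3))
```

Restricting the vocabulary this way keeps the Multinomial Naive Bayes feature space small, which may also help when synthetic samples are generated for balancing.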