SciRepID - Scientific Publication Search

Publication Search

41,520 articles from 397 journals · 1,447 citations tracked

Untung Surapati; Dadang Iskandar Mulyana; Dedi Gunawan; Anggit Purnama

International Journal of Applied Mathematics and Computing 2026 Asosiasi Riset Ilmu Matematika dan Sains Indonesia

Early detection of a potential heart attack is a crucial step in preventing sudden death from heart disease. This research aims to develop an Internet of Things (IoT)-based health monitoring system that measures vital signs in real time and predicts the likelihood of a heart attack from sensor data exported as CSV and processed in RapidMiner with a Support Vector Machine (SVM) algorithm. The system was built using an ESP32 microcontroller connected to a MAX30102 sensor, which measures heart rate and blood oxygen saturation (SpO₂) at the fingertip, and a DHT22 sensor, which measures temperature and humidity. The resulting data is sent to the Blynk application, which displays each parameter in real time. The initial prediction logic was developed using a rule-based method based on medical thresholds for four vital parameters. The data was then used to train an SVM model as a classification system to detect potential heart attacks. Test results showed that the system can identify abnormal conditions with a good level of accuracy and provide early warnings based on changes in vital parameters in real time. This system is expected to be an initial solution for personal health monitoring, especially for individuals at risk of heart disease. It can be further developed with cloud integration and automatic notifications to users' devices.

Rinaldi Bursan

International Journal of Economics and Management Sciences 2026 Asosiasi Riset Ekonomi dan Akuntansi Indonesia

Algorithmic technologies are widely used in contemporary marketing strategies due to the growth of the digital economy. Digital companies can evaluate consumer activity data in real time and provide highly personalized digital experiences thanks to artificial intelligence-based solutions, especially machine learning. In addition to examining how algorithmic governance and surveillance capitalism affect algorithmic personalization, this study looks into how these mechanisms affect consumer engagement, purchase intention, and perceptions of hyperreality within the digital market ecosystem. 356 active users of digital platforms, such as social media and e-commerce, were surveyed as part of this study's quantitative methodology. The links between the constructs in the suggested conceptual model were examined through data analysis using Partial Least Squares Structural Equation Modeling (PLS-SEM). The results show that the development of algorithmic personalization systems is strongly influenced by data-driven capitalism practices and algorithmic governance. Additionally, it has been demonstrated that algorithmic personalization improves customers' sense of hyperreality and increases their interaction with digital platforms. Additionally, the study shows that the most powerful factor influencing purchase intention is consumer interaction. By combining viewpoints from technology, the political economics of data, and hyperreality theory into a thorough empirical framework, these findings add to the body of knowledge on digital marketing.

Santo Dewatmoko; Nadia Rizky Vindiazhari; Zaenal Muttaqien

Jurnal Manajemen Riset Inovasi 2026 Pusat Riset dan Inovasi Nasional

This study examines customer churn prediction in subscription-based telecommunications from a digital marketing perspective using machine learning. The analysis utilizes a secondary dataset of 7,043 customer records that simulate behavioral, contractual, and financial attributes commonly found in telecom services. Three classification algorithms (Logistic Regression, Random Forest, and Gradient Boosting) are applied to model churn behavior. Data preprocessing includes handling missing values, encoding categorical variables, and splitting data into training and testing sets. Model performance is evaluated using accuracy, recall, and ROC-AUC, with emphasis on recall due to its importance in identifying at-risk customers. The results show that Gradient Boosting achieves the highest overall performance with an ROC-AUC of 0.84, while Logistic Regression provides relatively higher recall. Key drivers of churn include short-term contracts, higher monthly charges, and lower service engagement. However, recall remains moderate, indicating limitations in capturing complex behavioral factors. These findings suggest the need to combine predictive models with behavioral insights and highlight the importance of early customer engagement and long-term retention strategies.
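
The two metrics this abstract emphasizes can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the four-row toy data are all assumptions made for illustration. Recall is the share of actual churners that the model flags, and ROC-AUC is computed here from its pairwise definition, the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual positives (churners) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise ROC-AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    rows, which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = churned, 0 = stayed.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(recall(y_true, y_pred))   # 0.5
print(roc_auc(y_true, scores))  # 0.75
```

Recall depends on the chosen threshold while ROC-AUC does not, which is one reason a model can lead on one metric and trail on the other.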

Suyahman Suyahman; Deny Prasetyo; Ahmad Budi Trisnawan; Ardy Wicaksono; Muhamad Furqon

Predictive maintenance (PdM) plays a crucial role in modern industrial systems by minimizing downtime, reducing maintenance costs, and optimizing asset performance. However, many predictive models operate as “black box” systems, limiting transparency and making it difficult for operators to interpret their outputs. This study aims to integrate Explainable Artificial Intelligence (XAI) techniques with Remaining Useful Life (RUL) prediction models to improve both accuracy and interpretability. Various machine learning and deep learning approaches, including Support Vector Machines (SVM), Random Forest (RF), XGBoost, Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN), are employed to predict RUL using real-time sensor data from rotating machinery. XAI methods such as SHAP, LIME, and attention mechanisms are applied to provide human-understandable explanations of model predictions. The models are evaluated based on accuracy, Root Mean Square Error (RMSE), and interpretability scores. The results show that XAI-enhanced models outperform traditional approaches in predictive performance while offering greater transparency. These explanations help maintenance engineers better understand the factors influencing predictions, thereby improving decision-making and trust in the system. Nevertheless, the integration of XAI introduces additional computational complexity, which may pose challenges for large-scale industrial implementation. Overall, this study highlights the potential of combining XAI with RUL prediction to develop more reliable, transparent, and effective predictive maintenance solutions.

Widiastuti, Tiwuk; Richard, Berlien; Maryo Indra, Manjaruni

Journal of Information Technology and Computer Science 2026 International Forum of Researchers and Lecturers

High-dimensional clinical data exhibit complex and non-linear relationships among patient attributes, where outcomes are often influenced by feature interactions rather than isolated variables. However, many existing machine learning models prioritize predictive performance while providing limited interpretability and insufficient insight into interaction structures. This study aims to address this limitation by developing an interpretable and robust framework for feature interaction mining in clinical data. We propose a hybrid tree–neural modeling framework that explicitly captures and ranks feature interactions while maintaining stable predictive performance. Tree-based ensemble models are employed to identify non-linear interaction patterns, while neural representations enhance learning flexibility and generalization. The framework integrates interaction importance analysis, cross-validation–based stability assessment, and evaluation across multiple data splits to ensure robustness and interpretability. Experiments conducted on a real-world high-dimensional clinical dataset demonstrate that the proposed approach achieves consistent predictive performance, with AUC values ranging from 0.628 to 0.641 across five cross-validation folds (mean AUC ≈ 0.633). Performance remains stable under varying train–test splits, indicating strong generalizability. Interaction analysis reveals that a small number of dominant feature interactions (such as age combined with length of hospital stay, and medication count combined with diagnostic information) consistently contribute to model predictions, appearing in over 80% of validation folds. Ablation studies further confirm that removing interaction-aware components leads to noticeable performance degradation, highlighting their importance. In conclusion, this study demonstrates that explicit feature interaction modeling enhances interpretability, stability, and generalization in clinical prediction tasks. The proposed hybrid framework provides a reliable foundation for developing trustworthy and transparent clinical decision-support systems.

Devianto, Yudo; Saragih, Rusmin; Cahyana, Yana

Journal of Information Technology and Computer Science 2026 International Forum of Researchers and Lecturers

This research benchmarks multiple machine learning (ML) algorithms for large-scale loan default prediction using a real-world dataset of 255,000 borrower records, where default cases represent only ~9–12% of total observations. The study addresses the persistent gap in comparative analyses of ML models that balance predictive accuracy, interpretability, and computational efficiency for credit risk assessment. Six algorithmic families (Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost, and Artificial Neural Networks (ANN)) and a Stacked Ensemble were evaluated using standardized preprocessing, hybrid imbalance handling (SMOTE, class weighting, under-sampling), and comprehensive evaluation metrics (AUC, F1, Recall, Precision, PR-AUC, and Brier Score). Empirical results show that Logistic Regression achieved the highest AUC of 0.732, outperforming nonlinear models under the baseline configuration, while LightGBM attained perfect recall (1.0) but low precision (0.116), indicating over-prediction of defaults. Gradient boosting models demonstrated robust calibration (Brier ≈ 0.114–0.116) and the best computational efficiency, with LightGBM showing the fastest training and lowest memory use. CatBoost exhibited strong recall but the slowest computation, and ANN underperformed on tabular data (AUC ≈ 0.56). The Stacked Ensemble delivered balanced results with AUC = 0.664 and improved overall stability. These findings confirm that boosting-based models, particularly LightGBM and CatBoost, offer superior scalability and calibration, whereas Logistic Regression remains a valuable interpretable baseline. The study concludes that effective default prediction requires integrating rebalancing, calibration, and threshold optimization to enhance recall and operational deployment reliability in large-scale credit ecosystems.
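
The pairing of perfect recall with a precision of 0.116 can be read through a small arithmetic sketch. The abstract does not report a confusion matrix, so the counts below are hypothetical: they assume an 11.6% default rate (inside the stated ~9–12% range), apply it to all 255,000 records, and model a classifier that flags every borrower as a defaulter. Under those assumptions precision collapses to the base rate, which is the pattern the abstract describes as over-prediction.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


n_records = 255_000      # dataset size stated in the abstract
default_rate = 0.116     # ASSUMPTION: illustrative value within ~9-12%
n_defaults = round(n_records * default_rate)

# Hypothetical "flag everyone" classifier: every default is caught (no false
# negatives), and every non-default becomes a false positive.
tp = n_defaults
fp = n_records - n_defaults
fn = 0

precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.3f} recall={recall:.1f}")
```

Raising the decision threshold trades some of that recall for precision, which is why the abstract's closing call for threshold optimization follows naturally from these two numbers.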

Deki Marizaldi; M. Herdi Pratama; Lindrianasari Lindrianasari; Tagor Hutapea

International Journal of Social Sciences and Communication 2026 International Forum of Researchers and Lecturers

This study aims to provide a comprehensive analysis of Predictive Policing and its implications for law enforcement transformation in Indonesia, based on an extensive review of its global applications, benefits, and challenges. The study uses qualitative literature and international case study review methods to assess the impact and complexity of implementing digital technologies such as artificial intelligence (AI), machine learning, and big data analytics within a Predictive Policing framework. The results of this review highlight that while Predictive Policing offers significant potential for proactive crime prevention and increased operational efficiency, its implementation is consistently fraught with critical legal, ethical, and technical challenges, including regulatory gaps, risks of algorithmic bias, and data privacy concerns, which are particularly relevant to Indonesia. The findings underscore that public trust and police legitimacy in the context of adopting such technologies are strongly influenced by transparency, strong accountability mechanisms, and community involvement in shaping their use. This study contributes to the growing discourse on digital policing in developing countries and culminates in practical policy recommendations designed to guide the Indonesian police towards the development and implementation of Predictive Policing models that are effective, efficient, and fundamentally respectful of legal and human rights principles.

Ahmad Yuan Arby

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

This study presents ReflectAI, a web-based system designed to automate the creation of teaching materials tailored to students' learning styles using behavior data from a Learning Management System (LMS). Student digital activity data—such as logins, material access, forum participation, assignment submission, and quiz results—are extracted and processed using a Hierarchical Clustering algorithm to categorize students into three learning styles: visual, auditory, and kinesthetic. Based on the clustering results, the system automatically generates personalized learning modules using generative AI (ChatGPT API), aligned with each student's learning preferences. Employing a data-driven system development approach, the system was tested with data from 230 students in a mathematics course. The results show diverse learning style distributions and relevant, tailored content generation. ReflectAI is designed to reduce teachers’ administrative workload and enhance personalized and adaptive learning. This system contributes to educational transformation through deep, data-driven technology integration.

Tengku Syahvina Rival Dini; Rani Chantika; Pebi Mina Husania; Puji Sri Alhirani

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

This research develops a machine learning model to classify customer loyalty using the Random Forest algorithm. Customer churn is a critical issue that reduces revenue and increases acquisition costs. A dataset of 50,000 customers from global e-commerce and subscription platforms was processed through data cleaning, imputation, outlier handling, and class balancing with SMOTE. The Random Forest model was built as a baseline and optimized with hyperparameter tuning. Evaluation using accuracy, precision, recall, and F1-score shows that the optimized model achieved 90.81% accuracy and an F1-score of 83.87%, outperforming previous Naïve Bayes approaches. Feature importance analysis highlights customer service interactions, lifetime value, and demographic factors as key predictors of churn. These findings demonstrate Random Forest’s effectiveness in churn prediction and provide practical insights for customer retention strategies.

Arsyapradana Fadlanabil Bahri; Oddy Virgantara Putra; Dihin Muriyatmoko

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

The increasing sedentary lifestyle in the digital era has the potential to cause various health problems due to lack of physical activity. One approach that can be taken to encourage physical activity is through the use of digital games with body movement-based control mechanisms. This study aims to develop a body gesture-based game character control system using a hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model. CNN is used to extract spatial features from each video frame, while LSTM serves to model the temporal relationship between frames so that movement patterns can be recognized sequentially. The research method used refers to the Machine Learning Lifecycle stages, starting from data collection, preprocessing, model development, to implementation in the endless runner game genre. Testing results show that the CNN–LSTM model is capable of classifying body gestures and generating outputs that can be used as commands to control game characters. The implementation of this system enables more natural and interactive game interactions without conventional input devices, and has the potential to encourage players to lead a more active lifestyle.

Dihin Muriyatmoko; Aziz Musthafa; Yusuf Al Banna

Prosiding Seminar Nasional Ilmu Teknik 2026 Asosiasi Riset Ilmu Teknik Indonesia

Sentiment analysis on social media is widely used to represent public perceptions of sports performance, particularly in international competitions. This study aims to analyze the sentiment of YouTube user comments regarding the performance of the Indonesian National Football Team during the FIFA World Cup 2026 Asian Qualifiers. The data were collected from user comments on videos related to the matches and analyzed using a machine learning–based sentiment analysis approach. Sentiment classification was performed using the Naive Bayes algorithm. The results indicate that the proposed approach is able to effectively identify public sentiment toward the national team’s performance during the qualification matches. The findings of this study are expected to provide insights into public perceptions and contribute to sentiment analysis research in the field of sports.

Imakulata Kresnawati M Bili; I Wayan Sudiarta; Maria Yuditia Wungabelen; Ni Kadek Alika Rosdiana; Putri Rafiana

Jurnal Bisnis Inovatif dan Digital 2026 Asosiasi Riset Ilmu Manajemen Kewirausahaan dan Bisnis Indonesia

Customer churn is a strategic challenge for digital streaming platforms because it directly impacts revenue and business sustainability. This study aims to analyze the factors influencing customer churn and develop a churn prediction model using the Random Forest algorithm. The study uses a quantitative approach with an explanatory design and utilizes secondary data from the Netflix Customer Churn and Engagement Dataset available on Kaggle. The dataset consists of 1,000 customer records with 16 variables covering demographic characteristics, service usage behavior, financial condition, and customer satisfaction level. The data was processed through preprocessing, one-hot encoding, and a 70:30 split between training and test data. Model performance was evaluated using accuracy, precision, recall, F1-score, and ROC-AUC metrics. The results show that the Random Forest model produces an accuracy of 53.7%, precision of 56.3%, recall of 63.6%, F1-score of 59.7%, and ROC-AUC of 0.534, indicating moderate predictive ability that is only slightly better than random classification. Feature importance analysis revealed that user engagement levels, such as viewing duration and frequency of interactions, were the most dominant factors influencing churn, followed by economic factors and customer satisfaction. The results of this study are expected to provide a basis for streaming platforms to design more effective customer retention strategies.
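
The reported F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the figures given in the abstract; the function name is an illustrative choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (56.3% and 63.6%).
f1 = f1_score(0.563, 0.636)
print(f"{f1:.3f}")  # 0.597
```

The result agrees with the 59.7% stated in the abstract, so the three figures are internally consistent.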

Tiara Bela Harahap; Lailan Sofinah Harahap; Naina Nazwa Hasibuan

Polygon : Jurnal Ilmu Komputer dan Ilmu Pengetahuan Alam 2026 Asosiasi Riset Ilmu Matematika dan Sains Indonesia

Rainfall is a crucial factor in the stability of the Earth's ecosystem and has a significant impact on agriculture, forestry, energy, and water management. However, an increasingly unstable climate makes rainfall patterns difficult to predict accurately using traditional methods. The city of Medan, the capital of North Sumatra Province, has a tropical rainforest climate with an average annual rainfall of approximately 2,200 mm and an average temperature of 27°C. Significant weather fluctuations in this area can trigger flooding when rainfall increases and cause water shortages when rainfall decreases (BMKG, 2021). Therefore, a prediction approach that can handle non-linear and dynamic data is needed. Artificial Neural Networks (ANN) are a reliable machine learning method for detecting patterns in data. By using the backpropagation algorithm, the model can gradually reduce prediction errors, which is why it is widely used in weather forecasting applications. Accordingly, this study uses an ANN with the backpropagation method to forecast monthly rainfall in Medan City, using data from 2022–2024 for training and testing.

Eva Andini; Lailan Sofinah Harahap; Siti Nurjanah

Saturnus: Jurnal Teknologi dan Sistem Informasi 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

This study examines the development of a Crude Palm Oil (CPO) price forecasting model using an artificial neural network algorithm, specifically the backpropagation algorithm. As one of Indonesia’s main export commodities, CPO has a significant economic impact and influences the income of oil palm farmers. The CPO price data used in this study were obtained from CIF Rotterdam, covering the period from January 2019 to December 2023. The research methodology consists of several stages, including data collection, preprocessing, model design, and model implementation using Python programming. The training results of the backpropagation algorithm show an error value of 0.537829578 after 1,000 epochs, while the evaluation using Mean Squared Error (MSE) indicates an MSE of 0.022709 during the training process and 0.017604 during the testing process. The model also produces CPO price predictions for the next three months, namely 932.578 for the first month, 949.568 for the second month, and 774.855 for the third month. These findings indicate that the developed model is capable of predicting future CPO prices with adequate accuracy, which can assist companies in making better financial decisions and managing risks associated with CPO price fluctuations.

Agung Narayana Adhi Putra; I Wayan Sudiarsa; I Kadek Adi Gunawan; Kadek Bagus Karunia Dwi Dharmayasa; I Wayan Eka Saputra

Saturnus: Jurnal Teknologi dan Sistem Informasi 2026 Asosiasi Riset Teknik Elektro dan Informatika Indonesia

The retail industry generates an extremely large and continuously growing volume of transactional data along with the advancement of digital technology, thereby requiring sophisticated and systematic data analysis approaches to support effective and evidence-based business decision-making. This study aims to analyze retail sales data by utilizing the Retail Sales Dataset obtained from the Kaggle platform, which consists of 100,000 transaction records and broadly represents the characteristics of retail transactions. The main focus of this study is to classify product categories and predict customer segments, including the identification of high-spending customers (high spenders), based on demographic attributes such as age and gender, as well as various transaction-related features. The research methodology includes data preprocessing, label encoding, and feature engineering to generate additional variables, including Age_Group, Is_Holiday, and Spender_Group, which are expected to enhance the predictive capability of the models. Several machine learning algorithms, namely Decision Tree, Random Forest, and XGBoost, were implemented and evaluated to compare their respective performance. The experimental results indicate that multiclass product category classification achieves relatively low accuracy, ranging from 27% to 34%. These findings suggest the high complexity of retail data and highlight the need for further model optimization, class balancing techniques, and feature refinement to improve predictive performance in future studies.

Nadeerah Hani’ Fauziyyah; I Wayan Sudiarsa; Ida Ayu Eka Sastradewi; Kadek Agustine Yueyin Parisya; Sartika Sartika

Jurnal Manajemen Bisnis Digital Terkini 2026 Asosiasi Riset Ilmu Manajemen Kewirausahaan dan Bisnis Indonesia

Because it directly impacts revenue, customer loyalty, and long-term business sustainability, customer churn is a critical issue for the e-commerce industry. High churn rates indicate that a business is unable to retain existing customers, forcing it to spend more on acquiring new ones. Therefore, a precise analytical approach is needed to identify the behavior patterns of customers who are likely to churn. Using machine learning methods, this study analyzes and predicts customer churn. The study uses the E-Commerce Customer Churn 2025 dataset obtained from Kaggle, which consists of 10,000 customer records with fifteen variables covering transaction behavior, customer characteristics, and churn status. The research comprised data preprocessing, descriptive analysis, exploratory data analysis (EDA), and classification model development using Logistic Regression and Random Forest algorithms. Models were evaluated using a Confusion Matrix and a Receiver Operating Characteristic (ROC) Curve to assess their accuracy and ability to distinguish between churned and non-churned customers. The results showed that the Random Forest model performed better than Logistic Regression, with an ROC-AUC of 1.00. Furthermore, feature importance analysis revealed that the days_since_last_purchase variable was the most dominant factor in predicting customer churn. These findings are expected to help e-commerce companies design more effective, data-driven customer retention strategies.

M. Fiqram Chan Safetra; Nayla Desviona; Helmina Helmina; Amelia Rianti; M.Rezan Prayogi

Algoritma : Jurnal Matematika, Ilmu pengetahuan Alam, Kebumian dan Angkasa 2026 Asosiasi Riset Ilmu Matematika dan Sains Indonesia

Graph theory, as a branch of discrete mathematics, has seen significant development in its application to modern complex network systems, particularly digital social networks and transportation systems. This research aims to analyze fundamental concepts of graph theory, examine the characteristics of cycle detection algorithms along with their computational complexity, investigate their application in digital social network analysis, and explore their implementation in digital transportation system optimization. The research method employs a qualitative approach with library research focusing on scientific literature from the 2020-2025 period drawn from accredited academic databases such as Scopus, Web of Science, and IEEE Xplore, utilizing thematic analysis techniques to identify meaningful patterns in the examined literature. Research findings indicate that fundamental graph theory concepts, including vertices, edges, and graph classifications, form the foundation for relational structure modeling. Cycle detection algorithms such as Depth-First Search, Union-Find, and Tarjan's algorithm demonstrate effectiveness with O(V+E) complexity for large-scale graphs. Applications in digital social networks facilitate community identification through Multi-View Clustering, centrality analysis for influencer detection, and understanding of viral information dissemination patterns. Implementation in digital transportation systems demonstrates route planning optimization using the Dijkstra and Bellman-Ford algorithms, vulnerability analysis through articulation point and bridge identification, and bottleneck detection with betweenness centrality. The research concludes that integration of graph theory in discrete mathematics education enhances critical thinking skills and understanding of real-world applications, with recommendations for algorithm development for massive dynamic graphs and machine learning integration in graph algorithm optimization.
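
Depth-First Search cycle detection, the first algorithm named in this abstract, can be sketched as follows. This is an illustrative implementation for directed graphs, not code from the reviewed literature: the adjacency-list dictionary format and the function name are assumptions. Each vertex is colored white (unvisited), gray (on the current search path), or black (fully explored), and an edge into a gray vertex signals a cycle. Every vertex and edge is examined once, giving the O(V+E) running time the abstract cites.

```python
from typing import Dict, Hashable, List

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored


def has_cycle(graph: Dict[Hashable, List[Hashable]]) -> bool:
    """Return True if the directed graph (adjacency lists) contains a cycle.

    Uses an explicit stack instead of recursion so that deep graphs do not
    hit Python's recursion limit.
    """
    color: Dict[Hashable, int] = {node: WHITE for node in graph}
    for start in graph:
        if color[start] != WHITE:
            continue
        color[start] = GRAY
        stack = [(start, iter(graph.get(start, [])))]
        while stack:
            node, neighbors = stack[-1]
            for nxt in neighbors:
                state = color.get(nxt, WHITE)
                if state == GRAY:
                    return True  # back edge to a vertex on the current path
                if state == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(graph.get(nxt, []))))
                    break  # descend into nxt before finishing node
            else:
                # All neighbors handled: node is fully explored.
                color[node] = BLACK
                stack.pop()
    return False


# Hypothetical example graphs.
print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))       # True
print(has_cycle({"a": ["b", "c"], "b": ["c"], "c": []}))     # False
```

For undirected graphs the same idea applies with one adjustment, ignoring the edge back to the immediate parent; Union-Find is a common alternative in that setting.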

Asro Asro; Solihin Solihin; Irlon Irlon

Big Data Analytics and Data Science 2026 Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

Real-time decision-making applications, such as those used in autonomous vehicles, smart cities, and industrial IoT, require fast, scalable, and accurate analytics to ensure timely responses and optimized operations. Traditional cloud-based systems face significant challenges in meeting these requirements due to high latency, limited scalability, and bottlenecks in data processing. This study explores the use of a hybrid Edge-Cloud architecture to optimize end-to-end machine learning (ML) pipelines for real-time applications. The proposed system offloads time-sensitive tasks to edge devices, while computationally intensive processes are handled by the cloud, ensuring efficient use of resources and reduced latency. Experimental results demonstrate that the hybrid model reduces inference latency by up to 70% compared to cloud-only systems, while maintaining model accuracy and increasing throughput. Additionally, the scalability of the hybrid architecture is highlighted, as it can handle large-scale data streams and adapt to varying workloads. The findings show that hybrid Edge-Cloud architectures are well suited for applications where fast decision-making is critical, such as autonomous systems and real-time analytics in smart cities. However, challenges remain in managing resources across edge and cloud systems, particularly in balancing computational loads and ensuring system reliability. Future research should focus on optimizing task partitioning, integrating advanced edge AI models, and exploring the use of 5G networks to enhance performance further. Overall, the study demonstrates the potential of hybrid Edge-Cloud systems in overcoming the limitations of traditional cloud-based ML pipelines and provides insights into the future of real-time data processing.

Rinna Rachmatika; Kecitaan Harefa

Indonesian Journal of Infomatics 2026 Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

Concept drift, the phenomenon where the statistical properties of data streams change over time, poses a significant challenge in machine learning, particularly for long term data streams. Traditional machine learning models, including batch learning and non-adaptive approaches, struggle to detect and adapt to these changes, leading to degraded performance and inaccurate predictions. This study proposes an adaptive computational model designed to detect and respond to concept drift using incremental learning techniques and statistical drift detection mechanisms. The model integrates an Adaptive Drift Detector (ADD) and Incremental Learning System, enabling real-time adjustments to data distribution changes. The model is evaluated across synthetic and real-world datasets, demonstrating its superior ability to detect abrupt, gradual, and recurring drifts compared to traditional models. Experimental results indicate that the adaptive model maintains high prediction accuracy, minimizes false positive rates, and reduces detection delays. Furthermore, the model performs well in resource-constrained environments, making it suitable for real-time applications such as healthcare prediction, fault detection, and IoT systems. Despite its promising performance, the study identifies challenges related to computational complexity and the model’s performance with imbalanced datasets and noisy data. Future research should focus on optimizing the model’s scalability, computational efficiency, and adaptability to more complex data types to ensure broader applicability in dynamic environments. This work contributes to advancing the detection and adaptation of concept drift, offering a robust solution for dynamic and evolving data streams.
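
The abstract does not describe how its Adaptive Drift Detector works internally, so the sketch below substitutes a different, well-known statistical detector, the Page-Hinkley test, purely to illustrate what statistical drift detection on a stream looks like. The class name, the `delta` and `threshold` values, and the synthetic error stream are all assumptions.

```python
from typing import Iterable, Optional


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    This is a stand-in illustration, NOT the ADD component from the abstract.
    delta is the tolerated deviation; threshold controls sensitivity.
    """

    def __init__(self, delta: float = 0.005, threshold: float = 5.0) -> None:
        self.delta = delta
        self.threshold = threshold
        self.count = 0
        self.mean = 0.0
        self.cumulative = 0.0  # running sum of deviations from the mean
        self.minimum = 0.0     # smallest cumulative value seen so far

    def update(self, value: float) -> bool:
        """Consume one observation; return True when drift is signalled."""
        self.count += 1
        self.mean += (value - self.mean) / self.count  # incremental mean
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


def first_drift(stream: Iterable[float]) -> Optional[int]:
    """Index of the first observation at which drift is signalled, if any."""
    detector = PageHinkley()
    for index, value in enumerate(stream):
        if detector.update(value):
            return index
    return None


# Hypothetical per-sample error indicator: the model is always right for
# 100 samples, then always wrong (an abrupt drift at index 100).
stream = [0.0] * 100 + [1.0] * 100
print(first_drift(stream))
```

On this synthetic stream the detector stays silent during the stable segment and fires a few samples after the change. In an incremental-learning setup, that signal would typically trigger retraining or a reset of the model on recent data.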

Imam Rangga Bakti; Yola Permata Bunda; Mohammad Muhsin

Big Data Analytics and Data Science 2026 Asosiasi Pengelola Jurnal Informatika dan Komputer Indonesia

Distributed software systems face significant challenges related to data quality due to their complex, decentralized architecture. These systems often involve multiple nodes responsible for processing and storing data, making it difficult to maintain consistency and ensure accurate data across the entire network. In particular, issues like data inconsistency, latency, and data fragmentation are prevalent in distributed environments. To address these challenges, this study proposes an integrated data quality governance strategy that combines real-time monitoring and automated anomaly detection using machine learning models. The proposed strategy aims to improve data consistency, enhance anomaly detection capabilities, and reduce the need for manual intervention, ultimately improving overall data governance in distributed systems. Real-time monitoring ensures immediate identification of data issues as they occur, while machine learning models, such as autoencoders and Isolation Forests, automate the detection of anomalies based on high reconstruction errors and data isolation techniques. The study evaluates the proposed strategy through real-world distributed system scenarios, comparing its effectiveness to traditional approaches like periodic audits and manual validation. Results demonstrate that the integrated approach leads to faster anomaly detection, reduced data inconsistencies, and improved overall system performance. The use of advanced machine learning techniques and real-time analytics significantly enhances the system's ability to maintain high data quality standards across multiple distributed nodes. This strategy has wide-ranging implications for industries that rely on distributed systems, such as finance, healthcare, and IoT, where data integrity is essential for operational success. Future research can focus on integrating more advanced machine learning techniques and optimizing the real-time monitoring framework to handle larger and more complex systems.