Integrating Hybrid Statistical and Unsupervised LSTM-Guided Feature Extraction for Breast Cancer Detection

Abstract
Breast cancer is the most prevalent cancer among women worldwide, and early, accurate diagnosis is needed to reduce mortality. This study proposes a hybrid classification pipeline that integrates Hybrid Statistical Feature Selection (HSFS) with unsupervised LSTM-guided feature extraction for breast cancer detection on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. First, 20 features were selected using HSFS, which combines Mutual Information, Chi-square, and Pearson Correlation. To address class imbalance, the training set was balanced using the Synthetic Minority Over-sampling Technique (SMOTE). An LSTM encoder then extracted non-linear latent features from the selected features. The statistical and latent features were fused by concatenation, and the top 30 features were re-selected from the fused set. The final classification was performed using a Support Vector Machine (SVM) with an RBF kernel and evaluated using 5-fold cross-validation and a held-out test set. Experimental results showed that the proposed method achieved an average training accuracy of 98.13%, an F1-score of 98.13%, and an AUC-ROC of 99.55%. On the held-out test set, the model reached an accuracy of 99.30%, a precision of 100%, an F1-score of 99.05%, and an AUC-ROC of 99.73%. The proposed pipeline demonstrates improved generalization and interpretability compared to existing methods such as LightGBM-PSO, DHH-GRU, and ensemble deep networks. These results highlight the effectiveness of combining statistical selection and LSTM-based latent feature encoding in a balanced classification framework.
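The statistical selection and classification stages described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the three criteria are combined, so averaging their ranks is an assumption, as are the helper name `hybrid_rank_select`, the 25% test split, the scalers, and the random seeds. The SMOTE, LSTM encoder, and fusion stages are omitted here. The data come from scikit-learn's bundled copy of WDBC.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC


def hybrid_rank_select(X, y, k=20, seed=0):
    """Hypothetical helper: indices of the k features with the best mean rank
    across mutual information, chi-square, and |Pearson r|.
    The rank-averaging rule is an assumption, not taken from the paper."""
    mi = mutual_info_classif(X, y, random_state=seed)
    # chi2 requires non-negative inputs, so scale each feature to [0, 1] first.
    chi, _ = chi2(MinMaxScaler().fit_transform(X), y)
    pearson = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    # Rank 0 is the best score under each criterion; average the three ranks.
    ranks = [np.argsort(np.argsort(-s)) for s in (mi, chi, pearson)]
    return np.argsort(np.mean(ranks, axis=0))[:k]


# WDBC: 569 samples, 30 features.
X, y = load_breast_cancer(return_X_y=True)

# Assumed split: the abstract mentions a held-out test set but not its size.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Select features on the training portion only, so the test set stays unseen.
selected = hybrid_rank_select(X_train, y_train, k=20)

# RBF-kernel SVM; standardization is an assumed preprocessing step.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 5-fold cross-validation on the training portion.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train[:, selected], y_train, cv=cv)

# Final fit and held-out evaluation.
model.fit(X_train[:, selected], y_train)
test_acc = model.score(X_test[:, selected], y_test)

print("selected feature indices:", sorted(selected.tolist()))
print("mean CV accuracy:", cv_scores.mean())
print("held-out accuracy:", test_acc)
```

Because the features are selected once on the whole training portion before cross-validation, the fold scores in this sketch are slightly optimistic. Moving the selection inside each fold would avoid that, at the cost of a longer example.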
How to Cite

Setiadi, et al. (2025). Integrating Hybrid Statistical and Unsupervised LSTM-Guided Feature Extraction for Breast Cancer Detection. Journal of Computing Theories and Applications, 2(4). https://doi.org/10.62411/jcta.12698

Setiadi, De Rosal Ignatius Moses; Ojugo, Arnold Adimabua; Pribadi, Octara; Kartikadarma, Etika; Setyoko, Bimo Haryo; Widiono, Suyud; Robet, Robet; Aghaunor, Tabitha Chukwudi; Ugbotu, Eferhire Valentine, "Integrating Hybrid Statistical and Unsupervised LSTM-Guided Feature Extraction for Breast Cancer Detection," Journal of Computing Theories and Applications, vol. 2, no. 4, 2025.

