CyberBERT: A Semantic Search Framework for Security Terminologies Using Transformer Models

Abstract

The rapid expansion of cybersecurity standards and threat intelligence frameworks has fragmented the meaning of security terminology, hindering information retrieval and interoperability across systems. Keyword-based search cannot capture the contextual meaning of security terms, particularly within formal frameworks such as NIST, MITRE ATT&CK, and CWE. This study addresses the problem by proposing CyberBERT, a transformer-based semantic search framework that aligns cybersecurity terminologies through deep contextual representation and ontology-driven reasoning.

Research Objectives: The primary objective is to develop a semantic retrieval model that captures conceptual relationships between security terms beyond lexical similarity.

Methodology: A BERT-based model is fine-tuned on the NIST Glossary corpus using a combination of masked language modeling and triplet loss objectives to produce discriminative semantic embeddings. These embeddings are then aligned with cybersecurity ontologies, including MITRE ATT&CK and CWE, to improve semantic consistency and explainability. Retrieval ranks candidate terms by cosine similarity in a 768-dimensional embedding space and is evaluated with Mean Reciprocal Rank (MRR) and Precision@K.

Results: CyberBERT achieves an MRR of 0.832, outperforming the domain-adapted baselines SecureBERT and CyBERT. Ontology alignment improves semantic accuracy by over 6%, and robustness evaluations indicate resilience against adversarial linguistic perturbations. A t-SNE visualization shows coherent semantic clusters that correspond to the five core NIST Cybersecurity Framework functions.

Conclusions: CyberBERT bridges semantic gaps across cybersecurity terminologies by combining transformer-based contextual learning with ontological reasoning. The framework offers a robust, interpretable, and scalable approach to semantic search, supporting interoperability and knowledge discovery in cybersecurity operations and standards harmonization.
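The retrieval and evaluation steps named in the abstract (cosine-similarity ranking, MRR, Precision@K, and a triplet objective) can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-dimensional toy vectors stand in for the 768-dimensional embeddings, the term names and queries are invented for illustration, and the cosine-distance form of the triplet loss with a margin of 0.5 is an assumption, since the abstract does not specify the distance function or margin.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_terms(query_vec, term_vecs):
    """Return term names sorted by descending cosine similarity to the query."""
    return sorted(term_vecs, key=lambda t: cosine(query_vec, term_vecs[t]), reverse=True)

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant term, or 0.0 if none is retrieved."""
    for position, term in enumerate(ranking, start=1):
        if term in relevant:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(queries, term_vecs):
    """Average reciprocal rank over (query_vector, relevant_set) pairs."""
    scores = [reciprocal_rank(rank_terms(q, term_vecs), rel) for q, rel in queries]
    return sum(scores) / len(scores)

def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k retrieved terms that are relevant."""
    return sum(1 for term in ranking[:k] if term in relevant) / k

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Assumed form: max(0, d(a, p) - d(a, n) + margin) with cosine distance."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical toy embeddings (real embeddings would be 768-dimensional).
TERM_VECS = {
    "phishing": [0.9, 0.1, 0.0],
    "spear phishing": [0.8, 0.2, 0.1],
    "firewall": [0.0, 0.1, 0.9],
}

# Hypothetical queries: (query embedding, set of relevant glossary terms).
QUERIES = [
    ([1.0, 0.0, 0.0], {"phishing"}),
    ([0.0, 0.0, 1.0], {"firewall"}),
    ([1.0, 0.0, 0.0], {"spear phishing"}),
]

if __name__ == "__main__":
    print("MRR:", round(mean_reciprocal_rank(QUERIES, TERM_VECS), 3))
    top = rank_terms([1.0, 0.0, 0.0], TERM_VECS)
    print("P@2:", precision_at_k(top, {"phishing", "spear phishing"}, 2))
```

The triplet loss is zero once the anchor is closer to the positive than to the negative by at least the margin, which is the property that pushes related terms together and unrelated terms apart in the embedding space.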
How to Cite

Sinaga, et al. (2025). CyberBERT: A Semantic Search Framework for Security Terminologies Using Transformer Models. Journal of Information Technology and Computer Science, 1(4). https://doi.org/10.70062/globalscience.v1i4.179

Sinaga, Rudolf; Frangky, "CyberBERT: A Semantic Search Framework for Security Terminologies Using Transformer Models," Journal of Information Technology and Computer Science, vol. 1, no. 4, 2025.

