CyberBERT: A Semantic Search Framework for Security Terminologies Using Transformer Models

Journal of Information Technology and Computer Science
International Forum of Researchers and Lecturers (IFREL)

📄 Abstract

The rapid expansion of cybersecurity standards and threat intelligence frameworks has led to significant semantic fragmentation among security terminologies, hindering effective information retrieval and interoperability across systems. Traditional keyword-based search approaches are inadequate for capturing the contextual meaning of security terms, particularly within formal frameworks such as NIST, MITRE ATT&CK, and CWE. This study addresses this challenge by proposing CyberBERT, a transformer-based semantic search framework designed to align cybersecurity terminologies through deep contextual representation and ontology-driven reasoning.
Research Objectives: The primary objective of this research is to develop a semantic retrieval model capable of understanding conceptual relationships between security terms beyond lexical similarity.
Methodology: The proposed methodology fine-tunes a BERT-based model on the NIST Glossary corpus using a combination of masked language modeling and triplet loss objectives to generate discriminative semantic embeddings. These embeddings are further aligned with cybersecurity ontologies, including MITRE ATT&CK and CWE, to enhance semantic consistency and explainability. Semantic retrieval is performed using cosine similarity within a 768-dimensional embedding space and evaluated using Mean Reciprocal Rank (MRR) and Precision@K metrics.
Results: Experimental results demonstrate that CyberBERT achieves an MRR of 0.832, outperforming domain-adapted baselines such as SecureBERT and CyBERT. The integration of ontology alignment improves semantic accuracy by over 6%, while robustness evaluations confirm resilience against adversarial linguistic perturbations. Visualization using t-SNE reveals coherent semantic clustering aligned with the five core NIST Cybersecurity Framework functions.
Conclusions: CyberBERT effectively bridges semantic gaps across cybersecurity terminologies by combining transformer-based contextual learning with ontological reasoning. The framework offers a robust, interpretable, and scalable solution for semantic search, supporting improved interoperability and knowledge discovery in cybersecurity operations and standards harmonization.
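
To make the retrieval and evaluation steps named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes toy 3-dimensional vectors in place of the 768-dimensional embeddings, a triplet loss over cosine distance with a hypothetical margin of 0.2 (the abstract specifies neither the distance nor the margin), and made-up term names. All function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_terms(query_vec, term_vecs):
    """Return term names sorted by descending cosine similarity to the query."""
    scored = [(cosine_similarity(query_vec, vec), term) for term, vec in term_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [term for _, term in scored]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance; the margin value is an assumption."""
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def mean_reciprocal_rank(rankings, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    total = 0.0
    for ranked, relevant in zip(rankings, relevant_sets):
        for position, term in enumerate(ranked, start=1):
            if term in relevant:
                total += 1.0 / position
                break
    return total / len(rankings)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for term in ranked[:k] if term in relevant) / k


# Toy embeddings (hypothetical values, 3 dimensions instead of 768).
term_vecs = {
    "phishing": [1.0, 0.0, 0.0],
    "spearphishing": [0.9, 0.1, 0.0],
    "buffer overflow": [0.0, 1.0, 0.0],
    "encryption": [0.0, 0.0, 1.0],
}
query_vec = [0.8, 0.2, 0.0]
ranking = rank_terms(query_vec, term_vecs)
```

With real embeddings, `term_vecs` would hold one vector per glossary term and `rank_terms` would be called once per query. MRR rewards placing the first correct term near the top of each ranking, while Precision@K measures how many of the top K results are relevant.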

🔖 Keywords

Semantic Search; Cybersecurity Ontologies; Transformer Models; Terminology Alignment; Semantic Interoperability

ℹ️ Publication Information

Publication Date
29 December 2025
Volume / Number / Year
Volume 1, Number 4, Year 2025

📝 HOW TO CITE

Sinaga, Rudolf; Frangky, "CyberBERT: A Semantic Search Framework for Security Terminologies Using Transformer Models," Journal of Information Technology and Computer Science, vol. 1, no. 4, Dec. 2025.

