Abstract
Human–Robot Interaction (HRI) systems increasingly rely on data-driven approaches to interpret multimodal sensory inputs and support natural interaction. However, purely neural HRI models often suffer from limited interpretability and weak context-aware decision-making, which can erode user trust and limit adaptability in dynamic interaction scenarios. To address these limitations, this study proposes a hybrid neural–symbolic HRI framework that integrates multimodal neural perception with explicit symbolic reasoning to produce adaptive, interpretable robot behavior. The proposed system combines deep neural networks for processing visual, speech, and gesture inputs with a rule-based symbolic reasoning layer that models interaction context, user states, and behavioral constraints. A loosely coupled integration strategy transforms neural outputs into symbolic representations, so that logical inference can guide action selection while preserving perceptual accuracy. The framework was evaluated in controlled HRI experiments comparing a neural-only baseline with the proposed hybrid configuration across multiple interaction scenarios. Experimental results show that the hybrid neural–symbolic system significantly improves interaction accuracy, contextual responsiveness, and user satisfaction, while yielding substantial gains in interpretability. These findings indicate that symbolic reasoning effectively complements neural perception, enhancing transparency and context-aware adaptation without compromising performance. The study concludes that hybrid neural–symbolic architectures offer a promising foundation for trustworthy, adaptive, and human-centered HRI systems.
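To make the loosely coupled integration concrete, the following minimal Python sketch illustrates the neural-to-symbolic handoff the abstract describes: neural confidence scores are grounded into symbolic predicates, and a small rule base then selects an action. All names, thresholds, and rules here (neural_outputs_to_symbols, select_action, the 0.8 cutoff) are hypothetical illustrations under assumed conventions, not the paper's actual implementation.

```python
# Hypothetical sketch of the loosely coupled neural-symbolic pipeline:
# neural perception -> symbolic grounding -> rule-based action selection.
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for grounding neural outputs


def neural_outputs_to_symbols(outputs: dict[str, float]) -> set[str]:
    """Ground neural confidence scores into symbolic predicates."""
    return {label for label, conf in outputs.items() if conf >= CONFIDENCE_THRESHOLD}


@dataclass
class Rule:
    """IF all premises hold THEN conclude an action."""
    premises: frozenset[str]
    action: str


def select_action(symbols: set[str], rules: list[Rule]) -> str:
    """Fire the first rule whose premises are all satisfied by the current symbols."""
    for rule in rules:
        if rule.premises <= symbols:
            return rule.action
    return "request_clarification"  # fallback when no rule fires


# Example: fused outputs from vision / speech / gesture networks (hypothetical values).
perception = {"user_waving": 0.93, "speech_greeting": 0.88, "user_distracted": 0.41}
rules = [
    Rule(frozenset({"user_waving", "speech_greeting"}), "greet_user"),
    Rule(frozenset({"user_distracted"}), "pause_and_wait"),
]

symbols = neural_outputs_to_symbols(perception)  # {"user_waving", "speech_greeting"}
print(select_action(symbols, rules))             # -> greet_user
```

In this reading, the symbolic layer never alters the perceptual estimates themselves; it only consumes the predicates that clear the confidence threshold, which is one plausible way the framework could guide action selection while preserving perceptual accuracy.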