Abstract
The increasing adoption of collaborative robots in modern manufacturing requires reliable perception systems that ensure both safety and operational efficiency during human–robot collaboration. This study proposes a CNN-based real-time computer vision system for detecting humans and objects in shared robotic workspaces. The work focuses on developing and evaluating a single-stage deep learning detection model optimized for real-time performance without sacrificing detection accuracy. The methodology comprises dataset preparation, model training via transfer learning, real-time system implementation, and comprehensive performance evaluation. Experimental results show that the system achieves high precision, recall, and mean Average Precision (mAP) while keeping inference latency low enough for real-time operation. The system consistently operates above real-time frame-rate thresholds, providing the timely perception updates required for safety-related decision-making in collaborative robotic environments. Graphical and quantitative analyses further confirm that inference performance remains stable under dynamic interaction scenarios involving human movement and multiple objects. Compared with existing approaches, the proposed system offers a balanced trade-off between accuracy and computational efficiency, making it practical for deployment in safety-aware human–robot collaboration. Overall, the findings indicate that CNN-based real-time object detection can effectively support perception and situational awareness in collaborative robotics, contributing to safer and more efficient industrial automation.