Prakash, Chandra; Sisodia, Avneesh; Lind, Mary
Agentic artificial intelligence (AI) systems, which are capable of autonomous goal-directed behavior, multi-step planning, tool use, multi-agent coordination, and iterative self-correction, represent a transition from passive clinical AI tools toward systems that can participate in complex healthcare workflows. However, empirical evidence remains fragmented across clinical decision support, patient monitoring, and administrative applications, and no systematic synthesis has evaluated which agentic principles have been technically demonstrated and which have accumulated sufficient evidence to support responsible clinical deployment. We conducted a PRISMA-informed systematic review of peer-reviewed empirical studies published between January 2025 and April 2026. Searches across five bibliographic databases and Google Scholar, supplemented by citation tracking, identified 443 unique records for screening, of which 25 met the predefined PICOS and quality appraisal criteria. Evidence was synthesized using an evidence-informed seven-principle framework derived by integrating the agentic AI, clinical AI, and healthcare governance literature. This framework provides a structured lens for examining how agentic principles are evaluated individually and in combination, enabling a deployment-readiness perspective that extends beyond capability-focused assessment. The evidence base was concentrated on technical capability principles, whereas human oversight, safety, compliance, and equity-related evaluation received comparatively limited attention. Most studies remained at the laboratory, benchmark, or proof-of-concept stage, and none reported demographically stratified performance outcomes. Overall, the findings suggest a structural asymmetry in agentic healthcare AI: empirical research is advancing agentic capabilities more rapidly than it is generating evidence for the oversight, safety, equity, and governance mechanisms required for responsible clinical translation.