INTERPRETABLE DEEP REINFORCEMENT LEARNING FOR AUTONOMOUS SYSTEMS: INTEGRATING CAUSAL INFERENCE WITH POLICY GRADIENTS
Abstract
Deep Reinforcement Learning (DRL) faces significant deployment barriers in safety-critical autonomous systems, such as self-driving vehicles and surgical robots, because the opacity of policy decisions leaves failures unexplained and obstructs diagnostics and accountability. This work introduces Causal Policy Optimization (CPO), a framework that addresses this limitation by integrating Structural Causal Models (SCMs) with policy gradient optimization (e.g., PPO). CPO’s core innovation is to use do-calculus-based interventions to modify policy gradients, embedding causal invariances directly into the learning process. Validation across CARLA driving simulations, Safety Gym robotic environments, and physical TurtleBot3 deployments shows that CPO achieves 40-60% higher interpretability than traditional XAI methods (SHAP/LIME), as quantified by the Causal Fidelity Score (CFS = 0.89), while preserving ≥95% of the performance of conventional policies (cumulative return: 9.72 vs. 9.91 for PPO). Crucially, CPO reduces collision rates by 74.8% in edge-case scenarios and generates real-time, auditable causal explanations (e.g., "Emergency braking triggered by pedestrian trajectory (β = 0.67)"). These properties support regulatory compliance and precise liability attribution, advancing trustworthy autonomy in high-stakes applications where human lives depend on transparent decision-making.
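The abstract states only the high-level mechanism (do-calculus interventions reshaping the policy gradient), so the sketch below illustrates one plausible form such a causally regularized update could take, assuming a standard PPO clipped surrogate plus a KL invariance penalty under interventions on features assumed to be non-causal. The names `PolicyNet`, `intervene`, `spurious_idx`, and `lambda_causal` are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a causally regularized PPO update; the exact
# CPO loss is not given in the abstract, so this is illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Small MLP producing logits over discrete actions."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.body(obs)  # action logits

def intervene(obs: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
    """Toy do()-style intervention: resample the features assumed to be
    spurious (indices `idx`) from a reference distribution, leaving the
    assumed causal parents of the action untouched."""
    out = obs.clone()
    out[:, idx] = torch.randn(obs.size(0), idx.numel())
    return out

def cpo_loss(policy, obs, actions, old_logp, advantages,
             spurious_idx, clip_eps=0.2, lambda_causal=1.0):
    """PPO clipped surrogate minus a causal-invariance penalty: the KL
    divergence between the factual action distribution and the
    distribution under an intervention on the spurious features."""
    logits = policy(obs)
    logp_all = F.log_softmax(logits, dim=-1)
    logp = logp_all.gather(1, actions.unsqueeze(1)).squeeze(1)
    ratio = torch.exp(logp - old_logp)
    surrogate = torch.min(
        ratio * advantages,
        torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages,
    ).mean()

    # Invariance term: interventions on non-causal features should not
    # change the policy's action distribution.
    with torch.no_grad():
        obs_do = intervene(obs, spurious_idx)
    logp_do = F.log_softmax(policy(obs_do), dim=-1)
    causal_penalty = F.kl_div(logp_do, logp_all, log_target=True,
                              reduction="batchmean")

    # Negated so that gradient descent maximizes the regularized objective.
    return -(surrogate - lambda_causal * causal_penalty)
```

Under this reading, the penalty pushes the policy toward identical action distributions with and without interventions on the assumed-spurious features, which is one concrete interpretation of "embedding causal invariances directly into the learning process."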