CROSS-MODAL FEDERATED LEARNING WITH DIFFERENTIAL PRIVACY FOR REAL-TIME CLINICAL DECISION SUPPORT
Abstract
AI-based brain tumor segmentation faces an inherent tension between diagnostic accuracy and the data privacy requirements that prevent multi-site collaboration. Existing federated learning techniques lack strong privacy guarantees and operate only on image data, without leveraging clinical text. We propose a privacy-preserving cross-modal federated learning framework that integrates MRI images and clinical text using ResNet-50 and BERT encoders with cross-modal attention fusion, secured through differential privacy with gradient clipping and Gaussian noise injection. Evaluated on 7,023 brain MRI images across simulated federated environments with up to 20 clients, our solution achieves 92.8% accuracy under IID data, 89.1% under non-IID data, and 87.3% under differential privacy (ε=1.0, δ=1×10⁻⁵), the last corresponding to 92.7% retention of centralized performance. Multimodal fusion yields a 6.9% improvement over vision-only methods, and sub-500ms inference makes real-time deployment feasible. This work introduces a novel cross-modal attention mechanism for federated medical imaging under strict differential privacy that maintains clinical utility, tests it rigorously across heterogeneous data distributions, and demonstrates a scalable, flexible framework with mathematical privacy guarantees against membership inference attacks.
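The privacy mechanism named in the abstract, gradient clipping followed by Gaussian noise injection, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes noise is added once, on the server, to the sum of clipped client updates, and it calibrates the noise with the classical Gaussian-mechanism bound σ = C·√(2 ln(1.25/δ))/ε. That bound is formally stated for ε < 1, so ε = 1.0 sits at its edge, and the sketch ignores privacy composition across training rounds. The function names (`clip_update`, `gaussian_sigma`, `private_average`) and the clipping norm `clip_norm` are hypothetical.

```python
import math
import random


def clip_update(update, clip_norm):
    """Rescale a client update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, clip_norm):
    """Noise std from the classical Gaussian mechanism (assumed calibration).

    The sensitivity of a sum of clipped updates to adding or removing one
    client is clip_norm, so sigma scales linearly with it.
    """
    return clip_norm * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_average(client_updates, clip_norm, epsilon, delta, rng):
    """Clip each client update, sum, add Gaussian noise, then average."""
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    dim = len(clipped[0])
    summed = [sum(u[i] for u in clipped) for i in range(dim)]
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    # 20 toy clients with 4-dimensional updates (illustrative data only)
    updates = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(20)]
    avg = private_average(updates, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=rng)
    print(avg)
```

Because the noise is added to the sum and then divided by the number of clients, its per-coordinate standard deviation in the averaged update shrinks as more clients participate. This is one reason accuracy under differential privacy tends to depend on the size of the federation.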