Abstract. Large-scale deep learning models, particularly Transformer-based architectures, have demonstrated an increasing tendency to memorize training data verbatim. This phenomenon poses significant privacy risks, such as the extraction of Personally Identifiable Information (PII) and the leakage of proprietary datasets. Existing mitigation strategies, such as Differential Privacy (DP), often incur severe utility costs, degrading model accuracy and increasing training latency. This paper proposes a novel framework, Dynamic Entropy-Based Attention Pruning (DEBAP), which identifies and disables attention heads that exhibit high "copy-mechanism" behaviors during training. By analyzing the entropy of attention distributions, we demonstrate that specific heads are disproportionately responsible for memorization. Our experiments on GPT-2 Small trained on WikiText-103 and Vision Transformers (ViT) trained on CIFAR-100 show that DEBAP reduces the success rate of canary extraction attacks by approximately 44.5% while maintaining test set perplexity within 1.5% of the baseline. These findings suggest that privacy-preserving generalization can be achieved through targeted architectural sparsification rather than blanket regularization.
Received: 30 Nov. 2023
Key Words and Phrases: Deep Learning, Memorization, Privacy, Transformer Pruning, Attention Mechanisms, Generalization, Machine Unlearning, AI Governance.
Source: International Journal of Applied Mathematics
ISSN printed version: 1311-1728
ISSN on-line version: 1314-8060
Year: 2023
Volume: 36
Issue: 6
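The head-selection criterion summarized in the abstract — flagging attention heads whose distributions are sharply peaked (low entropy), consistent with copy-mechanism behavior — can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: the function names, the array layout, and the entropy threshold are all hypothetical.

```python
import numpy as np

def mean_attention_entropy(attn):
    """Mean Shannon entropy per head.

    attn: array of shape (heads, queries, keys), where each
    attention row attn[h, q, :] is a probability distribution.
    Returns an array of shape (heads,).
    """
    eps = 1e-12  # guard against log(0)
    ent = -(attn * np.log(attn + eps)).sum(axis=-1)  # (heads, queries)
    return ent.mean(axis=-1)

def low_entropy_heads(attn, threshold):
    """Indices of heads whose mean attention entropy falls below
    a (hypothetical) threshold -- candidates for pruning under a
    DEBAP-style scheme."""
    return np.where(mean_attention_entropy(attn) < threshold)[0]

# Toy example: head 0 attends to a single key (peaked, entropy ~0);
# head 1 attends uniformly over 4 keys (entropy = ln 4 ~ 1.386).
attn = np.array([
    [[1.00, 0.00, 0.00, 0.00]],
    [[0.25, 0.25, 0.25, 0.25]],
])
flagged = low_entropy_heads(attn, threshold=0.5)
print(flagged)  # -> [0]
```

In a training loop, such a criterion would presumably be evaluated on attention maps averaged over a batch, with flagged heads masked out before the next forward pass; the threshold here is illustrative only.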