A new attack method called ProjRes identifies, with nearly 100% accuracy, whether a specific person's data was used to train a federated large language model.
April 24, 2026
Original Paper
Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach
arXiv · 2604.21197
The Takeaway
Federated learning is supposed to protect privacy by keeping data on local devices, but the final global model still leaks its training history. Even with strong differential privacy in place, the projection residual of the gradients remains a fingerprint of the original data. The attack shows that the privacy promised by federated systems can be a mathematical illusion: a malicious actor can verify whether a specific medical record or private conversation was part of the training set just by inspecting the model updates. That puts high-stakes industries such as healthcare and finance at risk of violating data protection laws, and it suggests privacy in distributed training needs a fundamental redesign to withstand this type of inference.
From the abstract
Federated Large Language Models (FedLLMs) enable multiple parties to collaboratively fine-tune LLMs without sharing raw data, addressing challenges of limited resources and privacy concerns. Despite data localization, shared gradients can still expose sensitive information through membership inference attacks (MIAs). However, FedLLMs' unique properties, i.e., massive parameter scales, rapid convergence, and sparse, non-orthogonal gradients, render existing MIAs ineffective. To address this gap, […]
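To make the core idea concrete, here is a minimal, illustrative sketch of a projection-residual-style membership test. This is not the paper's implementation: the function names, the use of a least-squares projection onto the span of observed update gradients, and the thresholding rule are all assumptions made for illustration. The intuition is that a gradient from a sample that actually contributed to training lies (nearly) inside the subspace spanned by the shared updates, so its orthogonal residual is small, while a non-member's gradient leaves a large residual.

```python
import numpy as np

def projection_residual(sample_grad, update_grads):
    """Relative norm of the component of sample_grad orthogonal
    to the subspace spanned by the observed update gradients.

    Illustrative sketch only; the actual ProjRes statistic in the
    paper may be defined differently.
    """
    B = np.stack(update_grads, axis=1)          # (d, k) basis of observed updates
    # Least-squares coefficients of sample_grad in span(B)
    coeffs, *_ = np.linalg.lstsq(B, sample_grad, rcond=None)
    residual = sample_grad - B @ coeffs          # orthogonal component
    return np.linalg.norm(residual) / np.linalg.norm(sample_grad)

def is_member(sample_grad, update_grads, threshold=0.1):
    # Small residual -> gradient explained by the shared updates -> likely member.
    # The threshold is a hypothetical calibration parameter.
    return projection_residual(sample_grad, update_grads) < threshold

# Toy demonstration with synthetic gradients
rng = np.random.default_rng(0)
d, k = 50, 3
updates = [rng.standard_normal(d) for _ in range(k)]
member_grad = 0.7 * updates[0] - 1.2 * updates[2]   # lies in span(updates)
nonmember_grad = rng.standard_normal(d)             # generic direction
print(is_member(member_grad, updates))     # residual ~ 0 -> member
print(is_member(nonmember_grad, updates))  # large residual -> non-member
```

In a real FedLLM setting the gradients are high-dimensional, sparse, and non-orthogonal, which is exactly why the paper argues that naive statistics fail and a dedicated residual-based test is needed.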