Identifies 'label leakage' from limited task diversity as the primary bottleneck for relational foundation models, rather than raw data volume.
April 1, 2026
Original Paper
Task Scarcity and Label Leakage in Relational Transfer Learning
arXiv · 2603.29914
The Takeaway
Practitioners building tabular or relational models often focus on adding more data. This work shows that data volume alone is not enough: unless prediction targets are diversified and gradient projection is used to suppress task-specific shortcuts, the learned representations fail to transfer.
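To make the idea of gradient projection concrete, here is a minimal sketch of one generic form: removing from a gradient its component along a direction associated with a shortcut. This summary does not describe the paper's actual projection method, so this is a plain orthogonal projection and not the authors' procedure. The function name `project_out` and the notion of a single `shortcut_dir` vector are assumptions made for illustration.

```python
import numpy as np


def project_out(grad, shortcut_dir, eps=1e-12):
    """Return `grad` with its component along `shortcut_dir` removed.

    Hypothetical helper: a generic orthogonal projection, not the
    paper's method. `shortcut_dir` stands in for any direction the
    update should not move along (an assumption for illustration).
    """
    grad = np.asarray(grad, dtype=float)
    shortcut_dir = np.asarray(shortcut_dir, dtype=float)

    norm_sq = float(shortcut_dir @ shortcut_dir)
    if norm_sq < eps:
        # Degenerate direction: nothing to project out.
        return grad.copy()

    # Subtract the projection of grad onto shortcut_dir.
    coeff = float(grad @ shortcut_dir) / norm_sq
    return grad - coeff * shortcut_dir


if __name__ == "__main__":
    g = project_out([3.0, 4.0], [1.0, 0.0])
    # The result is orthogonal to the shortcut direction.
    print(g, float(g @ np.array([1.0, 0.0])))
```

After projection the update has zero inner product with the chosen direction, so a step along it cannot, to first order, move the parameters along that direction.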
From the abstract
Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a modular architecture combining frozen pretrained tabular encoders with a lightweight message-passing