Canonical Correlation Analysis (CCA) can reduce image representation dimensionality by 75% while actually improving downstream performance through cross-model agreement.
April 2, 2026
Original Paper
Representation Selection via Cross-Model Agreement using Canonical Correlation Analysis
arXiv · 2604.00921
The Takeaway
The method provides a training-free way to distill and refine overcomplete embeddings from massive pre-trained models, letting practitioners use smaller, more efficient vectors in downstream search or classification pipelines without the cost of fine-tuning.
From the abstract
Modern vision pipelines increasingly rely on pretrained image encoders whose representations are reused across tasks and models, yet these representations are often overcomplete and model-specific. We propose a simple, training-free method to improve the efficiency of image representations via a post-hoc canonical correlation analysis (CCA) operator. By leveraging the shared structure between representations produced by two pre-trained image encoders, our method finds linear projections that ser
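The core operation described in the abstract can be sketched in a few lines: compute CCA between two encoders' embedding matrices and keep only the top canonical directions. The sketch below is an assumption about the mechanics, not the paper's implementation; the encoder outputs are simulated with random linear views of a shared latent, and the whitening-plus-SVD route is one standard way to compute CCA.

```python
import numpy as np

def cca_projection(X, Y, k, eps=1e-8):
    """Rank-k CCA projection for two views X (n x dx) and Y (n x dy).

    Returns Wx (dx x k), which maps X's centered embeddings into the
    k-dimensional shared canonical subspace, plus the top-k canonical
    correlations.
    """
    # Center both representation matrices
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Whiten each view via SVD (eps guards against tiny singular values)
    Ux, Sx, Vxt = np.linalg.svd(Xc, full_matrices=False)
    Uy, Sy, Vyt = np.linalg.svd(Yc, full_matrices=False)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations between the two views
    U, corrs, Vt = np.linalg.svd(Ux.T @ Uy)

    # Map the canonical directions back to X's feature space, keep top-k
    Wx = Vxt.T @ np.diag(1.0 / (Sx + eps)) @ U[:, :k]
    return Wx, corrs[:k]

# Toy stand-in for two pre-trained encoders: noisy linear views of a
# shared 16-dim latent structure
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 16))
X = Z @ rng.normal(size=(16, 64)) + 0.1 * rng.normal(size=(500, 64))
Y = Z @ rng.normal(size=(16, 48)) + 0.1 * rng.normal(size=(500, 48))

# 64 -> 16 dimensions: the 75% reduction mentioned above
Wx, corrs = cca_projection(X, Y, k=16)
X_small = (X - X.mean(axis=0)) @ Wx
print(X_small.shape)  # (500, 16)
```

Because the two views share latent structure, the top canonical correlations are close to 1, and the 16 retained directions capture the agreement between the "encoders" while discarding view-specific noise.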