Mathematical proof that cosine similarity between label representations (unembeddings) in softmax classifiers is fundamentally uninformative.
April 1, 2026
Original Paper
Why not to use Cosine Similarity between Label Representations
arXiv · 2603.29488
The Takeaway
The paper demonstrates that any softmax model can be transformed into an equivalent model in which the cosine similarities between label representations are flipped (e.g., from 1 to -1) without changing the output probabilities. This warns practitioners against relying on raw cosine similarity for tasks such as zero-shot classification or semantic probing without proper centering.
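As a sketch of why this holds (a hypothetical two-label toy example under the summary's claim, not the paper's exact construction): shifting every unembedding by the same vector adds an identical scalar to every logit, which softmax cancels, yet a suitable shift can flip the cosine similarity between two labels from nearly 1 to exactly -1.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Two-label toy classifier: unembeddings w1, w2 with cosine similarity near 1.
w1 = np.array([1.0, 0.9, 1.1])
w2 = np.array([1.0, 1.1, 0.9])

# Shift every unembedding by the same vector c. This adds the shared scalar
# c @ x to every logit, which softmax ignores, so probabilities are unchanged.
c = -(w1 + w2) / 2           # chosen so the shifted vectors are exact negatives
w1s, w2s = w1 + c, w2 + c

x = np.array([0.3, -1.2, 0.7])    # arbitrary input representation
p_before = softmax(np.array([w1 @ x, w2 @ x]))
p_after  = softmax(np.array([w1s @ x, w2s @ x]))

print(cosine(w1, w2))                  # close to 1
print(cosine(w1s, w2s))                # exactly -1
print(np.allclose(p_before, p_after))  # True: identical output probabilities
```

The same shift applied to the whole unembedding matrix generalizes this to any number of labels, which is why raw (uncentered) cosine similarity between unembeddings carries no behavioural information.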
From the abstract
Cosine similarity is often used to measure the similarity of vectors. These vectors might be the representations of neural network models. However, it is not guaranteed that the cosine similarity of model representations will tell us anything about model behaviour. In this paper we show that when using a softmax classifier, be it an image classifier or an autoregressive language model, measuring the cosine similarity between label representations (called unembeddings in the paper) does not give any information about model behaviour.