Introduces a feature-matching objective for LLM fine-tuning that targets sequence-level statistics without requiring reward models or ground-truth verifiers.
March 13, 2026
Original Paper
Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models
arXiv · 2603.12248
The Takeaway
Energy-Based Fine-Tuning (EBFT) offers a path to aligning models on open-ended tasks where simple pass/fail verifiers don't exist. It outperforms SFT by matching features computed from on-policy rollouts, a denser semantic signal than standard token-level cross-entropy training.
From the abstract
Cross-entropy (CE) training provides dense and scalable supervision for language models, but it optimizes next-token prediction under teacher forcing rather than sequence-level behavior under model rollouts. We introduce a feature-matching objective for language-model fine-tuning that targets sequence-level statistics of the completion distribution, providing dense semantic feedback without requiring a task-specific verifier or preference model. To optimize this objective efficiently, we propose …
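The abstract is cut off before the proposed optimizer, but the core idea of a feature-matching objective — comparing sequence-level statistics of model completions against those of reference completions — can be sketched with a minimal first-moment example. Everything below (the function name, the choice of mean features as the matched statistic) is an illustrative assumption, not the paper's actual method.

```python
import numpy as np

def feature_matching_loss(model_feats: np.ndarray, ref_feats: np.ndarray) -> float:
    """Squared distance between mean feature vectors of two sets of completions.

    model_feats: (n_model, d) features of on-policy model rollouts
    ref_feats:   (n_ref, d)   features of reference completions
    Matching only the first moment (the mean) is the simplest instance of
    matching "sequence-level statistics"; richer variants could match
    covariances or kernel embeddings.
    """
    mu_model = model_feats.mean(axis=0)  # average feature of model rollouts
    mu_ref = ref_feats.mean(axis=0)      # average feature of references
    diff = mu_model - mu_ref
    return float(diff @ diff)            # zero iff the mean features coincide

# Toy usage: random "features" standing in for pooled hidden states.
rng = np.random.default_rng(0)
refs = rng.normal(size=(16, 8))
rollouts = rng.normal(loc=0.5, size=(16, 8))  # model is off-target on average
print(feature_matching_loss(rollouts, refs))
```

Because the loss depends only on aggregate statistics, it provides a learning signal even when no per-sequence verifier can say whether an individual completion is correct.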