Eliminates the need for strictly aligned image pairs when training infrared and visible image fusion models.
March 24, 2026
Original Paper
Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion
arXiv · 2603.21820
The Takeaway
Collecting aligned cross-modal data is costly and labour-intensive; this method instead trains on unaligned or arbitrarily paired data. It reportedly matches the performance of models trained on datasets 100x larger, putting high-performance image fusion within reach of resource-constrained teams.
From the abstract
Infrared and visible image fusion (IVIF) combines complementary modalities while preserving natural textures and salient thermal signatures. Existing solutions predominantly rely on extensive sets of rigidly aligned image pairs for training. However, acquiring such data is often impractical due to the costly and labour-intensive alignment process. Besides, maintaining a rigid pairing setting during training restricts the volume of cross-modal relationships, thereby limiting generalisation perform