AI & ML Efficiency Breakthrough

S-VGGT introduces structure-aware subscene decomposition to break the quadratic scaling bottleneck of 3D foundation models.

March 19, 2026

Original Paper

S-VGGT: Structure-Aware Subscene Decomposition for Scalable 3D Foundation Models

Xinze Li, Pengxu Chen, Yiyuan Wang, Weifeng Su, Wentao Cheng

arXiv · 2603.17625

The Takeaway

S-VGGT replaces global attention with parallel processing of geometrically bridged subscenes. This avoids the quadratic memory and compute cost of attending over the full input, so 3D reconstruction can scale to dense captures and high-fidelity 3D foundation models become more practical.
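To make the scaling argument concrete, here is a minimal back-of-the-envelope sketch. It is not the paper's implementation: it only counts pairwise attention interactions, assumes tokens are split as evenly as possible across subscenes, and ignores whatever cost the geometric bridging between subscenes adds. The function names are hypothetical and chosen for illustration.

```python
def global_attention_pairs(num_tokens: int) -> int:
    """Pairwise interactions when every token attends to every token."""
    return num_tokens * num_tokens


def subscene_attention_pairs(num_tokens: int, num_subscenes: int) -> int:
    """Pairwise interactions when attention is restricted to each subscene.

    Assumption: tokens are divided as evenly as possible across subscenes,
    and any cross-subscene bridging cost is ignored.
    """
    if num_subscenes < 1:
        raise ValueError("num_subscenes must be at least 1")
    base, extra = divmod(num_tokens, num_subscenes)
    # 'extra' subscenes get one additional token so the sizes sum to num_tokens.
    sizes = [base + 1] * extra + [base] * (num_subscenes - extra)
    return sum(size * size for size in sizes)


if __name__ == "__main__":
    n, k = 1000, 4
    print(global_attention_pairs(n))       # 1000000
    print(subscene_attention_pairs(n, k))  # 250000
```

Under these assumptions, splitting N tokens into k equal subscenes reduces the pair count from N^2 to N^2 / k, and the subscenes can be processed independently of one another.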

From the abstract

Feed-forward 3D foundation models face a key challenge: the quadratic computational cost introduced by global attention, which severely limits scalability as input length increases. Concurrent acceleration methods, such as token merging, operate at the token level. While they offer local savings, the required nearest-neighbor searches introduce undesirable overhead. Consequently, these techniques fail to tackle the fundamental issue of structural redundancy dominant in dense capture data. In thi