AI & ML Efficiency Breakthrough

Extracts dense 3D Signed Distance Fields from images in under 3 seconds using feed-forward geometry transformer latents.

March 30, 2026

Original Paper

Fus3D: Decoding Consolidated 3D Geometry from Feed-forward Geometry Transformer Latents

Laura Fink, Linus Franke, George Kopanas, Marc Stamminger, Peter Hedman

arXiv · 2603.25827

The Takeaway

Traditional 3D reconstruction requires costly per-scene optimization or post-hoc fusion of per-view predictions. This method instead decodes geometry directly from the latents of a pretrained feed-forward geometry transformer, producing a dense SDF from unstructured, uncalibrated images in under three seconds.

From the abstract

We propose a feed-forward method for dense Signed Distance Field (SDF) regression from unstructured image collections in less than three seconds, without camera calibration or post-hoc fusion. Our key insight is that the intermediate feature space of pretrained multi-view feed-forward geometry transformers already encodes a powerful joint world representation; yet, existing pipelines discard it, routing features through per-view prediction heads before assembling 3D geometry post-hoc, which disc