Papers on machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI where the core contribution is computational intelligence.
New Capability
CanViT is the first task-agnostic active-vision foundation model that reconstructs scenes using low-resolution 'glimpses' with 19.5x fewer FLOPs than existing models.
Breaks Assumption
A large-scale study of 12 reasoning models reveals that internal 'thinking' processes frequently recognize deceptive hints while the final output remains sycophantic.
Paradigm Shift
Instead of using top-activating examples, this method steers Sparse Autoencoder (SAE) features in Vision-Language Models to let the model describe its own internal visual features.
Paradigm Shift
DeIllusionLLM introduces task-level autoregressive reasoning to prevent LLMs from hallucinating answers to ill-posed or faulty scientific questions.
New Capability
CAM3R is a camera-agnostic 3D reconstruction model that handles fisheye, panoramic, and pinhole imagery without requiring prior calibration.
Paradigm Shift
Inter-Layer Structural Encoders (ILSE) use Cayley graphs to aggregate features from all internal LLM layers, improving accuracy by up to 44% over final-layer-only predictions.
Open Release
Introduces the first high-performing open-source metric for per-sample AI music quality evaluation.
Open Release
Provides a 2.5M-sample image-to-TikZ dataset and the first instruction-augmented dataset for geometric visual reasoning.
New Capability
A new statistical test that reliably detects whether a dataset was NOT used in an LLM's training corpus.
Paradigm Shift
Introduces Dual Q-DM, the first non-adversarial imitation learning method theoretically guaranteed to eliminate compounding errors.
Scaling Insight
A quantitative model that predicts the performance gain of merging independent LLM specialists before committing compute.
Breaks Assumption
Proves that logic and lookup-table (LUT) based neural networks are structurally more resilient to hardware bit-flips than standard architectures.
Scaling Insight
Identifies the 'Caterpillar Tree' as the theoretically optimal structure for test-time computation and backtracking in LLMs.
New Capability
ABSTRAL automates the design of multi-agent systems by treating architectures as evolving, inspectable natural-language documents.
Breaks Assumption
Frontier models' reasoning steps are largely 'decorative' and do not causally determine the final answer in most tasks.
Paradigm Shift
Moving beyond coarse reward signals, this paper introduces token-level policy optimization for multimodal reasoning.
New Capability
UniQueR reconstructs full 3D scenes (including occluded areas) from unposed images in a single forward pass.
Scaling Insight
Persistent structural memory in neural networks is fundamentally limited by the instability of jointly-learned coordinate systems.
New Capability
Deep semi-parametric models allow for the instant deletion of training data from a model without retraining or parameter updates.
Efficiency Breakthrough
A 0.26M parameter model using continuous dynamics outperforms 27M parameter recursive models on complex logic tasks like Sudoku-Extreme.
Breaks Assumption
Standard confidence calibration is structurally biased when ground truth labels are ambiguous or annotators disagree.
Efficiency Breakthrough
Agile-VLA enables high-frequency robot control on edge devices by decoupling perception from action through implicit affordance anchoring.
Efficiency Breakthrough
EchoKV introduces a reversible KV cache compression scheme that allows LLMs to switch back to full-precision inference on-demand.
Efficiency Breakthrough
ForestPrune achieves up to 90% token reduction in video MLLMs with minimal accuracy loss using a training-free spatial-temporal forest modeling approach.
Scaling Insight
Theoretical analysis reveals that the efficiency benefits of low-dimensional data structures for diffusion models diminish significantly when the data manifold is non-linear.
Paradigm Shift
This paper moves LLMs from point predictions to set-valued predictions with rigorous statistical coverage guarantees.
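The paper's mechanism is not detailed in this blurb, but set-valued prediction with rigorous marginal coverage is typically obtained via split conformal prediction; a minimal sketch under that assumption (all scores and candidates below are hypothetical):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal: finite-sample-corrected quantile of held-out
    nonconformity scores, e.g. cal_scores[i] = 1 - p_model(truth | prompt i)."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

def prediction_set(candidate_probs, threshold):
    """Keep every candidate whose nonconformity is within the threshold;
    marginally, P(true answer in set) >= 1 - alpha."""
    return [a for a, p in candidate_probs.items() if 1 - p <= threshold]

# Hypothetical calibration scores from a held-out split
cal = np.array([0.2, 0.5, 0.1, 0.4, 0.3, 0.6, 0.25, 0.35, 0.15, 0.45])
t = conformal_threshold(cal, alpha=0.2)
print(prediction_set({"Paris": 0.7, "Lyon": 0.2, "Nice": 0.05}, t))
```

The set shrinks as model confidence rises and widens under uncertainty, which is exactly the behavior point predictions cannot express.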
New Capability
WorldMesh generates consistent, large-scale 3D worlds by populating a geometric mesh scaffold with image diffusion-derived content.
Breaks Assumption
Graph Foundation Models (GFMs) are shown to fail with fixed architectural backbones, motivating inference-time architecture adaptation instead.
Scaling Insight
Access to conversational memory allows an 8B model to outperform a 235B model on user-specific queries while reducing inference costs by 96%.
Breaks Assumption
A rigorous evaluation shows that simple Probabilistic Circuits often outperform complex diffusion-based models for tabular data generation at a fraction of the cost.
Efficiency Breakthrough
Optimizing autoregressive image models with Group Relative Policy Optimization (GRPO) achieves competitive quality without the 2x inference cost of Classifier-Free Guidance.
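GRPO itself is a standard technique: it drops the learned value baseline and instead z-scores each sampled generation's reward against its own group of rollouts. A minimal sketch of that advantage computation (the paper's image-model specifics are not given here; rewards below are made up):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage: normalize each rollout's reward by the
    mean and std of all rollouts sampled for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, four sampled generations with hypothetical reward scores
adv = grpo_advantages([1.0, 0.0, 0.5, 0.5])
print(adv)
```

Above-mean samples receive positive advantage and below-mean samples negative, so the policy gradient needs no separate critic network.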
New Capability
Identifies that MLLMs fail to perceive visual illusions due to a high-frequency attention bias and provides a plug-and-play fix that boosts accuracy from 13% to 84%.
New Capability
Polaris introduces a 'Gödel Agent' framework that allows 7B-parameter models to recursively improve their own policies through auditable code patches.
Efficiency Breakthrough
DILLO enables 14x faster safety-critical agent steering by predicting action consequences from latent states instead of heavy visual simulations.
Breaks Assumption
Exposes a major flaw in medical super-resolution research where models trained on downsampled data fail to recover actual lost structures in real low-resolution scans.
Paradigm Shift
Connects stochastic optimal control to the Schrödinger equation, enabling analytic solutions for long-horizon problems that previously scaled exponentially with difficulty.
Efficiency Breakthrough
ImplicitRM enables unbiased reward modeling from 'messy' implicit feedback (clicks/copies), drastically reducing the cost of RLHF data collection.
Efficiency Breakthrough
Introduces custom CUDA kernels and a sparse packing format that enables Transformers to maintain performance with over 99% feedforward sparsity.
Paradigm Shift
Enables 3D medical image segmentation pre-training using only mathematical formulas and implicit functions, requiring zero real-world data or expert annotations.
New Capability
Develops a collaborative memory framework that distills agent-agnostic reasoning trajectories, allowing different LLM models to share a single memory system.
New Capability
Identifies functionally complete safety circuits in LLMs via differentiable binary masks, allowing for near-surgical removal of backdoors and jailbreaks.
New Capability
Uses Sparse Autoencoders (SAEs) to identify and steer cultural representations in LLMs, eliciting rare cultural concepts that prompting alone misses.
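SAE steering conventionally means adding a scaled copy of a feature's decoder direction to the model's hidden state; a toy sketch under that assumption (the decoder weights here are random stand-ins, not a trained SAE):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 64, 512

# Stand-in for a trained SAE decoder: one unit-norm direction per feature
W_dec = rng.standard_normal((n_features, d_model))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)

def steer(hidden, feature_id, alpha):
    """Nudge a hidden state along one SAE feature's decoder direction,
    amplifying whatever concept that feature encodes."""
    return hidden + alpha * W_dec[feature_id]

h = rng.standard_normal(d_model)
h_steered = steer(h, feature_id=42, alpha=5.0)
# Projection onto the steered feature direction increases by exactly alpha
print(float((h_steered - h) @ W_dec[42]))
```

Because the direction is unit-norm, the projection onto feature 42 rises by exactly alpha, which is how steering can surface concepts that prompting never activates.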
Efficiency Breakthrough
Upgrades video Diffusion Transformers to ultra-high-resolution synthesis using a two-stage 'Relay LoRA' adaptation on pure images.
Paradigm Shift
A dual-path architecture that combines speculative speech-to-speech prefixes with cascaded LLM continuations for zero-latency, high-quality dialogue.
Efficiency Breakthrough
Challenges the dominance of on-policy RL for LLMs by introducing a practical off-policy value-based framework that enables data reuse.
Paradigm Shift
A biology-native transformer architecture that mirrors cellular transcription and translation, enabling interpretable predictions across DNA, RNA, and protein.
New Capability
A unified framework that decomposes monolithic 3D meshes into 'sim-ready' interactive articulated assets using a sparse 3D VQ-VAE.
Breaks Assumption
Exposes 'shortcut learning' in differentiable simulators where models non-causally exploit future information to 'regret' past mistakes rather than learning to recover.
New Capability
A generative framework for graphs that closes the fidelity gap between energy-based models and discrete diffusion.
Paradigm Shift
Introduces a 'geospatial model foundry' that learns unified representations from the weights of existing models rather than raw data.