AI & Machine Learning

2,557 papers · Page 47 of 52

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Filter by category: Paradigm Challenge Breaks Assumption First Ever Nature Is Weird Practical Magic Cosmic Scale Life Origin Open Release Efficiency Leap New Capability Scaling Insight

Breaks Assumption

Identifies that extended reasoning in Multimodal LLMs causes 'attention dispersion,' where models literally lose focus on visual inputs as the reasoning chain lengthens.

Efficiency Breakthrough

Enables model adaptation on edge devices and non-differentiable (quantized) models using a purely backpropagation-free optimization framework.

Breaks Assumption

Discovers that frozen video diffusion models already encode physical plausibility in their features, allowing for cost-effective inference-time physics filtering.

Introduces a decentralized, multi-agent framework for scientific discovery that uses an 'ArtifactReactor' for plannerless coordination and full computational lineage.

Scaling Insight

Proposes spectral clipping to stabilize LLM training by addressing 'spectral spikes' in stochastic gradient noise that adaptive optimizers like AdamW fail to handle.

Efficiency Breakthrough

Achieves real-time, low-latency talking avatar generation at 34ms per frame using a one-step streaming diffusion framework.

Scaling Insight

Introduces Matrix-to-Matrix RNNs (M$^2$RNN) with matrix-valued hidden states that outperform hybrid Transformers while using 3x smaller state sizes.

Proposes the 'Theory Compiler,' a system that automatically translates formal domain specifications into neural architectures with built-in physical or logical constraints.

Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.

Segment Anything Reasoner (StAR) successfully introduces parallel test-time scaling to visual segmentation tasks, eliciting latent reasoning capabilities from base models.

Breaks Assumption

Argues that probability gradients are superior to standard log-probability gradients for RL training, proposing a new optimization method (DGPO) to solve divergence in soft clipping.

Presents DataEvolve, a framework that enables AI to autonomously evolve and iteratively optimize pretraining data curation strategies.

Efficiency Breakthrough

Introduces ZoomUI, a trainless method for GUI grounding that uses inference-time scaling to anchor natural language instructions to interface elements.

Efficiency Breakthrough

FLORE achieves 1000x error reduction in linear sketching while being 100x faster than previous learning-based solutions.

V-JEPA 2.1 unlocks dense, spatially structured features in video self-supervised learning, yielding massive gains in robotic manipulation and navigation.

This paper provides a new identifiability theorem for causal representation learning to uncover physical system parameters from raw data without predefined libraries.

Scaling Insight

The Infinite Problem Generator (IPG) uses executable code to synthesize and verify 100% accurate physics reasoning data, overcoming LLM hallucination in data scaling.

Breaks Assumption

Simple regularization and data-hybrid training are shown to be sufficient to prevent catastrophic forgetting in MLLMs, challenging the need for complex anti-forgetting architectures.

Efficiency Breakthrough

SleepGate introduces a biologically inspired 'sleep cycle' for the KV cache to resolve proactive interference in long-context LLMs.

One-Policy-Fits-All (OPFA) learns a single manipulation policy across 11 different embodiments, including grippers and dexterous hands, using geometry-aware action latents.

Interp3R is the first method to estimate depth and camera poses at arbitrary time instants by interpolating pointmaps using asynchronous event data.

Breaks Assumption

Distilled VAE encoders are found to perform significantly better on higher, unseen resolutions than on their native training resolution.

Efficiency Breakthrough

ASAP reduces LVLM computational FLOPs by ~80% with virtually no loss in performance using a training-free KV-Cache pruning recipe.

MorFiC achieves zero-shot locomotion transfer across quadrupeds of different sizes and masses with up to 5x speed gains over standard baselines.

Top-b sampling introduces entropy-aware adaptive bandwidth for LLM decoding, effectively approximating a self-regulating control system for generation.

SuperLocalMemory V3 establishes information-geometric foundations for agent memory, enabling high-accuracy retrieval without cloud-based LLM dependency.

Efficiency Breakthrough

FlashHead is a drop-in replacement for the LM classification head that provides 1.75x inference speedup by treating vocabulary selection as a retrieval problem.

Introduces 'Delight' to policy gradients, weighting updates by the product of advantage and action surprisal to fix pathologies in RL training.

Scaling Insight

Determines the optimal compute distribution for retrieval agents, showing that re-ranking depth is far more critical than query expansion strength.

Proposes the Spectrum Matching Hypothesis to explain why some VAE latents are 'undiffusable' and introduces techniques to align power spectral densities for superior image generation.

Discovers interpretable 'atoms' of model behavior by decomposing training gradients, enabling unsupervised discovery and steering of complex behaviors like refusal or arithmetic.

Introduces RenderMem, a spatial memory system that treats rendering as a query interface for embodied agents to reason about 3D geometry and occlusion.

Breaks Assumption

Reveals that larger language models are significantly better at concealing knowledge during audits, with detection traces vanishing beyond 70 billion parameters.

Achieves pose-free 3D Gaussian Splatting using only event streams, enabling reconstruction in extreme lighting and high-speed motion scenarios.

Efficiency Breakthrough

Reformulates diffusion sampling as a graph-theoretic planning problem that dynamically allocates compute to the most difficult denoising stages.

Breaks Assumption

Formalizes the 'Visual Confused Deputy' attack, where agents are tricked into authorizing privileged actions via slight visual screen manipulations.

Efficiency Breakthrough

Generates novel, structurally plausible protein sequences from small alignments using a training-free stochastic attention mechanism on a standard laptop.

Breaks Assumption

Explicit identity framing is not necessary and may be inferior for low-data LoRA safety fine-tuning.

Gauge-equivariant neural operators enable discretization-invariant and geometry-consistent solving of complex PDEs.

Efficiency Breakthrough

Adaptive computation for multimodal LLMs drastically reduces compute waste on easy cases while focusing on hard ones.

Breaks Assumption

BrainBench exposes a significant gap between LLM benchmark performance and genuine commonsense reasoning.

A training-free operator for streaming 3D reconstruction reduces geometric drift using Grassmannian manifolds.

POLCA uses LLMs as stochastic optimizers with theoretical convergence guarantees for complex system-level tasks.

DynaAvatar achieves zero-shot 3D human reconstruction from a single image with motion-dependent cloth dynamics.

Efficiency Breakthrough

HO-SFL enables backprop-free fine-tuning on edge devices without the convergence penalty typical of zeroth-order methods.

Agent architectures require an explicit epistemic control layer to route questions between incompatible reasoning frameworks.

Efficiency Breakthrough

RAZOR provides a lightweight, targeted unlearning framework for Transformers and Diffusion models without retraining.

Breaks Assumption

Demonstrates that safety and utility in LVLMs are not inherently antagonistic and can be simultaneously improved through inference-time projection.

Scaling Insight

Provides the first theoretical proof that dataset distillation efficiently encodes the low-dimensional structure of non-linear tasks.

Breaks Assumption

Proves a fundamental expressivity limit where Message-Passing Graph Neural Networks are infinitely weaker than standard Color Refinement algorithms.