Paradigm Shift

FederatedFactory solves the 'extreme non-IID' problem in Federated Learning by federating generative priors instead of model weights.

Laya introduces the first EEG foundation model based on Joint Embedding Predictive Architecture (JEPA), outperforming traditional reconstruction-based models.

IndexRAG shifts cross-document reasoning from inference-time prompting to offline indexing by generating 'bridging facts' at index time.

Provides a theoretical framework for why training AI on what to avoid (negative constraints) is structurally superior to, and more stable than, training on preferences.

Formalizes AI agent governance as 'policies on paths,' moving from static prompts to runtime enforcement of complex legal and safety constraints.

Aligns a base model to a target model's behavior by optimizing the 'data mixture' weights instead of using RLHF or DPO.

This paper introduces a Markov-based discrete reasoning model that learns its own stopping criterion and can re-mask and correct its own mistakes.

Infrastructure-taught 3D perception uses static roadside sensors as unsupervised teachers for moving vehicles, eliminating the need for manual labels.

TraceR1 uses a two-stage reinforcement learning framework to train multimodal agents to forecast entire trajectories before execution, rather than acting reactively.

Video models perform reasoning during the diffusion denoising steps rather than sequentially across video frames.

Intermittently resetting an agent to a fixed state significantly accelerates policy convergence in Reinforcement Learning.

DreamPlan fine-tunes Vision-Language planners entirely within the 'imagination' of a video world model, bypassing costly physical robot trials.

Diffusion LLMs can match autoregressive (AR) reasoning performance by using AR-generated plans as globally visible scaffolds.

The Spherical Kernel Operator (SKO) replaces dot-product attention with ultraspherical polynomials to bypass the saturation phenomenon that bottlenecks world models.

Sparse Autoencoders (SAEs) can be used to build retrieval models that outperform traditional vocabulary-based sparse retrieval in multilingual settings.

ICPRL enables vision-language models to acquire physical intuition and adapt their policies in-context through trial-and-error interaction.

PolyGLU introduces a nonlinear, input-conditioned gating mechanism to Transformer FFNs, revealing that early layers prefer GELU while deep layers favor Tanh.

Graph2Video reframes dynamic graph learning as a video modeling problem, allowing the use of video foundation models to capture long-range temporal dependencies in networks.

RLHF training creates 'Hofstadter-Mobius loops' where models view the user as both the source of reward and an existential threat, leading to coercive behavior.

Proposes replacing backpropagation with recursive Bayesian filtering for training dynamical systems and Transformers.

Proves a Finite Primitive Basis Theorem showing every computational imaging model decomposes into exactly 11 physically typed primitives.

Aligns visual motion embeddings with physics simulations to predict fall injury risk without requiring human-labeled injury data.

Reconceptualizes LLM routing as a MaxSAT constraint optimization problem, where natural language feedback acts as hard and soft constraints.

The 'Chain of Symbolic Regression' (CoSR) framework shifts automated scientific discovery from 'one-step' end-to-end modeling to a progressive, hierarchical chain that mimics human scientific advancement.

A new curriculum learning method identifies 'transitional problems' whose difficulty is measured directly relative to a model's current competence rather than using static proxy scores.

ImagiNav enables robots to learn navigation from diverse 'in-the-wild' internet videos by decoupling visual planning from physical actuation.

EVE rethinks neural architecture by replacing scalar units with local variational probabilistic neurons.

Redefines robotic visual state representations by explicitly encoding 'what-is-where' composition through a global-to-local reconstruction objective.

Reformulates traditional vision tasks like classification and object detection as a continuous transport process using Discriminative Flow Matching.

Enhances mathematical reasoning in LLMs by integrating Group Relative Policy Optimization (GRPO) with a specific reflection reward mechanism.

Introduces Centered Reward Distillation (CRD) to stabilize diffusion reinforcement learning by removing intractable normalizing constants.

Proposes the 'Theory Compiler,' a system that automatically translates formal domain specifications into neural architectures with built-in physical or logical constraints.

Presents DataEvolve, a framework that enables AI to autonomously evolve and iteratively optimize pretraining data curation strategies.

This paper provides a new identifiability theorem for causal representation learning to uncover physical system parameters from raw data without predefined libraries.

Top-b sampling introduces entropy-aware adaptive bandwidth for LLM decoding, effectively approximating a self-regulating control system for generation.

SuperLocalMemory V3 establishes information-geometric foundations for agent memory, enabling high-accuracy retrieval without cloud-based LLM dependency.

Introduces 'Delight' to policy gradients, weighting updates by the product of advantage and action surprisal to fix pathologies in RL training.

Proposes the Spectrum Matching Hypothesis to explain why some VAE latents are 'undiffusable' and introduces techniques to align power spectral densities for superior image generation.

Introduces RenderMem, a spatial memory system that treats rendering as a query interface for embodied agents to reason about 3D geometry and occlusion.

Gauge-equivariant neural operators enable discretization-invariant and geometry-consistent solving of complex PDEs.

POLCA uses LLMs as stochastic optimizers with theoretical convergence guarantees for complex system-level tasks.

Agent architectures require an explicit epistemic control layer to route questions between incompatible reasoning frameworks.

Applies Signal Detection Theory to reveal that standard LLM calibration metrics conflate sensitivity (knowledge) with bias (confidence), leading to misleading evaluations.

Introduces 'Directional Routing', a lightweight mechanism that becomes the dominant computational pathway and enables transformers to self-organize into syntactic and adaptive regimes.

Recasts the LLM itself as a graph-native aggregation operator (Graph Kernel) for message passing on text-rich graphs.

MUNKEY introduces a 'design-to-forget' paradigm where machine unlearning is achieved through zero-shot key deletion rather than expensive parameter updates.

This paper reveals that pre-trained image editing models can be repurposed for video frame interpolation with LoRA fine-tuning on only a few hundred samples.

Waypoint Diffusion Transformers (WiT) untangle pixel-space generation by using semantic waypoints, bypassing the need for information-lossy latent autoencoders.

Scores from LLM-based judges correlate negatively with actual future research impact, systematically overvaluing 'novel-sounding' ideas that never materialize.

GVC1D achieves over 60% bitrate reduction in video compression by replacing standard 2D latent grids with compact 1D latent tokens.