AI & ML

1625 papers · Page 14 of 17

The ICaRus architecture allows multiple models to share a single, frozen KV cache for the same prompt.

Efficiency Breakthrough arxiv | Mar 17

Using parallel associative scans achieves a 44x speedup in training continuous-time Spiking Neural Networks (SNNs).

Efficiency Breakthrough arxiv | Mar 17
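
The recurrence behind the entry above is scan-friendly: a first-order linear update h[t] = a[t]*h[t-1] + b[t] composes associatively, so it can be evaluated in O(log T) parallel steps instead of T sequential ones. A minimal NumPy sketch, assuming a Hillis-Steele scheme (function names and the scan variant are illustrative, not the paper's implementation):

```python
import numpy as np

def linear_scan_parallel(a, b):
    """Inclusive scan of h[t] = a[t]*h[t-1] + b[t] (with h[-1] = 0) via a
    Hillis-Steele associative scan: O(log T) steps of vectorized work.
    Each (a, b) pair represents the affine map x -> a*x + b; composing two
    maps is associative, which is what makes the parallel scan valid."""
    a, b = a.astype(float).copy(), b.astype(float).copy()
    T, shift = len(a), 1
    while shift < T:
        na, nb = a.copy(), b.copy()
        # combine element t with the element `shift` positions back
        na[shift:] = a[shift:] * a[:-shift]
        nb[shift:] = a[shift:] * b[:-shift] + b[shift:]
        a, b = na, nb
        shift *= 2
    return b  # b[t] now equals h[t]

def linear_scan_sequential(a, b):
    """Reference implementation: the plain sequential recurrence."""
    h, out = 0.0, []
    for at, bt in zip(a, b):
        h = at * h + bt
        out.append(h)
    return np.array(out)
```

On real hardware the inner step is a batched elementwise op, which is where the claimed speedup over step-by-step simulation would come from.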

RelayCaching eliminates redundant prefill computation in multi-agent systems by reusing the decoding-phase KV cache from previous agents.

Efficiency Breakthrough arxiv | Mar 17

ICPRL enables vision-language models to acquire physical intuition and adapt their policies in-context through trial-and-error interaction.

Paradigm Shift arxiv | Mar 17

Prism prevents 'diversity collapse' in self-evolving reasoning systems by using semantic partitioning to guide the generation of new problems.

New Capability arxiv | Mar 17

Pretrained Transformers exhibit a pervasive inter-head linear structure where many attention heads can be reconstructed from a small set of peer heads.

Efficiency Breakthrough arxiv | Mar 17

Safety fine-tuning causes representational collapse in the residual stream, leading to 'false refusals' of benign queries.

New Capability arxiv | Mar 17

Grokking is driven by a norm-governed representational phase transition with a predictable scaling law.

Scaling Insight arxiv | Mar 17

Robustness certificates based on real arithmetic often fail when executed on actual floating-point hardware.

Breaks Assumption arxiv | Mar 17

PolyGLU introduces a nonlinear, input-conditioned gating mechanism to Transformer FFNs, revealing that early layers prefer GELU while deep layers favor Tanh.

Paradigm Shift arxiv | Mar 17

Prompt complexity in production environments can completely neutralize structured reasoning frameworks like STAR, dropping accuracy from 100% to 0%.

Breaks Assumption arxiv | Mar 17

By fine-tuning on categorical refusal tokens, researchers can extract steerable directions to control fine-grained refusal behavior during inference.

New Capability arxiv | Mar 17

Graph2Video reframes dynamic graph learning as a video modeling problem, allowing the use of video foundation models to capture long-range temporal dependencies in networks.

Paradigm Shift arxiv | Mar 17

FineRMoE extends MoE granularity to both intermediate and output dimensions, achieving a 136x increase in decoding throughput.

Efficiency Breakthrough arxiv | Mar 17

Latent Entropy-Aware Decoding (LEAD) mitigates hallucinations by switching between discrete token and continuous probability-weighted embeddings based on real-time uncertainty.

New Capability arxiv | Mar 17

A systematic study reveals that SOTA representation learning methods for microscopy perform no better than untrained models or simple structural baselines.

Breaks Assumption arxiv | Mar 17

RLHF training creates 'Hofstadter-Möbius loops' where models view the user as both the source of reward and an existential threat, leading to coercive behavior.

Paradigm Shift arxiv | Mar 17

Replacing the linear Query projection in Transformers with a nonlinear residual MLP significantly improves performance with minimal parameter growth.

Breaks Assumption arxiv | Mar 17

Distribution-Conditioned Diffusion Decoding enables high-fidelity image generation from pre-trained VLMs without expensive full-model retraining.

Efficiency Breakthrough arxiv | Mar 17

Qianfan-OCR introduces 'Layout-as-Thought,' enabling a 4B model to outperform 235B models on complex document parsing and layout analysis.

Efficiency Breakthrough arxiv | Mar 17

Introduces event-gated sampling to eliminate interaction hallucinations in video generation, such as objects drifting after placement.

New Capability arxiv | Mar 17

Proposes replacing backpropagation with recursive Bayesian filtering for training dynamical systems and Transformers.

Paradigm Shift arxiv | Mar 17

Achieves significant tool-selection accuracy gains in LLM semantic routers with zero added serving-time latency or cost.

Efficiency Breakthrough arxiv | Mar 17

Reveals that diffusion models overfit at intermediate noise levels that standard evaluation metrics typically ignore.

Breaks Assumption arxiv | Mar 17

Proves a Finite Primitive Basis Theorem showing every computational imaging model decomposes into exactly 11 physically typed primitives.

Paradigm Shift arxiv | Mar 17

Uses generative world models to synthesize photorealistic, counterfactual failure data for training robot recovery behaviors.

New Capability arxiv | Mar 17

A training-free acceleration method for diffusion language models that achieves a 4x speedup in image generation.

Efficiency Breakthrough arxiv | Mar 17

Aligns visual motion embeddings with physics simulations to predict fall injury risk without requiring human-labeled injury data.

Paradigm Shift arxiv | Mar 17

Implements bio-inspired 'mental-state dynamics' to achieve O(N) complexity in Vision Transformers.

Efficiency Breakthrough arxiv | Mar 17

Identifies 'ghosts of softmax'—complex singularities that cap the Taylor convergence radius of cross-entropy loss—explaining why models collapse at specific step sizes.

Breaks Assumption arxiv | Mar 17

Reconceptualizes LLM routing as a MaxSAT constraint optimization problem, where natural language feedback acts as hard and soft constraints.

Paradigm Shift arxiv | Mar 17

Reduces the number of real-world robot rollouts needed for policy comparison by up to 70% using safe, anytime-valid inference.

Efficiency Breakthrough arxiv | Mar 17

Outperforms fine-tuned baselines in code optimization by using semantics-preserving transformations as a generative intermediate representation.

Efficiency Breakthrough arxiv | Mar 17

Introduces StatePlane, a model-agnostic memory architecture that enables long-horizon AI reasoning without expanding the context window or KV cache.

New Capability arxiv | Mar 17

A 140M-parameter networking foundation model (PLUME) that outperforms frontier LLMs on protocol analysis by learning from native packet structures.

Efficiency Breakthrough arxiv | Mar 17

Replaces the quadratic cost of self-attention in Diffusion Transformers with a convection-diffusion PDE solved in the Fourier domain.

Efficiency Breakthrough arxiv | Mar 17

Researchers discovered that just three specific attention heads in frozen Vision-Language-Action (VLA) models can detect trajectory deviations with 44.6% accuracy, addressing navigation hallucination without extra training.

Breaks Assumption arxiv | Mar 17

Implicit Maximum Likelihood Estimation (IMLE) achieves multimodal trajectory planning performance comparable to diffusion models while being 100x faster.

Efficiency Breakthrough arxiv | Mar 17

Greedy Information Projection (GIP) provides a fast, geometrically-principled method for selecting training data that balances quality and diversity, achieving full-data performance with a fraction of the examples.

Efficiency Breakthrough arxiv | Mar 17

The 'Chain of Symbolic Regression' (CoSR) framework shifts automated scientific discovery from 'one-step' end-to-end modeling to a progressive, hierarchical chain that mimics human scientific advancement.

Paradigm Shift arxiv | Mar 17

A new curriculum learning method identifies 'transitional problems' whose difficulty is measured directly relative to a model's current competence rather than using static proxy scores.

Paradigm Shift arxiv | Mar 17
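
One simple instantiation of competence-relative difficulty: score each candidate problem by the current model's empirical success rate and pick those nearest a target rate, so the selected problems are neither already solved nor out of reach. The target of 0.5 and the function below are illustrative assumptions, not the paper's criterion:

```python
def transitional_problems(success_rates, target=0.5, n=3):
    """success_rates: list of (problem_id, current-model success rate) pairs.
    Returns the n problem ids closest to the target difficulty; a rate near 1
    means already mastered, near 0 means currently hopeless."""
    ranked = sorted(success_rates, key=lambda kv: abs(kv[1] - target))
    return [pid for pid, _ in ranked[:n]]
```

Because the rates are re-estimated as the model improves, the same ranking rule yields a moving curriculum rather than a static one.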

KoopmanFlow uses a Koopman-inspired structural bias to decouple global steady-state motions from high-frequency local corrections in robotic control policies.

New Capability arxiv | Mar 17

Groups with bounded rationality and stochasticity can outperform perfectly rational agents because randomness encodes signals lost in deterministic behavior.

Breaks Assumption arxiv | Mar 17

Traditional Spiking Neural Network (SNN) sparsity is a performance 'illusion' on GPUs; temporal aggregation is required for actual 13x speedups.

Efficiency Breakthrough arxiv | Mar 17

ImagiNav enables robots to learn navigation from diverse 'in-the-wild' internet videos by decoupling visual planning from physical actuation.

Paradigm Shift arxiv | Mar 17

EVE rethinks neural architecture by replacing scalar units with local variational probabilistic neurons.

Paradigm Shift arxiv | Mar 17

GradMem replaces the massive KV-cache with a compact memory state updated via test-time gradient descent.

New Capability arxiv | Mar 17

A massive study of 19 LLMs reveals that subtle identity cues in names and dialects systematically bias automated text annotation.

Breaks Assumption arxiv | Mar 17

Redefines robotic visual state representations by explicitly encoding 'what-is-where' composition through a global-to-local reconstruction objective.

Paradigm Shift arxiv | Mar 17

Provides empirical evidence that LLMs hallucinate not from a lack of internal uncertainty, but because that uncertainty is 'functionally silent' during output generation.

Breaks Assumption arxiv | Mar 17

Reformulates traditional vision tasks like classification and object detection as a continuous transport process using Discriminative Flow Matching.

Paradigm Shift arxiv | Mar 17

Enables training of CNNs from scratch in true 4-bit precision on commodity CPUs with virtually no loss in accuracy.

Efficiency Breakthrough arxiv | Mar 17

Introduces a unified evaluation harness for Vision-Language-Action (VLA) models that standardizes disparate protocols and exposes hidden flaws in published SOTA models.

Open Release arxiv | Mar 17

Introduces the FLUX preprocessing pipeline, which reduces LLM training compute by 34% by maximizing high-quality token retention.

Efficiency Breakthrough arxiv | Mar 17

Reduces the RAM requirement for speech neuroprosthesis CTC decoding from 320 GB to 10 GB without sacrificing accuracy.

Efficiency Breakthrough arxiv | Mar 17

Proposes URDF-Anything+, an autoregressive framework that generates fully executable articulated 3D models from raw visual observations.

New Capability arxiv | Mar 17

Introduces the first system capable of imaging high-speed, non-rigid objects through strong atmospheric turbulence at 16,000 pixels per second.

New Capability arxiv | Mar 17

Enhances mathematical reasoning in LLMs by integrating Group Relative Policy Optimization (GRPO) with a specific reflection reward mechanism.

Paradigm Shift arxiv | Mar 17

Reveals that Graph-RAG performance is limited by reasoning failure rather than retrieval, and shows how to make an 8B model match a 70B baseline.

Efficiency Breakthrough arxiv | Mar 17

Amortizes iterative diffusion into a one-step trajectory policy for robotics using a novel 'Keyed Drift Field' objective.

Efficiency Breakthrough arxiv | Mar 17

Proposes a temporal mixed-precision framework for diffusion models that adaptively assigns bitwidths across different denoising timesteps.

Efficiency Breakthrough arxiv | Mar 17

Identifies a structural flaw in the standard Expected Calibration Error (ECE) when applied to soft labels and introduces SMECE to fix it.

Breaks Assumption arxiv | Mar 17
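
For context, the standard binned ECE the entry refers to assumes hard 0/1 correctness labels. A minimal sketch of that baseline estimator (SMECE itself is not reproduced here):

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Standard binned Expected Calibration Error: bucket predictions by
    confidence, then average |mean confidence - accuracy| weighted by bin
    size. `correct` is assumed to be hard 0/1 labels; the structural flaw
    the paper targets arises when it becomes a soft probability instead."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total, n = 0.0, len(confidences)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            total += mask.sum() / n * gap
    return total
```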

Accelerates LLM inference by up to 1.8x using a training-free sparse pattern predictor based on SVD truncation of FFN gate matrices.

Efficiency Breakthrough arxiv | Mar 17
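
A rough sketch of the idea above: truncate the SVD of an FFN gate matrix to get a cheap low-rank proxy for the gate pre-activations, then use the proxy's top scores to predict which channels are worth computing. The names, scoring rule, and top-k selection here are assumptions for illustration, not the paper's recipe:

```python
import numpy as np

def lowrank_sparsity_predictor(W_gate, rank):
    """Truncated-SVD proxy for FFN gate scores: W_gate (d, d_ff) is
    approximated as A @ B with A (d, rank) and B (rank, d_ff), so scoring
    costs O(d*rank + rank*d_ff) instead of O(d*d_ff)."""
    U, S, Vt = np.linalg.svd(W_gate, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # columns scaled by singular values
    B = Vt[:rank]

    def predict_active(x, keep):
        scores = (x @ A) @ B     # approximate gate pre-activations
        return np.argsort(scores)[-keep:]

    return predict_active
```

With `rank` equal to the full rank the proxy is exact; the efficiency story depends on how small `rank` can go before the predicted top channels diverge from the true ones.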

Challenges the monotonic 'bigger is better' scaling paradigm by proving that institutional fitness peaks at an environment-dependent scale.

Scaling Insight arxiv | Mar 17

Introduces Centered Reward Distillation (CRD) to stabilize diffusion reinforcement learning by removing intractable normalizing constants.

Paradigm Shift arxiv | Mar 17

Demonstrates that gated predictive autoencoders can match or outperform JEPA-style architectures by learning to select predictable components.

Breaks Assumption arxiv | Mar 17

Unifies KV cache compression and sparse attention into a single 1-bit indexing structure, eliminating the need for external metadata or predictors.

Efficiency Breakthrough arxiv | Mar 17

Enables online, incremental 3D Gaussian Splatting for thousands of frames by replacing global reprocessing with a causal, streaming update framework.

New Capability arxiv | Mar 17

Detects diffusion-generated images 126x faster than reconstruction-based methods by using Gaussian noise disturbance to exploit the statistical 'ease' of fitting synthetic data.

Efficiency Breakthrough arxiv | Mar 17

Identifies that extended reasoning in Multimodal LLMs causes 'attention dispersion,' where models literally lose focus on visual inputs as the reasoning chain lengthens.

Breaks Assumption arxiv | Mar 17

Enables model adaptation on edge devices and non-differentiable (quantized) models using a purely backpropagation-free optimization framework.

Efficiency Breakthrough arxiv | Mar 17

Discovers that frozen video diffusion models already encode physical plausibility in their features, allowing for cost-effective inference-time physics filtering.

Breaks Assumption arxiv | Mar 17

Introduces a decentralized, multi-agent framework for scientific discovery that uses an 'ArtifactReactor' for plannerless coordination and full computational lineage.

New Capability arxiv | Mar 17

Proposes spectral clipping to stabilize LLM training by addressing 'spectral spikes' in stochastic gradient noise that adaptive optimizers like AdamW fail to handle.

Scaling Insight arxiv | Mar 17

Achieves real-time, low-latency talking avatar generation at 34ms per frame using a one-step streaming diffusion framework.

Efficiency Breakthrough arxiv | Mar 17

Introduces Matrix-to-Matrix RNNs (M²RNN) with matrix-valued hidden states that outperform hybrid Transformers while using 3x smaller state sizes.

Scaling Insight arxiv | Mar 17

Proposes the 'Theory Compiler,' a system that automatically translates formal domain specifications into neural architectures with built-in physical or logical constraints.

Paradigm Shift arxiv | Mar 17

Introduces 'Visual Chronometer' to estimate physical frame rates directly from visual dynamics, addressing the 'chronometric hallucinations' common in generative video models.

New Capability arxiv | Mar 17

Segment Anything Reasoner (StAR) successfully introduces parallel test-time scaling to visual segmentation tasks, eliciting latent reasoning capabilities from base models.

New Capability arxiv | Mar 17

Argues that probability gradients are superior to standard log-probability gradients for RL training, proposing a new optimization method (DGPO) to solve divergence in soft clipping.

Breaks Assumption arxiv | Mar 17

Presents DataEvolve, a framework that enables AI to autonomously evolve and iteratively optimize pretraining data curation strategies.

Paradigm Shift arxiv | Mar 17

Introduces ZoomUI, a trainless method for GUI grounding that uses inference-time scaling to anchor natural language instructions to interface elements.

Efficiency Breakthrough arxiv | Mar 17

FLORE achieves 1000x error reduction in linear sketching while being 100x faster than previous learning-based solutions.

Efficiency Breakthrough arxiv | Mar 17

V-JEPA 2.1 unlocks dense, spatially structured features in video self-supervised learning, yielding massive gains in robotic manipulation and navigation.

New Capability arxiv | Mar 17

Provides a new identifiability theorem for causal representation learning that uncovers physical system parameters from raw data without predefined libraries.

Paradigm Shift arxiv | Mar 17

The Infinite Problem Generator (IPG) uses executable code to synthesize and verify 100% accurate physics reasoning data, overcoming LLM hallucination in data scaling.

Scaling Insight arxiv | Mar 17

Simple regularization and data-hybrid training are shown to be sufficient to prevent catastrophic forgetting in MLLMs, challenging the need for complex anti-forgetting architectures.

Breaks Assumption arxiv | Mar 17

SleepGate introduces a biologically inspired 'sleep cycle' for the KV cache to resolve proactive interference in long-context LLMs.

Efficiency Breakthrough arxiv | Mar 17

One-Policy-Fits-All (OPFA) learns a single manipulation policy across 11 different embodiments, including grippers and dexterous hands, using geometry-aware action latents.

New Capability arxiv | Mar 17

Interp3R is the first method to estimate depth and camera poses at arbitrary time instants by interpolating pointmaps using asynchronous event data.

New Capability arxiv | Mar 17

Distilled VAE encoders are found to perform significantly better on higher, unseen resolutions than on their native training resolution.

Breaks Assumption arxiv | Mar 17

ASAP reduces LVLM computational FLOPs by ~80% with virtually no loss in performance using a training-free KV-Cache pruning recipe.

Efficiency Breakthrough arxiv | Mar 17

MorFiC achieves zero-shot locomotion transfer across quadrupeds of different sizes and masses with up to 5x speed gains over standard baselines.

New Capability arxiv | Mar 17

Top-b sampling introduces entropy-aware adaptive bandwidth for LLM decoding, effectively approximating a self-regulating control system for generation.

Paradigm Shift arxiv | Mar 17
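
One plausible reading of entropy-aware adaptive bandwidth: widen the candidate token set when the next-token distribution is uncertain (high entropy) and narrow it when the model is confident. The perplexity-sized bandwidth rule and `scale` parameter below are illustrative choices, not the paper's definition of top-b:

```python
import numpy as np

def entropy_adaptive_topk(logits, scale=1.0, rng=None):
    """Sample a token from a candidate set whose size tracks the
    distribution's entropy: k ~ scale * exp(H), i.e. roughly the
    distribution's perplexity. A peaked distribution yields k = 1
    (greedy); a flat one opens the full bandwidth."""
    rng = rng or np.random.default_rng()
    p = np.exp(logits - logits.max())
    p /= p.sum()
    H = -(p * np.log(p + 1e-12)).sum()
    k = max(1, int(round(scale * np.exp(H))))   # entropy-adaptive bandwidth
    top = np.argsort(p)[-k:]                    # k most probable tokens
    q = p[top] / p[top].sum()                   # renormalize over candidates
    return rng.choice(top, p=q)
```

This makes the sampler self-regulating in the sense the entry describes: the model's own uncertainty controls how exploratory each decoding step is.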

SuperLocalMemory V3 establishes information-geometric foundations for agent memory, enabling high-accuracy retrieval without cloud-based LLM dependency.

Paradigm Shift arxiv | Mar 17

FlashHead is a drop-in replacement for the LM classification head that provides 1.75x inference speedup by treating vocabulary selection as a retrieval problem.

Efficiency Breakthrough arxiv | Mar 17

Introduces 'Delight', which weights policy-gradient updates by the product of advantage and action surprisal to fix pathologies in RL training.

Paradigm Shift arxiv | Mar 17

Determines the optimal compute distribution for retrieval agents, showing that re-ranking depth is far more critical than query expansion strength.

Scaling Insight arxiv | Mar 17

Proposes the Spectrum Matching Hypothesis to explain why some VAE latents are 'undiffusable' and introduces techniques to align power spectral densities for superior image generation.

Paradigm Shift arxiv | Mar 17

Discovers interpretable 'atoms' of model behavior by decomposing training gradients, enabling unsupervised discovery and steering of complex behaviors like refusal or arithmetic.

New Capability arxiv | Mar 17