AI & ML

1625 papers · Page 2 of 17

Reconstructs authentic LiDAR point clouds under jamming attacks with a 92% success rate by exploiting raw full-waveform representations.

New Capability arxiv | Apr 2

Identifies a fundamental quality-exploration dilemma in Diffusion Language Models where remasking improves single-sample quality but collapses reasoning diversity.

Paradigm Shift arxiv | Apr 2

Gradient-based data valuation (TracIn) outperforms all human-crafted metadata heuristics for ordering curriculum learning in motion planners.

Scaling Insight arxiv | Apr 2
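The TracIn score behind this result can be sketched in a few lines: the influence of a training example on a test example is the sum, over saved checkpoints, of the learning rate times the dot product of their loss gradients. The linear-regression setup below is illustrative only, not the paper's motion-planning model.

```python
import numpy as np

def grad(w, x, y):
    # gradient of 0.5 * (w.x - y)^2 with respect to w
    return (w @ x - y) * x

def tracin_score(checkpoints, lr, z_train, z_test):
    # TracIn: sum over checkpoints of lr * <grad(train), grad(test)>
    xt, yt = z_train
    xs, ys = z_test
    return sum(lr * grad(w, xt, yt) @ grad(w, xs, ys) for w in checkpoints)

# tiny SGD run that saves a checkpoint per epoch
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w, lr, ckpts = np.zeros(3), 0.05, []
for _ in range(20):
    for xi, yi in zip(X, y):
        w -= lr * grad(w, xi, yi)
    ckpts.append(w.copy())

score = tracin_score(ckpts, lr, (X[0], y[0]), (X[1], y[1]))
```

A positive score marks a training example that pushed the test loss down across training, which is the signal used to order the curriculum.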

Introduces training-free and model-free trajectory planning by computing diffusion score functions directly from data libraries via kernel-weighted estimation.

Paradigm Shift arxiv | Apr 2
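The core trick can be sketched without any trained model: the score of a Gaussian kernel density estimate over the data library is a kernel-weighted average of directions toward library points, grad log p_sigma(x) = sum_i w_i(x) (x_i - x) / sigma^2. The function and data below are illustrative stand-ins for the paper's planner.

```python
import numpy as np

def kde_score(x, library, sigma):
    # score of a Gaussian KDE: softmax kernel weights times (x_i - x) / sigma^2
    d2 = np.sum((library - x) ** 2, axis=1)            # squared distances
    w = np.exp(-(d2 - d2.min()) / (2 * sigma ** 2))    # numerically stable weights
    w /= w.sum()
    return (w[:, None] * (library - x)).sum(axis=0) / sigma ** 2

rng = np.random.default_rng(0)
lib = rng.normal(size=(200, 2)) + np.array([3.0, 0.0])  # data cluster near (3, 0)
x = np.zeros(2)
s = kde_score(x, lib, sigma=1.0)   # should point from the origin toward the cluster
```

Plugging such an estimate into a reverse-diffusion sampler is what makes the method training-free.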

Foundational deep networks consistently assign higher density to simpler images, regardless of training data or architecture complexity.

Breaks Assumption arxiv | Apr 2

A cross-graph tuning-free prompting framework for GNNs that achieves massive gains on unseen graphs without retraining.

Efficiency Breakthrough arxiv | Apr 2

Proposes a decision-centric architecture that separates signal estimation from control policy to make LLM system decisions explicit and inspectable.

Paradigm Shift arxiv | Apr 2

Enables zero-shot humanoid navigation in unseen environments using only 5 hours of human walking data and no robot-specific data.

New Capability arxiv | Apr 2

A white-box membership inference attack using 'gradient-induced feature drift' that outperforms all existing confidence-based methods.

New Capability arxiv | Apr 2

Self-Routing removes the need for learned routers in Mixture-of-Experts (MoE) by using hidden states directly for expert assignment.

Efficiency Breakthrough arxiv | Apr 2
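A minimal sketch of router-free expert assignment: instead of a trained router network, each token is sent to the experts whose fixed prototype vectors are most similar to the token's hidden state. The prototypes here are random placeholders; the paper's actual assignment rule may differ.

```python
import numpy as np

def self_route(hidden, prototypes, top_k=2):
    # similarity of each token's hidden state to each expert prototype
    sims = hidden @ prototypes.T                    # (tokens, experts)
    return np.argsort(-sims, axis=1)[:, :top_k]     # top-k expert ids per token

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 16))           # 4 tokens, hidden dimension 16
P = rng.normal(size=(8, 16))           # 8 experts
assignments = self_route(H, P)          # shape (4, 2)
```

Because no router parameters are learned, there is nothing extra to train or load-balance against.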

Improves Qwen2.5-7B performance on AIME2024 by 137% through test-time iterative rethinking and majority-voted pseudo-labels.

Efficiency Breakthrough arxiv | Apr 2
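The majority-voted pseudo-label step can be sketched simply: sample several answers, keep the most common one as a pseudo-label, and use its agreement rate as a confidence signal for the next rethinking round. `sample_answer` is a hypothetical stand-in for the model call.

```python
import random
from collections import Counter

def majority_pseudo_label(sample_answer, n=8):
    # sample n candidate answers and keep the most common as a pseudo-label
    votes = Counter(sample_answer() for _ in range(n))
    answer, count = votes.most_common(1)[0]
    return answer, count / n            # pseudo-label and its agreement rate

random.seed(0)
ans, conf = majority_pseudo_label(lambda: random.choice(["42", "42", "41"]))
```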

Automates mathematical optimization modeling using reinforcement learning with solver-derived rewards instead of human process supervision.

Efficiency Breakthrough arxiv | Apr 2

Reveals that many 'polysemantic' neurons in LLMs are actually firing for shared word forms (lexical) rather than compressed semantic concepts.

Breaks Assumption arxiv | Apr 2

Truth Anchoring (TAC) provides a post-hoc calibration method to align LLM uncertainty metrics with actual factual correctness.

Paradigm Shift arxiv | Apr 2

Demonstrates that LLM judge panels follow power-law discovery curves, where panel size and persona diversity are critical for uncovering edge-case failures.

Scaling Insight arxiv | Apr 2

Identifies 'diversity collapse' in the popular GRPO reinforcement learning method and introduces MUPO to maintain broad reasoning paths.

Paradigm Shift arxiv | Apr 2

Introduces the first auto-regressive framework for Gaussian Splatting, enabling parallel, progressive next-scale 3D generation.

New Capability arxiv | Apr 2

Optimizes LLM inference scheduling by treating output length as a heavy-tailed distribution rather than a point estimate.

Efficiency Breakthrough arxiv | Apr 2
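The scheduling idea can be illustrated by ordering requests on a tail quantile of the predicted length distribution instead of a point estimate, so one heavy-tailed request cannot stall a batch. The log-normal length model below is an assumption for illustration, not the paper's predictor.

```python
import random

def p90(samples):
    # 90th-percentile estimate from sampled length predictions
    s = sorted(samples)
    return s[int(0.9 * (len(s) - 1))]

def schedule(requests):
    # requests: list of (request_id, predicted_length_samples);
    # shortest tail-quantile first, rather than shortest mean first
    return sorted(requests, key=lambda r: p90(r[1]))

random.seed(0)
reqs = [("a", [random.lognormvariate(3, 1.5) for _ in range(100)]),   # heavy tail
        ("b", [random.lognormvariate(3, 0.2) for _ in range(100)])]   # light tail
order = [rid for rid, _ in schedule(reqs)]
```

Both requests have the same median length, but the heavy-tailed one is deferred because its p90 is far larger.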

Introduces negative early exit and adaptive boosting to make Monte Carlo Tree Search (MCTS) practical for real-time LLM inference.

Efficiency Breakthrough arxiv | Apr 2

Pushes dataset distillation to 60% accuracy on ImageNet-1K using only a handful of synthetic images.

Efficiency Breakthrough arxiv | Apr 2

Enables 'Elastic Inference' where a single trained model can be converted to multiple lower-precision formats on-the-fly without retraining.

Efficiency Breakthrough arxiv | Apr 2

Proposes a parameter-efficient LLM adaptation method that enables rapid specialization on non-stationary streams while preventing catastrophic forgetting.

New Capability arxiv | Apr 2

Replaces manual rubric-tuning for synthetic data with an automated gradient-guided optimization framework based on influence estimation.

Paradigm Shift arxiv | Apr 2

Rebuilds the Agent-Computer Interaction (ACI) stack for scientific discovery, solving the fragility of JSON tool-calling and execution sandboxes.

New Capability arxiv | Apr 2

Improves imitation learning data efficiency by generating synthetic 'multi-view' demonstrations from a single expert trajectory.

Efficiency Breakthrough arxiv | Apr 2

Introduces SIGN, a framework capable of discovering governing symbolic equations for networked systems with over 100,000 nodes.

New Capability arxiv | Apr 2

Discovers 'Quality Corruption,' an adversarial failure mode where accuracy collapses while detection counts remain stable, proving robustness is substrate-dependent.

Breaks Assumption arxiv | Apr 2

Proposes Physical Imitation Learning (PIL) to offload up to 87% of a control policy's mechanical power to passive robotic joints.

Efficiency Breakthrough arxiv | Apr 2

OmniVoice is an open-source TTS model scaling to over 600 languages using a novel diffusion language model architecture.

Open Release arxiv | Apr 2

TTA-Vid enables video reasoning models to adapt to new domains at test-time using label-free reinforcement learning on a single sample.

New Capability arxiv | Apr 2

Introduces HiLL, a framework that jointly trains a 'hinter' and 'reasoner' to prevent advantage collapse in reinforcement learning for hard tasks.

Paradigm Shift arxiv | Apr 2

Establishes a three-dimensional scaling law for RAG-pretraining, modeling the optimal data budget allocation between model parameters, tokens, and retrieval store size.

Scaling Insight arxiv | Apr 2

CircuitProbe identifies reasoning circuits in Transformers 1000x faster than brute-force methods and predicts the efficacy of layer duplication.

Efficiency Breakthrough arxiv | Apr 2

LangMARL introduces agent-level credit assignment and policy gradient evolution directly in the natural language space for multi-agent coordination.

Paradigm Shift arxiv | Apr 2

Provides the first controlled study of Silent Data Corruption (SDC) in GPUs and its catastrophic impact on LLM pretraining stability.

Breaks Assumption arxiv | Apr 2

Spectral Compact Training (SCT) enables training 70B-parameter architectures on consumer hardware like the Steam Deck (8GB RAM) via permanent SVD factors.

Efficiency Breakthrough arxiv | Apr 2

Stochastic Attention achieves a global receptive field in O(log n) layers by using randomized routing inspired by the fruit fly connectome.

Paradigm Shift arxiv | Apr 2

ThoughtSteer demonstrates the first successful backdoor attack on continuous latent reasoning models, leaving no token-based audit trail.

New Capability arxiv | Apr 2

Mechanistic analysis reveals that LLMs fail at character counting not because they lack the information, but because 'negative circuits' in the final layers actively suppress the correct answer.

Breaks Assumption arxiv | Apr 2

Achieves O(1) complexity for multimillion-class classification by leveraging predefined vector systems in the latent space.

Efficiency Breakthrough arxiv | Apr 2

Routing-Free MoE replaces centralized routing with individual expert-level activation, eliminating the need for Softmax and Top-K load balancing.

Paradigm Shift arxiv | Apr 2

Molecular Memory allows MoE systems to recover previously learned domain expertise 9-11x faster by utilizing cost-penalized fitness metrics that preserve dormant experts.

Efficiency Breakthrough arxiv | Apr 2

OBD-LLM uses second-order Hessian information to achieve 20-40% better low-rank decomposition accuracy than the current state-of-the-art SVD-LLM.

Efficiency Breakthrough arxiv | Apr 2

Policy Improvement Reinforcement Learning (PIRL) shifts the training objective from reward maximization to explicit maximization of policy progress across iterations.

Paradigm Shift arxiv | Apr 2

PixelPrune identifies and removes pixel-level redundancy before the Vision Transformer encoder, delivering up to 4.2x inference speedup for high-resolution VLM tasks.

Efficiency Breakthrough arxiv | Apr 2

An autonomous research pipeline discovered a lifelong multimodal memory framework by diagnosing and fixing its own architectural bugs and data pipeline issues.

New Capability arxiv | Apr 2

EmbedPart achieves a 100x speedup over Metis for graph partitioning by clustering node embeddings rather than operating on raw graph structures.

Efficiency Breakthrough arxiv | Apr 2

A lightweight probing method predicts LLM downstream task performance from internal representations during training, reducing evaluation latency from one hour to three minutes.

Efficiency Breakthrough arxiv | Apr 2

Canonical Correlation Analysis (CCA) can reduce image representation dimensionality by 75% while actually improving downstream performance through cross-model agreement.

Efficiency Breakthrough arxiv | Apr 2
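Classical CCA is enough to illustrate the cross-model agreement idea: project two models' embeddings of the same inputs onto directions of maximal mutual correlation and keep only that shared subspace. This numpy-only sketch uses a synthetic shared latent; the paper works with real image encoders.

```python
import numpy as np

def cca_projections(X, Y, k):
    # center, whiten each view, then SVD the whitened cross-covariance
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    def whiten(A):
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        return U, Vt.T / S            # whitened data, whitening transform
    Ux, Wx = whiten(X)
    Uy, Wy = whiten(Y)
    U, S, Vt = np.linalg.svd(Ux.T @ Uy)
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k], S[:k]   # projections + correlations

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 5))                       # shared latent factors
X = Z @ rng.normal(size=(5, 32)) + 0.01 * rng.normal(size=(100, 32))  # model A
Y = Z @ rng.normal(size=(5, 48)) + 0.01 * rng.normal(size=(100, 48))  # model B
Ax, Ay, corrs = cca_projections(X, Y, k=5)
compressed = X @ Ax                                 # 32-d -> 5-d shared subspace
```

Directions the two models agree on survive; model-specific noise directions are the first to be dropped.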

WARP provides provable, guaranteed repairs for inner layers of Transformers, overcoming the limitation of previous methods restricted to the final layer.

New Capability arxiv | Apr 2

Proposes dense point trajectories as universal 'visual tokens' for behavior that generalize across different species and non-rigid objects.

Paradigm Shift arxiv | Apr 2

Releases the GPT-NL Public Corpus, the largest permissively licensed (CC-BY) Dutch-first dataset for LLM pre-training.

Open Release arxiv | Apr 2

Decouples weather forecasting from spatial resolution by using Flow Matching to super-resolve coarse trajectories as a post-processing step.

Efficiency Breakthrough arxiv | Apr 2

Solves highly intractable (#P-hard) multi-objective optimization problems with tight approximation guarantees using a novel SAT-oracle approach.

New Capability arxiv | Apr 2

Demonstrates that covert collusion between multi-agent LLM systems can be detected zero-shot using internal model activations.

New Capability arxiv | Apr 2

Achieves 'zero forgetting' in continual learning by stacking frozen domain-specific MoE-LoRA adapters with a meta-router.

Paradigm Shift arxiv | Apr 2

First humanoid robot system to achieve consecutive ping-pong strikes using only onboard egocentric vision and whole-body coordination.

New Capability arxiv | Apr 2

Reveals a 'Reasoning Shift' where increased context length silently causes models to skip self-verification and shorten their reasoning traces by up to 50%.

Breaks Assumption arxiv | Apr 2

Introduces S0 tuning for hybrid RNN-attention models, outperforming LoRA by 10.8% with zero inference overhead.

Efficiency Breakthrough arxiv | Apr 2

Reduces the compute cost of LLM test-time scaling by up to 67% using conformal prediction to calibrate reasoning paths.

Efficiency Breakthrough arxiv | Apr 2
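The calibration step can be sketched with split conformal prediction: use held-out scores of correct reasoning paths to set a threshold such that paths above it are correct with probability at least 1 - alpha, then stop sampling paths as soon as one clears it. The scoring and sampling functions below are hypothetical stand-ins.

```python
import numpy as np

def conformal_threshold(cal_scores_correct, alpha=0.1):
    # split-conformal quantile of nonconformity (-score) on correct paths
    s = np.sort(-np.asarray(cal_scores_correct))
    n = len(s)
    q = int(np.ceil((n + 1) * (1 - alpha))) - 1
    return -s[min(q, n - 1)]

def scaled_inference(sample_path, threshold, max_paths=16):
    # sample reasoning paths until one clears the calibrated threshold
    paths = []
    for _ in range(max_paths):
        path, score = sample_path()
        paths.append((score, path))
        if score >= threshold:          # confident enough: stop, save compute
            break
    return max(paths)[1], len(paths)

rng = np.random.default_rng(0)
thr = conformal_threshold(rng.uniform(0.6, 1.0, size=200), alpha=0.1)
ans, used = scaled_inference(lambda: ("answer", rng.uniform(0, 1)), thr)
```

Easy queries terminate after one or two paths, which is where the compute savings come from.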

Replaces standard relative Softmax attention with 'Multiscreening' to allow absolute query-key relevance, yielding 3.2x faster inference at 100K context.

Paradigm Shift arxiv | Apr 2

Simple Self-Distillation (SSD) improves LLM code generation (e.g., Qwen3-30B) by 13% Pass@1 without any external verifiers or teacher models.

Scaling Insight arxiv | Apr 2

Provides causal evidence that reasoning models often decide on an action (like a tool call) before they even start generating their 'Chain-of-Thought'.

Breaks Assumption arxiv | Apr 2

Combines the YOCO architecture with recursive computation to scale representational depth without inflating the KV cache.

Efficiency Breakthrough arxiv | Apr 2

Solves the long-standing trade-off in low-rank matrix recovery by achieving both optimal sample complexity and fast convergence.

Efficiency Breakthrough arxiv | Apr 2

Provides a theoretical explanation for why Transformers often fail compared to linear models in financial time series forecasting.

Breaks Assumption arxiv | Apr 2

Enables Gaussian Processes to scale on modern parallel hardware by removing the need for Cholesky decompositions.

Efficiency Breakthrough arxiv | Apr 2
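One standard way to drop the Cholesky factorization, shown here as a sketch, is to solve the GP linear system with conjugate gradients, which needs only matrix-vector products K @ v and parallelizes well on accelerators. The RBF kernel and toy data are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def cg_solve(matvec, b, iters=500, tol=1e-10):
    # conjugate gradients: solves K x = b using only matvecs, no factorization
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0])
K = rbf_kernel(X, X) + 1e-2 * np.eye(50)              # kernel plus noise jitter
alpha_vec = cg_solve(lambda v: K @ v, y)               # (K + sigma^2 I)^{-1} y
mean = rbf_kernel(np.array([[0.5]]), X) @ alpha_vec    # posterior mean at x* = 0.5
```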

Introduces 'deconfounding scores' to enable reliable causal effect estimation even when treatment and control groups have very little overlap.

New Capability arxiv | Apr 2

Delivers a state-of-the-art universal phone recognition model across 100+ languages with full open-source release.

Open Release arxiv | Apr 2

Researchers have designed a new internet protocol specifically for a 10-node colony network spanning Earth, the Moon, and Mars.

Cosmic Scale arxiv | Apr 1

Everyday 5G cell towers can be repurposed as a massive radar system capable of tracking drones hidden in urban noise.

Practical Magic arxiv | Apr 1

AI voice assistants can be tricked, with near-perfect accuracy, into 'hearing' voices and events that never actually happened.

Nature Is Weird arxiv | Apr 1

Future wireless signals could be boosted by walls that physically shift and morph their shape to bounce waves toward your phone.

Practical Magic arxiv | Apr 1

Researchers have mapped out all 19.3 million chords the human hand can play on a piano to reveal why some sound 'clear' and others 'muddy.'

Paradigm Challenge arxiv | Apr 1

Interfaces LLMs with Wikidata-scale graphs for multi-hop reasoning without any retraining of the model or the query executor.

New Capability arxiv | Apr 1

A unified, open-source framework that converts complex post-training quantization workflows into a single-line, hardware-aware pipeline.

Open Release arxiv | Apr 1

Decouples data mixture ratio selection from continual pre-training by optimizing distribution vectors post-hoc with 15-35x lower compute cost.

Efficiency Breakthrough arxiv | Apr 1

Achieves an 80x improvement in stable generation length for occupancy world models, enabling 4km+ autonomous driving simulations from a single frame.

New Capability arxiv | Apr 1

Replaces the heuristic constant momentum (0.9) with a parameter-free, physics-inspired schedule that speeds up convergence by nearly 2x.

Paradigm Shift arxiv | Apr 1

Leverages model reprogramming as an 'active signal amplifier' to proactively audit privacy leakage in LLMs and Diffusion models.

New Capability arxiv | Apr 1

Combines differentiable optimization with exact ILP solvers to achieve a 10x performance gain in solving NP-hard combinatorial scheduling problems.

Efficiency Breakthrough arxiv | Apr 1

Proposes a mathematical framework where 'spectral gaps' in parameter updates control phase transitions like grokking and loss plateaus.

Paradigm Shift arxiv | Apr 1

Large-scale experiments reveal that self-organizing LLM agents spontaneously outperform manually designed hierarchical structures by 14%.

Breaks Assumption arxiv | Apr 1

A fabricated 16nm SoC that performs real-time 3D occupancy mapping under 6 mW, reducing query energy by over 80%.

Efficiency Breakthrough arxiv | Apr 1

Proposes a neuroscience-grounded memory architecture that makes interactions cheaper and more accurate with experience, rather than relying on expanding context windows.

Paradigm Shift arxiv | Apr 1

Reveals that parallel translated data is surprisingly unnecessary for creating aligned multilingual representations in LLMs.

Breaks Assumption arxiv | Apr 1

Discovers that pretraining Implicit Neural Representations (INRs) on structured $1/f^\alpha$ noise performs as well as data-driven initialization.

Breaks Assumption arxiv | Apr 1

Introduces DASES, a framework that replaces passive validation with active 'falsification' to ensure scientific models learn actual mechanisms rather than just winning benchmarks.

Paradigm Shift arxiv | Apr 1

Generates complete, simulatable analog circuits in milliseconds, outperforming search-based methods by over 600x.

Efficiency Breakthrough arxiv | Apr 1

Demonstrates that integer multiplication is not a long-range dependency problem, and that current architectures like Transformers and Mamba are fundamentally using the wrong 'computational spacetime.'

Breaks Assumption arxiv | Apr 1

Introduces PolarQuant, a quantization method that uses Hadamard rotation to make LLM weights near-lossless at 5-bit precision without calibration data.

Efficiency Breakthrough arxiv | Apr 1

Demonstrates that the 'modality gap' in CLIP-style models is a feature that can be exploited to increase robustness without retraining.

Breaks Assumption arxiv | Apr 1

Achieves a +48pp accuracy gain in agents using a non-parametric online learning framework that reuses procedural plans without updating model weights.

New Capability arxiv | Apr 1

Scales curvature-aware bilevel optimization to BERT-sized models using KFAC, significantly outperforming standard gradient unrolling.

Efficiency Breakthrough arxiv | Apr 1

Switches the training objective from hard Next-Token Prediction to predicting 'concepts' (sets of semantically related tokens).

Paradigm Shift arxiv | Apr 1

Challenges the assumption that architecture and loss are the primary levers for neural simulators by proving the 'carried state' design is the dominant bottleneck.

Breaks Assumption arxiv | Apr 1

Proves that LLM agent capability (pass@1) and reliability (consistency) diverge systematically, with frontier models often having the highest 'meltdown' rates.

Paradigm Shift arxiv | Apr 1

Introduces a way for diffusion models to generate a single, sharp 'mental average' of a concept rather than blurry pixel-wise averages.

New Capability arxiv | Apr 1

A massive multimodal release for 10 low-resource African languages, reducing SOTA Word Error Rates (WER) by up to 61% relative.

Open Release arxiv | Apr 1

Enables infinite-length video understanding on a single consumer GPU (RTX 3090) through a training-free visual memory mechanism.

Efficiency Breakthrough arxiv | Apr 1