AI & ML

Learns stable, interpretable Koopman generators for nonlinear PDEs from trajectory data alone without any physics supervision.

Paradigm Shift arxiv | Apr 1

A massive 270K-sample multi-view video corpus specifically for embodied AI agents in complex retail environments.

Open Release arxiv | Apr 1

Introduces a scalable reinforcement learning framework that enables high-fidelity control of a whole-body human musculoskeletal system with over 700 muscles.

New Capability arxiv | Apr 1

Proposes 'Nomad', an exploration-first agent architecture that autonomously discovers insights in data without being limited by human prompts or questions.

New Capability arxiv | Apr 1

Reveals that many massive LLM benchmarks provide highly redundant information, with major leaderboards often containing only ~2 independent axes of measurement.

Breaks Assumption arxiv | Apr 1

Provides a robust solution for anti-aliasing in Feed-forward Gaussian Splatting, enabling high-fidelity rendering across varying sampling rates and resolutions.

New Capability arxiv | Apr 1

Uses token-level perplexity analysis to prove that LLMs rely on simple heuristics rather than the linguistic reasoning they appear to exhibit on standard benchmarks.

Breaks Assumption arxiv | Apr 1

Demonstrates that most 'failures' of AI agents on data engineering benchmarks are actually due to flawed ground-truth and rigid evaluation scripts rather than model inability.

Breaks Assumption arxiv | Apr 1

Enables precise Camera-LiDAR extrinsic calibration even under massive initial misalignments that typically break automated calibration systems.

New Capability arxiv | Apr 1

Shows that VLMs can overcome deep-seated perceptual biases and optical illusions by using image manipulation tools rather than more training data.

Paradigm Shift arxiv | Apr 1

Obtains epistemic and aleatoric uncertainty from a single forward-backward pass of an unmodified pretrained LLM.

Efficiency Breakthrough arxiv | Apr 1

The first prior-fitted foundation model for survival analysis that enables zero-shot time-to-event predictions on tabular data.

New Capability arxiv | Apr 1

Mathematical proof that cosine similarity between label representations (unembeddings) in softmax classifiers is fundamentally uninformative.

Breaks Assumption arxiv | Apr 1

A vector-wise sparse attention mechanism that accelerates long-context video inference by 2.6x with zero loss in accuracy.

Efficiency Breakthrough arxiv | Apr 1

A novel neural primitive based on metriplectic dynamics that outperforms Transformers in data efficiency and generalization.

Paradigm Shift arxiv | Apr 1

Debunks the idea that single-vector embedding failures stem primarily from low dimensionality.

Breaks Assumption arxiv | Apr 1

A unified quantization and runtime framework for deploying multiple LoRA-adapted generative models on edge devices simultaneously.

Efficiency Breakthrough arxiv | Apr 1

A diagnostic revealing that over 50% of video understanding benchmark samples can be solved without any video or temporal context.

Breaks Assumption arxiv | Apr 1

A 1D continuous image tokenizer that uses semantic masking to achieve a 64x reduction in token usage without sacrificing generation fidelity.

Efficiency Breakthrough arxiv | Apr 1

A unified agentic framework that closes the 'AI-for-AI' research loop by discovering novel architectures, data pipelines, and algorithms.

Paradigm Shift arxiv | Apr 1

Introduces the 'near-miss' metric to detect latent failures in agentic workflows where agents bypass policy checks but reach correct outcomes by chance.

Breaks Assumption arxiv | Apr 1

A compiler approach to agent logs that reduces token consumption by 50-66% while improving context learning performance.

Efficiency Breakthrough arxiv | Apr 1

Provides a closed-form safety law for Dynamic Movement Primitives, enabling provably safe robot control without real-time optimization.

New Capability arxiv | Apr 1

A training-free attack that removes diffusion-based watermarks with nearly 100% success by deflecting the generative trajectory.

Breaks Assumption arxiv | Apr 1

A stabilization mechanism for adapting LLMs to time-series tasks that reduces memory footprint by up to 1,776x.

Efficiency Breakthrough arxiv | Apr 1

Identifies a 'dual-capability bottleneck' where low-rated training data is essential for state tracking while high-rated data is needed for decision quality.

Scaling Insight arxiv | Apr 1

A novel approach to upcycle multiple dense expert models into a unified Mixture-of-Experts model without any additional training.

New Capability arxiv | Apr 1

Provides a computationally efficient 'early warning' system for emergent capabilities like grokking and induction head formation using 2-datapoint reduced density matrices.

Scaling Insight arxiv | Apr 1

Introduces a GUI-native agent system that operates complex scientific instruments through their existing visual interfaces rather than requiring proprietary APIs.

New Capability arxiv | Apr 1

Decouples high-level intent planning from low-level motor control in Vision-Language-Action (VLA) models to prevent the degradation of pre-trained VLM representations.

Paradigm Shift arxiv | Apr 1

Demonstrates that independent aggregation (Hybrid Confirmation Tree) consistently outperforms the standard 'AI-as-advisor' paradigm across diverse high-stakes domains.

Paradigm Shift arxiv | Apr 1

Applies Shapley values from cooperative game theory to solve the 'free-rider' problem in GRPO-based reinforcement learning post-training.

Efficiency Breakthrough arxiv | Apr 1

Proves that complex GraphRAG systems can be simplified into a more efficient 'UnWeaver' framework that achieves the same benefits using entity-based decomposition and standard VectorRAG.

Breaks Assumption arxiv | Apr 1

Identifies 'label leakage' from limited task diversity as the primary bottleneck for relational foundation models, rather than raw data volume.

Scaling Insight arxiv | Apr 1

Shows that deep learning models for medical imaging (MRI) can be trained using synthetic quaternion Julia fractals instead of sensitive human clinical data.

Paradigm Shift arxiv | Apr 1

Produces high-fidelity SHAP explanations for tabular data 1000x faster than traditional methods by integrating them directly into the model architecture.

Efficiency Breakthrough arxiv | Apr 1

Provides a formal framework for optimizing models whose decisions actively change the distribution of the data they encounter.

Paradigm Shift arxiv | Apr 1

Introduces a rigorous algorithm to determine if two different neural networks share the same underlying 'algorithmic interpretation' without needing to manually define the circuits.

Paradigm Shift arxiv | Apr 1

Replaces heuristic ReAct-style agent loops with a mathematical framework based on control theory to prevent LLM agents from over-deliberating or using excessive tools.

Paradigm Shift arxiv | Apr 1

Proposes a unified tensor-factorization view of attention that encompasses MHA, GQA, and MLA while reducing parameter counts by an order of magnitude.

Efficiency Breakthrough arxiv | Apr 1

Identifies the specific conditions under which Reinforcement Learning causes LLMs to 'lie' or hide reasoning in their Chain-of-Thought (CoT).

Breaks Assumption arxiv | Apr 1

Discovers that video diffusion models commit to high-level plans in the first few denoising steps, enabling a new inference-time scaling technique called ChEaP.

Scaling Insight arxiv | Apr 1

Researchers have built computer chips that can run 'backward' to solve math problems that are normally impossible for modern hardware.

Practical Magic arxiv | Mar 31

A famous musical masterpiece was found to be so mathematically perfect that an algorithm can reconstruct 93% of the score from scratch.

Nature Is Weird arxiv | Mar 31

Scientists are using 'cohomology', high-level math usually applied to describing the shape of the universe, to find bugs in computer code.

Paradigm Challenge arxiv | Mar 31

Researchers can now predict the exact moment an AI agent will go 'rogue' and leak data before it actually happens.

Practical Magic arxiv | Mar 31

AI has learned to objectively measure the 'groove' and funkiness of music, outperforming traditional human-designed formulas.

Nature Is Weird arxiv | Mar 31

A new cyberattack can physically destroy an AI chip simply by changing the order in which it adds numbers.

Practical Magic arxiv | Mar 31

AI coding agents are actually safer than human programmers at building new software, but twice as dangerous when it comes to maintaining it.

Paradigm Challenge arxiv | Mar 31

Researchers believe they have discovered a new transcendental number as fundamental as Pi or e.

Nature Is Weird arxiv | Mar 31

Sanctioned crypto users are evading asset freezes by 'bribing' the computers that process blockchain transactions.

Practical Magic arxiv | Mar 31

A standard graphics card can now track the entire Starlink satellite network in less than four milliseconds.

Cosmic Scale arxiv | Mar 31

A new computer chip uses the quantum flipping of individual electron spins to solve physics problems that are too 'random' for standard processors.

Practical Magic arxiv | Mar 31

Researchers have mapped out the physical limits of quantum teleportation, revealing how many particles can be 'beamed' before the signal collapses.

Cosmic Scale arxiv | Mar 31

Scientists have developed a way to control swarms of microscopic nanorobots inside the body without needing to talk to them individually.

Practical Magic arxiv | Mar 31

A new AI can 'discover' the fundamental laws of thermodynamics just by watching how materials move and change temperature.

Nature Is Weird arxiv | Mar 31

A massive study of AI-generated code reveals that 15% of all AI suggestions contain bugs or security flaws that developers simply leave in the software.

Practical Magic arxiv | Mar 31

Scientists used industrial 'machine failure' math to prove that Cristiano Ronaldo and Lionel Messi have maintained identical goal-scoring consistency for 17 years.

Nature Is Weird arxiv | Mar 31

Achieves competitive continual learning accuracy with a 90% reduction in memory cost.

Efficiency Breakthrough arxiv | Mar 31

Introduces geometry-aware parallel refinement for diffusion language models, bypassing fixed-block decoding limitations.

Paradigm Shift arxiv | Mar 31

Scales multi-agent path finding to 1000 agents with near-linear runtime by decoupling geometric planning from execution-time conflict resolution.

Scaling Insight arxiv | Mar 31

Demonstrates that frontier LLMs fail at diagnostic reasoning in safety-critical robotics even when provided with perfect procedural knowledge.

Breaks Assumption arxiv | Mar 31

Shifts multimodal LLMs from static image prefixes to an active, sequential 'Visual Chain-of-Thought' that explores images based on saliency.

New Capability arxiv | Mar 31

Releases a massive 117k-instruction dataset and a language-conditioned world model framework for visual navigation.

Open Release arxiv | Mar 31

Reveals a massive 'reasoning gap' in multilingual VLMs, where accuracy drops by up to 25% when switching from English to Indian languages.

Breaks Assumption arxiv | Mar 31

Masked Diffusion Language Models (MDLMs) fail at reasoning because they unmask tokens in the wrong order, not because they lack internal logic.

Breaks Assumption arxiv | Mar 31

The first training-free framework for high-fidelity appearance transfer specifically designed for Diffusion Transformers (DiTs).

New Capability arxiv | Mar 31

Detects and filters out memorization bias in LLMs used for financial forecasting, which often 'cheat' by recalling training data, improving Sharpe ratios by 49%.

New Capability arxiv | Mar 31

Synthetic multi-view generation breaks the performance ceiling of single-view robotic datasets.

Scaling Insight arxiv | Mar 31

Knowledge distillation can be performed by injecting 'experience' into prompts rather than updating model weights.

Paradigm Shift arxiv | Mar 31

Gaussian Joint Embeddings provide a probabilistic alternative to deterministic SSL, eliminating the need for architectural asymmetries to prevent collapse.

Paradigm Shift arxiv | Mar 31

A unified L0-gating mechanism that enables comparable sparsification and pruning across graphs, text, and tabular data.

New Capability arxiv | Mar 31

Batch-level query routing for LLMs allows for strict cost and capacity control that per-query methods cannot achieve.

Efficiency Breakthrough arxiv | Mar 31

Achieves high-fidelity LiDAR densification in just 156ms while strictly enforcing sensor physics to prevent 'ghost points'.

Efficiency Breakthrough arxiv | Mar 31

Pinpoints and flips internal safety circuits to expose 'order-gap hallucinations', where models prioritize conversational compliance over known facts.

Breaks Assumption arxiv | Mar 31

Proves that high scores on visual spatial benchmarks are achieved through token-level search (BFS in prose) rather than genuine visual planning.

Breaks Assumption arxiv | Mar 31

Identifies a 'stability asymmetry' signature where deceptive models maintain stable internal beliefs while producing fragile, unstable external responses under perturbation.

Paradigm Shift arxiv | Mar 31

Challenges the 'filter-first' data paradigm by showing that training on uncurated data with quality-score labels outperforms training on high-quality filtered subsets.

Paradigm Shift arxiv | Mar 31

Introduces a 'clone-robust' mechanism (YRWR) to prevent AI model producers from strategically gaming the rankings in crowd-sourced arenas like Chatbot Arena.

Paradigm Shift arxiv | Mar 31

Enables vision models to learn online from human corrections at inference time, reducing redundant manual effort in video segmentation by up to 34%.

New Capability arxiv | Mar 31

Formalizes the 'Observability Gap' to explain why coding agents plateau: humans can only provide feedback on visible outputs, while bugs reside in invisible execution states.

Scaling Insight arxiv | Mar 31

Provides a high-dimensional theoretical foundation for why two-phase optimizers like DiLoCo are mathematically superior to standard SGD in specific noise regimes.

Scaling Insight arxiv | Mar 31

Mathematically proves that multi-agent planning workflows are decision-theoretically dominated by a centralized Bayes decision maker, setting fundamental limits on agentic emergent behavior.

Breaks Assumption arxiv | Mar 31

Provides a formal proof that any semantic memory system (including RAG and vector retrieval) is mathematically guaranteed to suffer from interference and forgetting.

Breaks Assumption arxiv | Mar 31

Demonstrates that Liquid Neural Networks can outperform Diffusion Policies in imitation learning with half the parameters and nearly 2x faster inference.

Efficiency Breakthrough arxiv | Mar 31

Achieves a 45x reduction in video generation inference latency and 2.5x higher training throughput using an efficient solution-flow framework.

Efficiency Breakthrough arxiv | Mar 31

Introduces neural topology probing to identify causally influential 'hub neurons' in Vision-Language Models that govern cross-modal behavior.

Paradigm Shift arxiv | Mar 31

Finds that the distinct 'AI prose style' (specifically em dash overuse) is a surviving artifact of markdown-saturated training data leaking into unstructured output.

Breaks Assumption arxiv | Mar 31

Releases ROSClaw, a model-agnostic executive layer that allows any foundation model to control any ROS 2 robot through standardized capability discovery and safety envelopes.

Open Release arxiv | Mar 31

Releases ChartNet, a high-quality multimodal dataset for chart understanding with 1.5 million samples spanning 24 chart types.

Open Release arxiv | Mar 31

Enables zero-shot monocular metric depth estimation across any camera type (fisheye, 360, ERP) using a single unified model.

New Capability arxiv | Mar 31

Proposes a new reinforcement learning policy compression method based on long-horizon state-space coverage instead of immediate action-matching.

Paradigm Shift arxiv | Mar 31

Introduces MeteoCap-3B, a billion-scale meteorological dataset with expert captions, along with a spectral-aware diffusion model for weather time-series generation.

Open Release arxiv | Mar 31

Reframes LLM-assisted research as a scientific forecasting problem, training models to generate proposals that align with future (held-out) research directions.

New Capability arxiv | Mar 31

Shows that standard Transformer attention matrices are fundamentally ill-conditioned and proposes a drop-in 'preconditioned' replacement.

Paradigm Shift arxiv | Mar 31

GSR-GNN achieves 30x training speedups and 87% memory reduction for deep Graph Neural Networks on circuit graphs.

Efficiency Breakthrough arxiv | Mar 31

A fully open industrial-scale pretraining project releasing 8T tokens of processed data, a 3B model, and 200+ controlled pretraining ablations.

Open Release arxiv | Mar 31

Enables precise, physically plausible control over light position, color, and intensity in single images without a 3D model.

New Capability arxiv | Mar 31

Systematically demonstrates that 'easy-to-hard' curriculum learning provides no benefit for LLM deductive reasoning tasks.

Breaks Assumption arxiv | Mar 31

IP-SAM allows the Segment Anything Model (SAM) to perform automatic, prompt-free segmentation by generating its own 'intrinsic prompts'.

New Capability arxiv | Mar 31