SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers  ·  Page 49 of 52

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Breaks Assumption
Settles the long-standing practitioner debate over whether to use training or holdout data for interpreting black-box models with PD/ALE plots.
Mar 17
New Capability
Enables Bayesian model selection and joint posterior inference over combinatorial spaces of up to billions of simulator model instantiations.
Mar 17
Efficiency Breakthrough
Achieves 1,000x speedups in Bayesian inverse problems by replacing repeated MCMC sampling with one-step preconditioned generative transport.
Mar 17
Practical Magic
A paper-thin, battery-free acoustic sticker can be attached to a wall to listen in on the room next door.
Mar 16
Paradigm Challenge
Future 6G antennas could physically slide across a phone's surface to capture signals far sharper than fixed antenna designs allow.
Mar 16
Efficiency Breakthrough
ActTail achieves 80% activation sparsity in LLMs with significantly lower perplexity degradation than uniform methods by using Heavy-Tailed Self-Regularization theory.
Mar 16
Paradigm Shift
This paper proposes a method to align and personalize LLMs directly from raw user interactions using self-distillation, bypassing the need for explicit human labels or RLHF.
Mar 16
Breaks Assumption
The researchers demonstrate that prompt injection is caused by 'role confusion' in the latent space, where models assign authority based on the style of writing rather than the source of the text.
Mar 16
Breaks Assumption
This theoretical work refutes the 'Garbage In, Garbage Out' mantra for modern ML, proving that high-dimensional model capacity can asymptotically overcome predictor error and structural uncertainty.
Mar 16
Paradigm Shift
Introduces the Budget-Sensitive Discovery Score (BSDS), a formally verified metric machine-checked in Lean 4 for evaluating AI-guided scientific candidate selection.
Mar 16
Efficiency Breakthrough
ReBalance is a training-free framework that dynamically modulates 'thinking' length in reasoning models to prune redundancy during overthinking and promote exploration during underthinking.
Mar 16
Breaks Assumption
This study proves that reasoning traces (Chain-of-Thought) causally shape model behavior and generalization, even when the final answer is held constant.
Mar 16
Breaks Assumption
SpectralGuard identifies a 'memory collapse' vulnerability in State Space Models (like Mamba) where adversarial inputs can drive the transition operator's spectral radius to zero.
Mar 16
Open Release
Surg-R1 is a specialized surgical reasoning model released alongside the largest surgical Chain-of-Thought dataset (320,000 pairs).
Mar 16
Paradigm Shift
This paper establishes a systematic protocol for 'stitching' heterogeneous Vision Foundation Models (e.g., CLIP and DINOv2) to share early layers while retaining specialized capabilities.
Mar 16
Efficiency Breakthrough
Achieves 100x speedup in robotic action generation by distilling iterative flow/diffusion models into a one-step policy without a pre-trained teacher.
Mar 16
Paradigm Shift
Introduces Modal Logical Neural Networks (MLNNs) as a differentiable logic layer that bridges deep learning with symbolic Kripke semantics for regulated AI.
Mar 16
Paradigm Shift
Demonstrates a robot that improves its own locomotion by identifying and physically 'self-destructing' redundant or inhibiting limbs during its lifetime.
Mar 16
New Capability
Enables training-free infinite video generation (hour-scale) by using evolving memory tokens to solve identity drift and motion stagnation.
Mar 16
Breaks Assumption
Reveals that standard global correlation metrics for LLM judges fail to predict success in 'best-of-n' selection tasks due to within-prompt signal loss.
Mar 16
Efficiency Breakthrough
Reduces Chain-of-Thought (CoT) compute costs by 14-55% by learning the optimal 'early-exit' points for Large Reasoning Models.
Mar 16
Scaling Insight
Discovers that as LLMs scale, their complex non-linear depth dynamics converge into accurate, low-order linear surrogates.
Mar 16
Paradigm Shift
Derives an exact, unbiased policy gradient for Reinforcement Learning on Diffusion LLMs, bypassing the need for sequence-level likelihood approximations.
Mar 16
Breaks Assumption
Shows that tool-augmented agents suffer from 'recommendation drift' where they provide unsafe advice under tool corruption while maintaining high ranking scores.
Mar 16
Efficiency Breakthrough
Accelerates Diffusion Transformers (DiTs) by 2x using a training-free framework that selectively reduces computation in non-aesthetic image regions.
Mar 16
Breaks Assumption
Challenges the standard practice of deep PPO training by proving that consensus aggregation of 'wider' parallel runs is 8x more sample efficient than multiple epochs.
Mar 16
Open Release
Releases Feynman, an agentic pipeline and 100k-sample dataset for generating high-quality, knowledge-rich diagrams with grounded captions.
Mar 16
Open Release
Introduces the largest-ever multi-modal CAD dataset with 10 million annotations for 1 million models to enable geometric deep learning on BRep data.
Mar 16
New Capability
Unlocks Maximum Entropy RL for high-dimensional humanoid control, matching or doubling the performance of dominant deterministic baselines.
Mar 16
Efficiency Breakthrough
Introduces a training-free framework that allows LLM agents to dynamically scale their reasoning depth based on a pre-defined token/tool budget.
Mar 16
Efficiency Breakthrough
Achieves a 98x speedup in LLM routing on AMD hardware using Flash Attention and prompt compression, enabling high-context classification without a dedicated GPU.
Mar 16
Paradigm Shift
Proposes modeling the world in the feature space of frozen geometry foundation models instead of pixels, achieving 5x faster depth forecasting.
Mar 16
New Capability
A retrosynthesis model that explicitly learns strategic bond-disconnection reasoning via reinforcement learning with a round-trip accuracy reward.
Mar 16
Scaling Insight
Longitudinal evidence reveals that successive ChatGPT versions are converging in output diversity, suggesting potential model collapse from synthetic data saturation.
Mar 16
New Capability
A new system enables humanoid robots to play competitive tennis rallies with humans by learning from imperfect, fragmented motion data.
Mar 16
Scaling Insight
Adversarial test case evolution improves code reinforcement learning by creating harder, more discriminative verification signals that drive better model performance.
Mar 16
Efficiency Breakthrough
Modality-level disaggregation enables cost-optimal MLLM serving across heterogeneous GPUs over commodity PCIe, bypassing the need for expensive NVLink interconnects.
Mar 16
Breaks Assumption
Probing of Vision-Language-Action (VLA) models reveals that the action decoder largely ignores the reasoning logic in Chain-of-Thought, relying almost exclusively on object names.
Mar 16
New Capability
SciDesignBench provides a massive simulator-grounded environment for scientific inverse design, revealing that current LLMs struggle significantly with iterative refinement.
Mar 16
Efficiency Breakthrough
A hardware-algorithm co-design for Spiking Neural Networks achieves up to 69x energy efficiency gains using an SRAM-based Compute-in-Memory accelerator.
Mar 16
Breaks Assumption
The TaoBench benchmark proves that state-of-the-art math LLMs fail on equivalent logic problems when presented outside of the standard 'MathLib' framework.
Mar 16
New Capability
A self-supervised robotic system detects novel objects by training bespoke detectors on-the-fly from human video demonstrations, bypassing language-based prompts.
Mar 16
New Capability
AIM enables post-training modulation of large models to change utility levels or focus features without any retraining or additional data.
Mar 16
Efficiency Breakthrough
Achieves 4x visual token compression and 80% lower training cost while unifying multimodal comprehension and generation.
Mar 16
New Capability
First training-free method for debiasing reward models using Sparse Autoencoder (SAE) interventions.
Mar 16
Breaks Assumption
Breaks the long-standing accuracy-robustness trade-off in VLMs by localizing adversarial robustness to shallow layers.
Mar 16
New Capability
A flow-based navigation policy that achieves zero-shot sim-to-real transfer across wheeled, quadrupedal, and humanoid platforms.
Mar 16
Paradigm Shift
A small-scale molecular reasoning model that outperforms ultra-large foundation models via structured chain-of-thought and RL.
Mar 16
Efficiency Breakthrough
Adaptive VLM Routing reduces inference costs for Computer Use Agents by up to 78% with negligible accuracy loss.
Mar 16
Efficiency Breakthrough
Distills a 2B Vision-Language Retriever into a 70M text-only encoder for visual document retrieval with 50x lower latency.
Mar 16