SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Paradigm Challenge
If you change just one tiny ingredient in an AI’s training, you can break the whole thing without a single warning light going off.
Apr 3
Practical Magic
Forget weighing yourself every morning—recording a quick voice memo could be way better at spotting a heart failure flare-up before it happens.
Apr 2
Practical Magic
Imagine headphones that let you 'mute' a crying baby or a leaf blower while keeping the rest of the world sounding perfectly clear.
Apr 2
Paradigm Challenge
If you mash two 'safe' AI models together, you can accidentally create a dangerous one—turns out you can hide a trap by splitting it across separate files.
Apr 2
Nature Is Weird
A top AI coding tool leaked its own secret source code because the developers got lazy and just trusted the code the AI wrote for its own setup.
Apr 2
Paradigm Challenge
We found a way to send data faster than the 'speed limit' of physics that everyone thought was impossible to break.
Apr 2
Paradigm Challenge
The math formula the World Bank has used for 40 years to measure global poverty has been proven to be logically impossible.
Apr 2
Practical Magic
We found a way to run stats in 'superposition,' so a computer can check every possible version of a dataset at the same time.
Apr 2
Efficiency Breakthrough
Recovers short-text performance in context-extended LLMs using 60x less data than current state-of-the-art distillation methods.
Apr 2
Paradigm Shift
First foundation model to unify text, image, audio, and video using native masked diffusion instead of autoregressive serialization.
Apr 2
Breaks Assumption
Discovers that post-training of reasoning models masks rather than deletes safety mechanisms, so they can be restored with lightweight adapters.
Apr 2
Efficiency Breakthrough
Introduces entropy-guided adaptive decoding that gives small models reasoning performance comparable to frontier models at a fraction of the cost.
Apr 2
Breaks Assumption
Proves that 'inverse scaling' on many benchmarks is a prompt-dependent artifact caused by verbosity, which can be reversed by forcing brevity.
Apr 2
New Capability
Enables reinforcement learning for long-horizon robots across diverse tasks without requiring manual reward engineering.
Apr 2
Efficiency Breakthrough
Proposes a 'no-backprop' stochastic process memory for edge agents that solves the retention-forgetting tradeoff with fixed compute.
Apr 2
Breaks Assumption
Mathematically and empirically proves that classifier-based safety gates are fundamentally incapable of monitoring self-improving AI.
Apr 2
New Capability
First generative model capable of synthesizing physically consistent 'raw' camera sensor data from text prompts or sRGB images.
Apr 2
New Capability
A production-ready adaptive router for LLM portfolios that manages cost-quality trade-offs in real-time under strict dollar budgets.
Apr 2
Breaks Assumption
Masked Image Modeling (MIM) representations are fundamentally polluted with non-semantic noise, which can be fixed with a zero-cost post-hoc linear projection.
Apr 2
Breaks Assumption
Standard alignment metrics like CKA and RSA systematically fail when comparing networks in superposition, often leading to false conclusions about model similarity.
Apr 2
Scaling Insight
Neural collapse is triggered by a predictable 'feature-norm threshold' (fn*) that is invariant to training conditions, serving as a new diagnostic for training progress.
Apr 2
Efficiency Breakthrough
MAC-Attention achieves 14x attention-phase speedups and reduces KV cache accesses by 99% for long-context LLMs by reusing computation from semantically similar queries.
Apr 2
Efficiency Breakthrough
A modified 110M-parameter ColBERT model can identify fine-grained evidence spans as accurately as a 27B-parameter LLM, but at a fraction of the cost.
Apr 2
Paradigm Shift
LLM-guided program evolution has discovered a new data-shuffling rule for SGD that provably and empirically outperforms standard Random Reshuffling.
Apr 2
Breaks Assumption
Self-reflective prompting (self-correction) fails to improve accuracy in safety-critical medical QA, frequently introducing new errors rather than fixing old ones.
Apr 2
Breaks Assumption
The 'modality gap' in Vision-Language Models is composed of two distinct geometric components, and the commonly used 'raw gap' is a misleading metric for cross-modal quality.
Apr 2
New Capability
High-quality oversight of massive proprietary LLM agents can be achieved by small, open-source 'critics' that intervene in real-time within the same interaction.
Apr 2
New Capability
Reduces multimodal jailbreak success rates by 97% using a simple conditional decoding strategy without task-specific fine-tuning.
Apr 2
Paradigm Shift
A comprehensive analysis of AI safety vulnerabilities, covering automated circuit discovery, latent adversarial training, and power-law scaling of jailbreak success.
Apr 2
Efficiency Breakthrough
A lightweight framework for triaging agentic trajectories post-deployment without the cost of human review or auxiliary LLM calls.
Apr 2
Open Release
Independently reproduces OpenAI's gpt-oss-20b scores by reverse-engineering undisclosed tool-calling formats and agent harnesses.
Apr 2
New Capability
Reconstructs authentic LiDAR point clouds under jamming attacks with a 92% success rate by exploiting raw full-waveform representations.
Apr 2
Paradigm Shift
Identifies a fundamental quality-exploration dilemma in Diffusion Language Models where remasking improves single-sample quality but kills reasoning diversity.
Apr 2
Scaling Insight
Gradient-based data valuation (TracIn) outperforms all human-crafted metadata heuristics for ordering curriculum learning in motion planners.
Apr 2
Paradigm Shift
Introduces training-free and model-free trajectory planning by computing diffusion score functions directly from data libraries via kernel-weighted estimation.
Apr 2
Breaks Assumption
Foundational deep networks consistently assign higher density to simpler images, regardless of training data or architecture complexity.
Apr 2
Efficiency Breakthrough
A cross-graph tuning-free prompting framework for GNNs that achieves massive gains on unseen graphs without retraining.
Apr 2
Paradigm Shift
Proposes a decision-centric architecture that separates signal estimation from control policy to make LLM system decisions explicit and inspectable.
Apr 2
New Capability
Enables zero-shot humanoid navigation in unseen environments using only 5 hours of human walking data and no robot-specific data.
Apr 2
New Capability
A white-box membership inference attack using 'gradient-induced feature drift' to outperform all existing confidence-based methods.
Apr 2
Efficiency Breakthrough
Self-Routing removes the need for learned routers in Mixture-of-Experts (MoE) by using hidden states directly for expert assignment.
Apr 2
Efficiency Breakthrough
Improves Qwen2.5-7B performance on AIME 2024 by 137% through test-time iterative rethinking and majority-voted pseudo-labels.
Apr 2
Efficiency Breakthrough
Automates mathematical optimization modeling using reinforcement learning with solver-derived rewards instead of human process supervision.
Apr 2
Breaks Assumption
Reveals that many 'polysemantic' neurons in LLMs actually fire on shared word forms (lexical features) rather than compressed semantic concepts.
Apr 2
Paradigm Shift
Truth Anchoring (TAC) provides a post-hoc calibration method to align LLM uncertainty metrics with actual factual correctness.
Apr 2
Scaling Insight
Demonstrates that LLM judge panels follow power-law discovery curves, where panel size and persona diversity are critical for uncovering edge-case failures.
Apr 2
Paradigm Shift
Identifies 'diversity collapse' in the popular GRPO reinforcement learning method and introduces MUPO to maintain broad reasoning paths.
Apr 2
New Capability
Introduces the first autoregressive framework for Gaussian Splatting, enabling parallel, progressive next-scale 3D generation.
Apr 2
Efficiency Breakthrough
Optimizes LLM inference scheduling by treating output length as a heavy-tailed distribution rather than a point estimate.
Apr 2
Efficiency Breakthrough
Introduces negative early exit and adaptive boosting to make Monte Carlo Tree Search (MCTS) practical for real-time LLM inference.
Apr 2