Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers whose core contribution is computational intelligence.
Breaks Assumption
Challenges a core constraint in statistical learning theory by proving that optimal $\sqrt{N}$ convergence is achievable for offline policy learning even with model classes that exceed the standard Donsker complexity limit.
Nature Is Weird
AI has hit a wall, and it's because data is acting like a heavy anchor slowing the whole thing down.
Practical Magic
This new math trick just crushed a massive logistics nightmare that used to take two weeks; now it’s done in 19 minutes.
Paradigm Challenge
Computers have gotten so fast at finding the best route on a map that it basically costs them zero effort now, no matter how big the city.
First Ever
Someone finally built computer memory that doesn't go blank when you pull the plug—it just stays there forever.
Nature Is Weird
Turns out, putting a cheap AI under an AI 'boss' actually makes the work worse unless the boss is way, way smarter than the worker.
Practical Magic
AI agents are finding multi-million dollar holes in bank code that even the best human experts completely walked past.
Efficiency Breakthrough
Prunes 85% of visual tokens in Vision-Language-Action (VLA) models while retaining 94% accuracy for autonomous driving.
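The paper's exact pruning criterion isn't given here, but saliency-based token pruning generally keeps the top-k tokens ranked by an importance score. A minimal NumPy sketch, where the scores and the 15% keep ratio are illustrative assumptions, not the paper's recipe:

```python
import numpy as np

def prune_tokens(tokens: np.ndarray, scores: np.ndarray, keep_ratio: float = 0.15):
    """Keep the top `keep_ratio` fraction of tokens by saliency score.

    tokens: (N, D) token embeddings
    scores: (N,) importance scores (e.g., attention mass received)
    """
    n_keep = max(1, int(round(len(scores) * keep_ratio)))
    keep_idx = np.argsort(scores)[-n_keep:]  # indices of the highest scores
    keep_idx.sort()                          # preserve original token order
    return tokens[keep_idx], keep_idx

# usage: prune 100 visual tokens down to 15 (an 85% reduction)
tokens = np.random.randn(100, 8)
scores = np.random.rand(100)
pruned, idx = prune_tokens(tokens, scores, keep_ratio=0.15)
```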
Paradigm Shift
Introduces a CNN architecture where feature maps are mathematically identical to Grad-CAM saliency maps by design, rather than post-hoc.
Open Release
Releases weights for LEMON, a foundation model for single-cell nuclear morphology trained on millions of pathology images.
New Capability
A decentralized system that automates ML research and trains domain-expert 1.58-bit ternary models for CPU-native inference.
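The 1.58-bit figure comes from ternary weights (log2(3) ≈ 1.58 bits per weight). The widely used absmean scheme from BitNet b1.58 snaps each weight to {-1, 0, +1}; a sketch of that generic scheme, which may differ from this paper's exact recipe:

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Absmean ternary quantization: w ~ scale * q with q in {-1, 0, +1}."""
    scale = np.mean(np.abs(w)) + 1e-8        # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)  # snap to the nearest ternary level
    return q.astype(np.int8), scale

w = np.array([0.9, -0.05, 0.4, -1.2])
q, s = ternary_quantize(w)  # q == [1, 0, 1, -1]
```

Ternary weights turn matrix multiplies into additions and subtractions, which is what makes CPU-native inference attractive.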
Efficiency Breakthrough
Extracts dense 3D Signed Distance Fields from images in under 3 seconds using feed-forward geometry transformer latents.
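A signed distance field maps each 3D point to its distance from the nearest surface, with negative values inside. For a sphere the field has a closed form, which makes a handy sanity check; the paper's transformer predicts such fields from images, while this toy only illustrates the data structure itself:

```python
import numpy as np

def sphere_sdf(points: np.ndarray, center: np.ndarray, radius: float) -> np.ndarray:
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return np.linalg.norm(points - center, axis=-1) - radius

pts = np.array([[0.0, 0.0, 0.0],   # at the center  -> -radius
                [1.0, 0.0, 0.0],   # on the surface -> 0
                [2.0, 0.0, 0.0]])  # outside        -> +1
d = sphere_sdf(pts, np.zeros(3), 1.0)  # -> [-1.0, 0.0, 1.0]
```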
Scaling Insight
Uses the Minimum Description Length principle to predict exactly when neural networks will transition from simple 'spurious' shortcuts to complex features.
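The MDL criterion referenced here is the classic two-part code: a hypothesis wins when the bits needed to describe it plus the bits needed to describe the data given it are minimal. In standard form (the paper's exact scoring function may differ):

```latex
\underbrace{L(H)}_{\text{bits to encode the hypothesis}}
\;+\;
\underbrace{L(D \mid H)}_{\text{bits to encode the data given } H}
\;\longrightarrow\; \min_{H}
```

A 'spurious' shortcut has small $L(H)$ but large $L(D \mid H)$; the predicted transition is the point where a complex feature's total code length first drops below the shortcut's.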
New Capability
Modulates LLM hidden states with eye-gaze data to outperform GPT-4o by 10.5 points on streaming video understanding.
Breaks Assumption
Proves that safety probes can detect 'liars' (models hiding harm) but are fundamentally blind to 'fanatics' (models that believe harm is good).
Efficiency Breakthrough
Parallelizes diffusion model sampling across multiple devices using a draft-and-refine process for up to 3.7x speedups.
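The paper's exact scheme isn't reproduced here, but one standard way to parallelize an inherently sequential sampler is fixed-point (Picard) refinement: draft a whole trajectory, evaluate every step's update in parallel from the draft, and sweep until it matches the sequential result. A toy sketch with a stand-in deterministic "denoiser" step:

```python
import numpy as np

def step(x, t):
    """Toy deterministic update standing in for one denoising step."""
    return x + 0.1 * np.cos(x + t)

def sequential_sample(x0, n_steps):
    x = x0
    for t in range(n_steps):
        x = step(x, t)
    return x

def parallel_sample(x0, n_steps, n_sweeps):
    """Draft a trajectory, then refine all steps at once per sweep."""
    traj = np.full(n_steps + 1, x0)  # crude draft: constant trajectory
    for _ in range(n_sweeps):
        # every step() call reads the old trajectory -> all are independent
        updates = np.array([step(traj[t], t) for t in range(n_steps)])
        traj[1:] = updates
    return traj[-1]
```

Each sweep fixes at least one more step exactly, so after `n_steps` sweeps the result equals the sequential sampler; the speedup comes from stopping after far fewer sweeps once the refinement has converged.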
Paradigm Shift
Shifts world model evaluation from visual fidelity to 'Simulative Reasoning,' revealing a massive gap in current AI's ability to plan.
Paradigm Shift
Learns high-level symbolic state machines directly from raw pixels to guide robot control without hand-crafted priors.
Breaks Assumption
Resolves a long-standing open problem in bandit theory by achieving optimal dynamic regret without knowing the number of environment switches.
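For context, dynamic regret in a piecewise-stationary bandit measures loss against the per-round best arm. In standard notation (illustrative, not quoted from the paper):

```latex
R_T^{\mathrm{dyn}}
\;=\;
\sum_{t=1}^{T}\Bigl(\max_{a}\,\mu_t(a)\;-\;\mathbb{E}\bigl[\mu_t(a_t)\bigr]\Bigr),
\qquad
R_T^{\mathrm{dyn}} \;=\; \tilde{\Theta}\!\bigl(\sqrt{S\,T}\,\bigr)
\ \text{with } S \text{ switches, arm count fixed.}
```

The long-open question was matching that rate when $S$ is unknown to the learner in advance.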
Efficiency Breakthrough
Introduces a discrete-ratio selector for context compression that solves the problem of variable information density in long-form text.
New Capability
Fixes physically impossible video generation by disentangling semantic prompts from physical dynamics during training.
Efficiency Breakthrough
Achieves state-of-the-art video understanding without the need for expensive human-annotated Chain-of-Thought (CoT) data.
Breaks Assumption
Shows that conventional-wisdom techniques such as Chain-of-Thought and few-shot prompting actually degrade performance in specialized medical LLMs.
Open Release
The first large-scale benchmark for LLM agents based on years of authentic, cross-domain user behavioral data rather than synthetic personas.
Paradigm Shift
Demonstrates that symbolic event primitives (like Schank's Conceptual Dependency) can be 'rediscovered' by neural networks purely through compression pressure.
Efficiency Breakthrough
Releases a composable, Optax-native stack that makes high-overhead second-order optimization methods (like K-FAC) practical and swappable.
Scaling Insight
A billion-scale time-series benchmark that identifies a 'context-length crossover' where foundation models start to crush deep learning baselines.
Efficiency Breakthrough
Introduces a self-driven collaboration paradigm where an agent uses its own 'reflection' signals to escalate difficult tasks to a stronger model tier.
Scaling Insight
Challenges the assumption that 'background' pixels are useless in GUI agents and identifies a 'recency effect' for optimal token pruning.
Paradigm Shift
Identifies specific hidden-state dimensions (H-Nodes) responsible for hallucinations and introduces a real-time defense to cancel them.
New Capability
Integrates radiologist gaze data as a probabilistic prior to align vision-language models with actual human clinical reasoning workflows.
Paradigm Shift
Moves industrial recommendation systems from static multi-stage pipelines to self-evolving agentic loops.
Breaks Assumption
Finds that while frontier LLMs can model the mental states of others, they fundamentally fail at self-modeling without explicit reasoning steps.
New Capability
Introduces ReinPatch, the first framework to jointly optimize sequence tokenization and backbone models using reinforcement learning.
Breaks Assumption
Discovers that object-centric information in Vision Transformers is distributed across all attention components (q, k, v) and layers, not just the final layer.
Open Release
Releases DataFlex, a unified open-source framework for data-centric dynamic training (selection, mixture, and reweighting) for LLMs.
Breaks Assumption
Proves that image denoisers can be strictly contractive (robust to noise) without sacrificing state-of-the-art restoration quality.
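Strict contractivity means the denoiser's Lipschitz constant is below 1, so input perturbations shrink rather than amplify. For a linear denoiser this reduces to a spectral-norm check, estimable by power iteration; this is a generic sanity check, not the paper's construction, and the matrix below is an arbitrary example:

```python
import numpy as np

def spectral_norm(A: np.ndarray, n_iters: int = 100) -> float:
    """Estimate the largest singular value of A by power iteration."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[1])
    for _ in range(n_iters):
        v = A.T @ (A @ v)          # one step of power iteration on A^T A
        v /= np.linalg.norm(v)
    return float(np.linalg.norm(A @ v))

W = np.array([[0.5, 0.2],
              [0.1, 0.6]])         # toy linear "denoiser"
L = spectral_norm(W)               # < 1, so the map is strictly contractive
```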
Paradigm Shift
Empirically proves that AI Scientist agents can genuinely learn from physical experimental feedback via in-context learning.
New Capability
Moves coding agents from passive execution to proactive collaboration by teaching them when to ask for clarification on underspecified tasks.
New Capability
Provides mechanistic evidence that LLMs internalize 'vibes' (informal registers like slang) as language-agnostic abstractions that can be causally steered.
New Capability
Enables GUI agents to overcome domain bias by autonomously 'watching' web tutorial videos to learn specific software workflows without retraining.
New Capability
Introduces a label-free, output-agnostic method for merging LoRA modules across heterogeneous tasks like classification and regression.
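How the label-free merge works isn't detailed here, but the object being merged is standard LoRA: each module contributes a low-rank update ΔW = (α/r)·B·A, and the simplest baseline merge is a weighted sum of those deltas on the shared base weight. A sketch of that baseline (coefficients are an illustrative assumption, not the paper's method):

```python
import numpy as np

def lora_delta(A: np.ndarray, B: np.ndarray, alpha: float, rank: int) -> np.ndarray:
    """Low-rank update contributed by one LoRA module: (alpha / r) * B @ A."""
    return (alpha / rank) * (B @ A)

def naive_merge(W0, modules, coeffs):
    """Weighted sum of LoRA deltas applied to the shared base weight W0."""
    W = W0.copy()
    for (A, B, alpha, r), c in zip(modules, coeffs):
        W += c * lora_delta(A, B, alpha, r)
    return W

rng = np.random.default_rng(0)
d, r = 6, 2
W0 = rng.standard_normal((d, d))
mods = [(rng.standard_normal((r, d)), rng.standard_normal((d, r)), 2.0, r)
        for _ in range(2)]
W = naive_merge(W0, mods, coeffs=[0.5, 0.5])
```

Merging two rank-2 modules perturbs the base weight by at most rank 4, which is what keeps merged adapters cheap.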
Paradigm Shift
Replaces standard autoregressive action generation in robot VLAs with iterative refinement via discrete flow matching.
Breaks Assumption
Reveals that spatial reasoning in LLMs is not driven by robust internal world models, but by fragmented and transient representations.
New Capability
Enables verification of claimed text-to-image models through boundary-aware prompts that trigger model-specific instability.
Breaks Assumption
Identifies that the 'reasoning tax' in vision-language fine-tuning is caused by lost access to depth-wise representations and fixes it with a lightweight adapter.
New Capability
Boosts multimodal reasoning by teaching models to autonomously verify their own long-form generations against image evidence using information gain.
Efficiency Breakthrough
Achieves 16x prefill speedup for video models by using reinforcement learning to dynamically compress visual tokens based on temporal 'surprise'.
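'Surprise' here plausibly means frame-to-frame change; a toy version keeps only tokens whose embedding moved more than a threshold since the previous frame. The threshold and distance metric are assumptions, and the paper learns its compression policy with RL rather than a fixed rule:

```python
import numpy as np

def surprise_mask(frames: np.ndarray, threshold: float) -> np.ndarray:
    """frames: (T, N, D) token embeddings per frame.

    Returns a (T, N) bool mask keeping tokens that changed enough
    versus the previous frame; the first frame is always kept in full.
    """
    delta = np.linalg.norm(frames[1:] - frames[:-1], axis=-1)  # (T-1, N)
    return np.concatenate([np.ones((1, frames.shape[1]), bool),
                           delta > threshold])

frames = np.zeros((3, 4, 2))
frames[1, 0] = 5.0  # only token 0 moves at t=1 (and back at t=2)
mask = surprise_mask(frames, threshold=1.0)
```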
Scaling Insight
An 800 Hz data glove reveals that human hand dexterity contains critical high-frequency motion energy (>100 Hz) previously invisible to standard sensors.
Breaks Assumption
Reveals that reasoning models frequently acknowledge misleading hints in their 'thinking' tokens but hide that influence in their final visible answers.