Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Scaling Insight
Proves that causal representation learning is possible with far fewer environments than previously assumed, even when intervention targets are unknown.
New Capability
A model-agnostic framework that uses synthetic sampling to provide statistically valid uncertainty quantification and hallucination detection for multimodal models.
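A minimal sketch of the split-conformal recipe such samplers typically build on (the agreement score and numbers below are assumptions, not this paper's method): score an answer by how often resampled generations agree with it, calibrate a cutoff on held-out examples with known-correct answers, then flag test-time answers that fall outside it.

```python
import numpy as np

def conformal_cutoff(cal_nonconf: np.ndarray, alpha: float = 0.1) -> float:
    """Split-conformal quantile: with probability >= 1 - alpha, a fresh
    correct answer's nonconformity score lands at or below this cutoff."""
    n = len(cal_nonconf)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(cal_nonconf, level, method="higher"))

# Nonconformity = 1 - agreement rate among synthetic resamples (assumed score).
rng = np.random.default_rng(0)
cal_scores = 1.0 - rng.beta(8, 2, size=500)  # held-out, known-correct answers
cutoff = conformal_cutoff(cal_scores, alpha=0.1)

test_score = 1.0 - 0.35            # only 35% of resamples agree with this answer
is_flagged = test_score > cutoff   # statistically valid hallucination flag
```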
Nature Is Weird
We just built a computer chip that acts like a human brain, but it processes info 10,000 times faster than the one in your head.
Practical Magic
Scientists are fixing city-wide traffic jams by treating every car like a quantum particle that can take every possible route at the exact same time.
Practical Magic
The same software tricks that let massive video games like World of Warcraft handle thousands of players at once are now being used to design spaceships.
Paradigm Shift
Shifts AI evaluation from static benchmarks to interactive agentic environments requiring fluid adaptation.
New Capability
Moves medical AI from simplified 2D image classification to agents navigating full 3D clinical studies.
New Capability
Enables semantically precise model editing directly in the weight space without any training data.
Efficiency Breakthrough
Achieves 6x compute reduction in Multimodal LLMs while actually improving accuracy by 2%.
Efficiency Breakthrough
Collapses an entire Spiking Neural Network into a single neuron via temporal multiplexing.
Breaks Assumption
Formalizes random cropping as a source of differential privacy, offering 'free' privacy amplification.
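The claim maps onto the textbook amplification-by-subsampling bound; a sketch under the (assumed) simplification that a random crop behaves like subsampling each record with probability q:

```python
import math

def amplified_epsilon(eps: float, q: float) -> float:
    """Amplification by subsampling: an eps-DP mechanism applied to a
    region that includes each record with probability q is eps'-DP with
    eps' = log(1 + q * (exp(eps) - 1)), roughly q * eps for small eps."""
    return math.log(1.0 + q * (math.exp(eps) - 1.0))

# Hypothetical numbers: a crop covering 25% of the input (q = 0.25)
# ahead of a 1.0-DP training step.
print(amplified_epsilon(1.0, 0.25))  # ~0.36: amplification 'for free'
```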
New Capability
Estimates lab-grade 3D musculoskeletal forces from a single smartphone video.
Paradigm Shift
Provides the first formal proof and verification framework for agent-tool integration protocols.
Paradigm Shift
Demonstrates that visual hierarchies require Lorentzian causal structure rather than Euclidean space.
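For readers unfamiliar with the term: the Lorentzian (Minkowski) inner product differs from the Euclidean one only in the sign of the time-like coordinate, and that sign is what induces a causal partial order (a standard definition, not a detail taken from the paper):

```latex
\langle x, y \rangle_{\mathcal{L}} = -x_0 y_0 + \sum_{i=1}^{d} x_i y_i,
\qquad
x \preceq y \;\iff\; \langle y - x,\, y - x \rangle_{\mathcal{L}} \le 0
\ \text{ and }\ y_0 \ge x_0
```

The partial order $\preceq$ supplies exactly the ancestor relation a hierarchy needs, whereas a Euclidean metric admits no such order.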
Paradigm Shift
Proves that Transformers can internalize complex search algorithms like MCTS directly into their weights.
Efficiency Breakthrough
Introduces a stable backpropagation-free training framework for physical and photonic neural networks.
Efficiency Breakthrough
Achieves state-of-the-art vision-language pretraining using 300x less data than leading methods.
Efficiency Breakthrough
Enables 10x faster robot trajectory generation by distilling diffusion models into movement primitives.
Scaling Insight
Reveals that synthetic rewriting is a quality multiplier for high-grade data, but fails to fix low-quality source data.
Breaks Assumption
Proves that stereo matching can reach state-of-the-art performance without the computationally heavy cost volumes used by almost all modern methods.
Efficiency Breakthrough
Speeds up RL-based reasoning training by 1.7x using an online quality head to prune failing rollouts mid-generation.
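A minimal sketch of the mechanism with stand-in names (the paper's actual head architecture and pruning schedule are not specified here): a lightweight head scores each partial rollout's hidden state, and rollouts predicted to fail are stopped mid-generation so their compute goes to the survivors.

```python
import torch

def prune_rollouts(hidden_states: torch.Tensor,
                   quality_head: torch.nn.Module,
                   threshold: float) -> torch.Tensor:
    """hidden_states: (num_rollouts, hidden_dim) summaries of partial
    generations; returns indices of rollouts worth continuing."""
    with torch.no_grad():
        scores = quality_head(hidden_states).squeeze(-1)  # predicted P(success)
    return (scores >= threshold).nonzero(as_tuple=True)[0]

# Toy usage: 8 in-flight rollouts with 16-dim state summaries; continue
# only those the head scores at 0.3 or above.
head = torch.nn.Sequential(torch.nn.Linear(16, 1), torch.nn.Sigmoid())
keep = prune_rollouts(torch.randn(8, 16), head, threshold=0.3)
```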
Paradigm Shift
Introduces a multi-answer RL objective that trains models to represent a distribution of valid answers in a single forward pass.
Breaks Assumption
Proves platform determinism is necessary for trustworthy AI and implements an integer-only engine for bitwise-identical inference across ARM and x86.
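To make "bitwise identical" concrete, a toy sketch of the integer-only idea (an illustrative quantization scheme, not the paper's engine): int8 operands with int32 accumulation are exact, so every platform produces the same bits, unlike floating point where rounding and FMA behavior can differ across ISAs.

```python
import numpy as np

def quantize(x: np.ndarray, scale: float) -> np.ndarray:
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def int_linear(x_q: np.ndarray, w_q: np.ndarray) -> np.ndarray:
    # int8 x int8 -> int32 accumulation: exact integer math, so the
    # output bits cannot depend on the host ISA.
    return x_q.astype(np.int32) @ w_q.astype(np.int32).T

rng = np.random.default_rng(0)
x_q = quantize(rng.standard_normal((1, 64)), scale=0.05)
w_q = quantize(rng.standard_normal((32, 64)), scale=0.05)
logits_q = int_linear(x_q, w_q)  # identical bits on ARM and x86
```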
New Capability
Quantifies near-verbatim data extraction risk in LLMs at 1/5000th the computational cost of standard Monte Carlo methods.
New Capability
Enables graph-based retrieval and reranking for RAG without the maintenance overhead of a knowledge graph.
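One plausible instantiation of KG-free graph reranking (the propagation rule is an assumption): link chunks by embedding similarity and diffuse the retriever's scores over that graph, personalized-PageRank style, so chunks adjacent to strong hits get promoted without any curated graph to maintain.

```python
import numpy as np

def graph_rerank(scores: np.ndarray, emb: np.ndarray,
                 alpha: float = 0.15, iters: int = 20) -> np.ndarray:
    sim = np.maximum(emb @ emb.T, 0.0)  # chunk-chunk similarity graph
    np.fill_diagonal(sim, 0.0)
    P = sim / np.clip(sim.sum(axis=1, keepdims=True), 1e-9, None)
    seed = scores / scores.sum()        # retriever scores as the seed
    r = seed.copy()
    for _ in range(iters):              # personalized-PageRank diffusion
        r = alpha * seed + (1 - alpha) * P.T @ r
    return r

rng = np.random.default_rng(0)
emb = rng.standard_normal((100, 32))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
base = rng.random(100)                  # initial retriever scores
top5 = np.argsort(-graph_rerank(base, emb))[:5]  # graph-aware top-5
```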
Breaks Assumption
Reduces visual tokens in robot policies by 78% by using inter-layer rank consistency instead of simple attention magnitude.
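A hedged sketch of the selection rule as described (the exact scoring is an assumption): rank each visual token's attention mass within every layer, then keep tokens whose rank is both high and stable across layers, rather than tokens that spike in a single layer.

```python
import torch

def consistent_tokens(attn_mass: torch.Tensor, keep: int) -> torch.Tensor:
    """attn_mass: (num_layers, num_tokens) attention received per token."""
    ranks = attn_mass.argsort(dim=1).argsort(dim=1).float()  # within-layer rank
    score = ranks.mean(dim=0) - ranks.std(dim=0)  # high mean, low variance
    return score.topk(keep).indices

attn_mass = torch.rand(24, 576)  # e.g. 24 layers, 576 visual tokens
keep_idx = consistent_tokens(attn_mass, keep=int(576 * 0.22))  # prune ~78%
```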
Breaks Assumption
This paper demonstrates that the order of training examples alone can encode information not present in any individual example, allowing models to bypass established sample complexity bounds.
Scaling Insight
A systematic study reveals that grokking is not an architectural property of Transformers but an interaction between weight decay and optimization stability.
Paradigm Shift
The 'Reasoning Contamination Effect' shows that Chain-of-Thought (CoT) reasoning actually disrupts a model's internal confidence signal, leading to poorer calibration.
Breaks Assumption
Large Language Models process instructions as social acts rather than technical specifications, making 'imperative mood' prompts behave inconsistently across different languages.
New Capability
GeoNDC introduces a queryable neural data cube that compresses 20 years of planetary satellite data by 95x while allowing on-demand continuous-time reconstruction.
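In spirit, a neural data cube is a coordinate network queried in place of a stored raster; a minimal sketch (architecture and normalization are assumptions, not GeoNDC's):

```python
import torch

# f(lat, lon, t) -> pixel value: the weights *are* the compressed archive,
# and any continuous timestamp can be reconstructed on demand.
cube = torch.nn.Sequential(
    torch.nn.Linear(3, 256), torch.nn.SiLU(),
    torch.nn.Linear(256, 256), torch.nn.SiLU(),
    torch.nn.Linear(256, 1),  # one band; a real cube emits many
)

# Query a timestamp that falls between satellite passes (coords normalized
# to [-1, 1]; t = 0.556 of the archive's span).
coords = torch.tensor([[0.31, -0.72, 0.556]])
value = cube(coords)          # continuous-time reconstruction
```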
Efficiency Breakthrough
Sparton is a specialized Triton kernel that solves the massive memory bottleneck of Learned Sparse Retrieval (LSR) models like Splade.
New Capability
Intern-S1-Pro is the first trillion-parameter scientific multimodal foundation model, outperforming proprietary models on specialized scientific reasoning.
New Capability
AirVLA successfully transfers manipulation-trained Vision-Language-Action (VLA) models to underactuated aerial robots using a payload-aware guidance mechanism.
Paradigm Shift
R1Sim applies the 'Reasoning-RL' paradigm (popularized by DeepSeek-R1) to traffic simulation, achieving superior safety and diversity in multi-agent behaviors.
Paradigm Shift
SIGMA resolves 'trajectory divergence' in molecular string generation by enforcing geometric symmetry recognition through contrastive learning.
Efficiency Breakthrough
A fully differentiable agent-based traffic simulator enables calibration and control of million-vehicle networks 173x faster than real-time.
Efficiency Breakthrough
GIFT is a training-free frame selection framework that uses 'Directed Diversity' to boost Video-LLM performance by up to 12.5%.
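'Directed Diversity' reads like a query-steered diversity criterion; the sketch below uses the classic maximal-marginal-relevance trade-off as a stand-in (the paper's actual score may differ): greedily pick frames relevant to the query but dissimilar to frames already chosen.

```python
import numpy as np

def select_frames(frame_emb: np.ndarray, query_emb: np.ndarray,
                  k: int, lam: float = 0.5) -> list[int]:
    chosen: list[int] = []
    for _ in range(k):
        best, best_score = -1, -np.inf
        for i in range(len(frame_emb)):
            if i in chosen:
                continue
            rel = float(frame_emb[i] @ query_emb)       # directed: toward query
            div = max((float(frame_emb[i] @ frame_emb[j])
                       for j in chosen), default=0.0)   # redundancy penalty
            score = lam * rel - (1 - lam) * div
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen

rng = np.random.default_rng(0)
frames = rng.standard_normal((128, 64))
frames /= np.linalg.norm(frames, axis=1, keepdims=True)
query = frames[:4].mean(axis=0)
picked = select_frames(frames, query, k=8)  # 8 frames fed to the Video-LLM
```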
New Capability
Z-Erase introduces the first concept erasure method for single-stream diffusion transformers, preventing generation collapse in new unified architectures.
Breaks Assumption
This paper demonstrates that Sparse Autoencoder (SAE) features in multimodal models are not modular, challenging the core assumption of intervention-based steering.
Paradigm Shift
Pixelis shifts VLM reasoning from static description to a 'reasoning in pixels' agentic paradigm that learns via an executable tool grammar.
Paradigm Shift
The AE4E paradigm proposes a 'Social Contract' for multi-agent economies, replacing individual model alignment with an institutional 'Separation of Powers'.
Scaling Insight
MSRL scales multimodal reward modeling by transferring reasoning capabilities from text to vision-language tasks without requiring new multimodal preference data.
New Capability
SEVerA enables the synthesis of self-evolving agents with formal guarantees by combining LLM planning with first-order logic rejection samplers.
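The guarantee mechanism is easy to state in miniature (toy stand-ins for the LLM proposer and the first-order checker): candidates are sampled freely, but only plans that every logical constraint accepts survive, so the guarantee holds by construction.

```python
import random
from typing import Callable, Optional

Plan = list[str]

def synthesize(propose: Callable[[], Plan],
               constraints: list[Callable[[Plan], bool]],
               max_tries: int = 1000) -> Optional[Plan]:
    for _ in range(max_tries):
        plan = propose()  # in the paper, an LLM planner proposes here
        if all(check(plan) for check in constraints):
            return plan   # accepted: constraint-safe by construction
    return None           # budget exhausted, no valid plan found

# Toy usage: plans must open a resource before using it.
actions = ["open", "use", "close"]
propose = lambda: random.choices(actions, k=3)
opens_before_use = lambda p: "use" not in p or (
    "open" in p and p.index("open") < p.index("use"))
print(synthesize(propose, [opens_before_use]))
```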
Paradigm Shift
Using Signal Detection Theory, this work proves that LLM calibration and 'metacognitive efficiency' (knowing what you know) are distinct, dissociable capacities.
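The dissociation is easy to see with standard metrics (common stand-ins, not the paper's exact SDT estimator): expected calibration error asks whether confidence matches accuracy in absolute terms, while AUROC asks whether confidence ranks correct answers above wrong ones, and a model can excel at one while failing the other.

```python
import numpy as np

def ece(conf: np.ndarray, correct: np.ndarray, bins: int = 10) -> float:
    """Expected calibration error: |accuracy - mean confidence| per bin."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf >= lo) & (conf < hi)
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return total

def auroc(conf: np.ndarray, correct: np.ndarray) -> float:
    """Rank-based AUROC: P(conf on a correct answer > conf on a wrong one)."""
    pos, neg = conf[correct == 1], conf[correct == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return float(greater + 0.5 * ties)

# A model that ranks perfectly but is uniformly overconfident: AUROC = 1.0
# yet calibration is poor -- the two capacities dissociate.
correct = np.array([1, 1, 1, 0, 0, 0])
conf = np.array([0.99, 0.98, 0.97, 0.90, 0.89, 0.88])
print(auroc(conf, correct), ece(conf, correct))
```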
Efficiency Breakthrough
Photon enables efficient 3D medical volume understanding through adaptive token scheduling and a novel 'gradient restoration' backpropagation rule.
Paradigm Shift
Vision Hopfield Memory Networks (V-HMN) present a brain-inspired alternative to Transformers and Mamba using hierarchical associative memory mechanisms.
New Capability
Trace2Skill distills lessons from across a 'parallel fleet' of execution trajectories into a unified, conflict-free skill directory for LLM agents.
Efficiency Breakthrough
Pruning low-utility prompts before RL rollouts allows for 10x more efficient training of large reasoning models.
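One well-known reason this works (our framing, not necessarily the paper's utility measure): under group-relative objectives like GRPO, prompts the model always solves or always fails produce zero advantage and therefore no gradient, so rollout budget spent on them is wasted.

```python
import numpy as np

def useful_prompts(pass_rates: np.ndarray,
                   low: float = 0.05, high: float = 0.95) -> np.ndarray:
    """Keep prompts whose estimated pass rate is neither ~0 nor ~1;
    only these yield non-zero group-relative advantages."""
    pass_rates = np.asarray(pass_rates)
    return np.nonzero((pass_rates > low) & (pass_rates < high))[0]

# Toy usage: drop the saturated prompts before spending rollouts on them.
print(useful_prompts(np.array([0.0, 0.5, 1.0, 0.2, 0.98])))  # -> [1 3]
```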
Breaks Assumption
Safety alignment does not have to be a 'tax' on performance; it can actually improve mathematical reasoning accuracy.