Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.
Breaks Assumption
Reveals that the tight architectural coupling of image generation and understanding in unified models creates a new class of reciprocal safety vulnerabilities.
Paradigm Shift
Introduces a vision model testbed that aligns AI visual attention (scanpaths) with human gaze without sacrificing classification accuracy.
Scaling Insight
Shows that standard task-completion benchmarks fail to distinguish agent capabilities and proposes 'Working Memory Fidelity' as a more predictive metric.
Open Release
The first self-supervised, domain-agnostic model for LiDAR ground segmentation, eliminating the need for per-sensor manual labeling.
New Capability
A production-grade framework that converts LLM/RAG evaluation into a deployment decision workflow using Pareto frontiers and CI gates.
Paradigm Shift
Collapses the standard vision backbone-plus-decoder architecture into a single early-fusion Transformer stack for both perception and task modeling.
Paradigm Shift
Couples visual representations directly to the RL optimization process (RLVR) for vision-language models through a structured reward-reweighting mechanism.
Efficiency Breakthrough
A unified framework for neural network recombination that achieves state-of-the-art fine-tuning with fewer than 200 parameters.
New Capability
Enables Active Learning for tabular data without model retraining by iteratively optimizing the 'labeled context' of foundation models.
Breaks Assumption
Harmful intent in LLMs can be detected geometrically even after safety 'refusal' mechanisms have been surgically removed.
Breaks Assumption
For LLM-driven optimization, complex meta-heuristics like simulated annealing are unnecessary; simple greedy hill climbing is a superior default.
Scaling Insight
Mathematical proof that LayerNorm structurally reduces model complexity compared to RMSNorm due to its mean-centering geometry.
Paradigm Shift
Proposes 'Amdahl’s Law for AI,' proving that human effort in AI-assisted work is bottlenecked by the fraction of 'novel' tasks rather than agent capability.
New Capability
Lie Generator Networks enable linear system identification with guaranteed physical stability and dissipation by construction rather than through loss penalties.
Efficiency Breakthrough
GIFT bootstraps image-to-CAD generation by turning inference-time failures into synthetic training data, reducing inference compute by 80%.
Open Release
A modular, JAX-based framework and taxonomy for Reinforcement Learning with Diffusion and Flow policies.
New Capability
Achieves high-quality 3D reconstruction and camera pose estimation from sparse views without any pre-trained priors or ground-truth annotations.
Efficiency Breakthrough
Near-lossless KV cache compression using angular quantization in the Walsh-Hadamard domain at ~3.5 bits per element.
Breaks Assumption
Mechanistic analysis reveals that over-refusal and harmful-intent refusal in LLMs occupy distinct representation subspaces.
New Capability
Introduces 'Hidden Ads,' a new class of semantic backdoor attacks that inject promotional content into VLM responses based on natural user behavior.
Paradigm Shift
Shifts protein fitness optimization from continuous embeddings to discrete Quadratic Unconstrained Binary Optimization (QUBO).
Paradigm Shift
Introduces LongCat-Next, a 'Native Multimodal' model that treats vision and audio as first-class discrete tokens rather than language-centric attachments.
New Capability
Achieves zero-shot, prompt-free object removal in diffusion models purely through self-attention manipulation.
New Capability
VoxAnchor uses mmWave radar to authenticate speech by matching acoustics to physical throat vibrations.
New Capability
RAGent enables training-free, deployment-time human activity recognition for mmWave radar using agentic reasoning.
Paradigm Shift
Proposes SOL-Nav, which replaces raw visual features in navigation with structured language descriptions for LLM-based agents.
New Capability
Bridges the gap between free-form natural language and safety-critical UAV navigation using Signal Temporal Logic (STL) translation and repair.
Paradigm Shift
Sci-Mind introduces an 'Adversarial Cognitive Dialectic' where specialized agents debate to refine mathematical models.
Efficiency Breakthrough
Achieves a 79,000x reduction in energy per inference for insulin dose calculation using Spiking Neural Networks (SNNs).
Paradigm Shift
Introduces 'Umwelt Engineering,' the deliberate constraint of an agent's linguistic environment to improve reasoning.
Breaks Assumption
PRBench reveals that current top-tier coding agents have a 0% success rate in end-to-end physics paper reproduction.
Paradigm Shift
Introduces Composer, a paradigm that generates input-specific parameter adaptations at inference time to enable dynamic per-input model specialization.
Open Release
Kuaishou releases KAT-Coder-V2, an agentic coding model achieving state-of-the-art results on SWE-bench Verified through a 'Specialize-then-Unify' paradigm.
Scaling Insight
Provides empirical evidence and a mechanistic explanation for why LoRA drastically reduces catastrophic forgetting in sequential fine-tuning compared to full fine-tuning.
New Capability
TianJi is the first 'AI meteorologist' system capable of autonomously driving complex numerical models to verify physical hypotheses in atmospheric science.
Scaling Insight
A controlled study showing that the temporal organization (curriculum) of multimodal data is a first-order variable in balancing reasoning and OCR capabilities.
Paradigm Shift
SkyNet extends MuZero to partially-observable stochastic games by adding auxiliary belief-aware heads, significantly outperforming baselines in complex card games.
New Capability
Heracles uses a state-conditioned diffusion middleware to bridge precise motion tracking with generative recovery for humanoid robots.
New Capability
Sortify is the first fully autonomous LLM agent deployed in production for closed-loop recommendation ranking optimization.
New Capability
AutoStan demonstrates a CLI coding agent that autonomously builds and iteratively improves interpretable Bayesian models in Stan.
Breaks Assumption
Identifies emergent social risks in multi-agent systems, such as spontaneous collusion and conformity, that arise even when agents are not explicitly instructed to behave that way.
Efficiency Breakthrough
Uses spectral decomposition of inverse dynamics to enable real-time planning of long-horizon robotic manipulation tasks (10+ contact modes).
New Capability
Introduces SCOUT, a routing framework that selects which image-to-3D reconstruction model to use based on input difficulty and cost constraints.
New Capability
GraySense enables geospatial object tracking using only encrypted network packet sizes without any access to raw video streams.
Efficiency Breakthrough
KVSculpt moves beyond simple eviction/merging to optimize unconstrained KV pairs in continuous space for extreme cache compression.
Breaks Assumption
A rigorous analysis of the AIMO 3 math competition reveals that raw model capability dominates inference-time prompt optimization by an order of magnitude.
New Capability
Wan-R1 successfully applies Group Relative Policy Optimization (GRPO) to flow-based video models to enable verifiable spatial reasoning.
Scaling Insight
The eigenvalue tail index of a neural network's weight matrices serves as a near-perfect (R^2 = 0.984) diagnostic for label noise in the training data.
New Capability
Poppy provides a training-free way to refine monocular surface normals using single-shot polarization measurements at test time.
Efficiency Breakthrough
SAGE mitigates multimodal hallucinations by monitoring 'attention sinks' and dynamically modulating self-attention during the decoding process.