Papers on machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI where the core contribution is computational intelligence.
Efficiency Breakthrough
Generates novel, structurally plausible protein sequences from small alignments using a training-free stochastic attention mechanism on a standard laptop.
Breaks Assumption
Explicit identity framing is not necessary and may be inferior for low-data LoRA safety fine-tuning.
Paradigm Shift
Gauge-equivariant neural operators enable discretization-invariant and geometry-consistent solving of complex PDEs.
Efficiency Breakthrough
Adaptive computation for multimodal LLMs drastically reduces compute waste on easy cases while focusing on hard ones.
Breaks Assumption
BrainBench exposes a significant gap between LLM benchmark performance and genuine commonsense reasoning.
New Capability
A training-free operator for streaming 3D reconstruction reduces geometric drift using Grassmannian manifolds.
Paradigm Shift
POLCA uses LLMs as stochastic optimizers with theoretical convergence guarantees for complex system-level tasks.
New Capability
DynaAvatar achieves zero-shot 3D human reconstruction from a single image with motion-dependent cloth dynamics.
Efficiency Breakthrough
HO-SFL enables backprop-free fine-tuning on edge devices without the convergence penalty typical of zeroth-order methods.
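The zeroth-order family that HO-SFL is positioned against estimates gradients from forward evaluations alone, which is what makes backprop-free edge fine-tuning possible at all. A generic SPSA-style sketch of that baseline idea, not the paper's method; all names and defaults are illustrative:

```python
import numpy as np

def zeroth_order_grad(loss, theta, eps=1e-3, n_samples=8, seed=0):
    """Backprop-free gradient estimate via random central-difference
    perturbations. Only forward evaluations of `loss` are needed."""
    rng = np.random.default_rng(seed)
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        u = rng.standard_normal(theta.shape)
        g += (loss(theta + eps * u) - loss(theta - eps * u)) / (2 * eps) * u
    return g / n_samples

loss = lambda th: float((th ** 2).sum())    # toy quadratic, true grad = 2*theta
theta = np.array([1.0, -2.0])
g = zeroth_order_grad(loss, theta)          # noisy estimate of [2, -4]
```

The convergence penalty the summary mentions comes from the variance of this estimator, which grows with parameter dimension.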
Paradigm Shift
Agent architectures require an explicit epistemic control layer to route questions between incompatible reasoning frameworks.
Efficiency Breakthrough
RAZOR provides a lightweight, targeted unlearning framework for Transformers and Diffusion models without retraining.
Breaks Assumption
Demonstrates that safety and utility in LVLMs are not inherently antagonistic and can be simultaneously improved through inference-time projection.
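The inference-time projection alluded to can be illustrated concretely: subtract the component of a hidden activation that lies along a learned "unsafe" direction, leaving the rest of the representation untouched. A minimal sketch; the single direction `v` and the vector setup are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def project_out(h, v):
    """Remove the component of activation h along direction v
    (e.g. a learned 'unsafe' direction) at inference time."""
    v = v / np.linalg.norm(v)
    return h - (h @ v) * v

h = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
safe_h = project_out(h, v)   # component along v is exactly zeroed
```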
Scaling Insight
Provides the first theoretical proof that dataset distillation efficiently encodes the low-dimensional structure of non-linear tasks.
Breaks Assumption
Proves a fundamental expressivity limit where Message-Passing Graph Neural Networks are infinitely weaker than standard Color Refinement algorithms.
Efficiency Breakthrough
Introduces an asynchronous Mixture-of-Transformers architecture for autonomous driving that decouples slow reasoning from fast action execution.
Open Release
Releases an 11-billion example dataset and model (RealVLG-R1) for unified real-world visual-language grounding and robotic manipulation.
Efficiency Breakthrough
Achieves over 80% of full-resolution VLM performance while using only 1% of the original pixel budget through bio-inspired foveated sampling.
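The foveated-sampling idea can be sketched as a dense patch around a fixation point plus a coarse strided view of the rest. This toy version's pixel budget (~12.5% in the example below) is from the illustration, not the paper's 1% figure; parameter names are invented:

```python
import numpy as np

def foveate(img, cy, cx, fovea=8, stride=4):
    """Bio-inspired sampling sketch: full resolution inside a small
    fovea window around (cy, cx), coarse strided sampling elsewhere."""
    H, W = img.shape[:2]
    peripheral = img[::stride, ::stride]            # coarse global view
    y0, y1 = max(0, cy - fovea), min(H, cy + fovea)
    x0, x1 = max(0, cx - fovea), min(W, cx + fovea)
    fovea_patch = img[y0:y1, x0:x1]                 # dense local view
    return peripheral, fovea_patch

img = np.arange(64 * 64).reshape(64, 64)
periph, patch = foveate(img, 32, 32)
# budget: 16*16 + 16*16 = 512 of 4096 pixels (~12.5%)
```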
Paradigm Shift
Applies Signal Detection Theory to reveal that standard LLM calibration metrics conflate sensitivity (knowledge) with bias (confidence), leading to misleading evaluations.
Efficiency Breakthrough
A unified graph propagation library achieving 35,000x speedups, enabling full simulations on billion-edge graphs in seconds.
Open Release
Releases a human preference dataset of 29 million pairs specifically for text-to-image editing tasks.
Paradigm Shift
Introduces 'Directional Routing', a lightweight mechanism that becomes the dominant computational pathway and enables transformers to self-organize into syntactic and adaptive regimes.
Paradigm Shift
Recasts the LLM itself as a graph-native aggregation operator (Graph Kernel) for message passing on text-rich graphs.
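One round of this recasting can be sketched as ordinary message passing in which neighbor texts are the "messages" and the aggregation kernel is an LLM call. The `llm` callable and prompt format below are stand-ins invented for illustration:

```python
def llm_aggregate(node_text, neighbor_texts, llm):
    """One aggregation step: the LLM summarizes a node given its
    neighbors' texts, playing the role of a GNN's aggregation kernel."""
    prompt = ("Node: " + node_text + "\nNeighbors:\n"
              + "\n".join("- " + t for t in neighbor_texts)
              + "\nSummarize this node in the context of its neighbors.")
    return llm(prompt)

def propagate(node_texts, adjacency, llm):
    """One full propagation round over a text-rich graph."""
    return {v: llm_aggregate(node_texts[v],
                             [node_texts[u] for u in adjacency[v]], llm)
            for v in node_texts}
```

Stacking rounds of `propagate` mirrors stacking GNN layers, with the LLM replacing the learned aggregation function.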
Scaling Insight
Attention Residuals replace fixed-weight residual connections with softmax attention over preceding layers to prevent hidden-state dilution in deep LLMs.
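The replacement described here can be sketched in a few lines: instead of the fixed update h_{l+1} = h_l + f(h_l), the new hidden state is a softmax-weighted combination of all preceding layers' states. A minimal NumPy sketch under that reading; `query_proj` and the single-vector setup are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def attention_residual(hidden_states, query_proj):
    """Combine all preceding layers' hidden states via softmax attention
    instead of a fixed-weight residual. `hidden_states` holds the (d,)
    vectors from layers 0..l; `query_proj` is a (d, d) matrix."""
    q = query_proj @ hidden_states[-1]              # query from current layer
    scores = np.array([q @ h for h in hidden_states])
    scores -= scores.max()                          # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum() # softmax over layers
    return sum(w * h for w, h in zip(weights, hidden_states))

d = 8
rng = np.random.default_rng(0)
layers = [rng.standard_normal(d) for _ in range(4)]
out = attention_residual(layers, np.eye(d))
```

Because the weights renormalize across depth, early-layer signal cannot be diluted away by a long chain of additive updates.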
Paradigm Shift
MUNKEY introduces a 'design-to-forget' paradigm where machine unlearning is achieved through zero-shot key deletion rather than expensive parameter updates.
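The "design-to-forget" paradigm can be illustrated with a toy key-value store: if facts are gated by retrieval keys, unlearning is key deletion rather than a parameter update. All names below are illustrative, not MUNKEY's API:

```python
import numpy as np

class KeyedMemory:
    """Sketch: knowledge lives behind retrieval keys, so forgetting a
    fact means deleting its key, with zero gradient steps."""
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(np.asarray(key, float))
        self.values.append(value)

    def read(self, query, threshold=0.9):
        query = np.asarray(query, float)
        for k, v in zip(self.keys, self.values):
            sim = k @ query / (np.linalg.norm(k) * np.linalg.norm(query))
            if sim >= threshold:
                return v
        return None                     # forgotten, or never stored

    def forget(self, key):
        """Zero-shot unlearning: drop all entries matching the key."""
        keep = [i for i, k in enumerate(self.keys)
                if not np.allclose(k, key)]
        self.keys = [self.keys[i] for i in keep]
        self.values = [self.values[i] for i in keep]
```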
Efficiency Breakthrough
AdaAnchor enables LLMs to perform multi-step reasoning entirely in latent space with an adaptive halting mechanism to optimize compute.
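Adaptive halting over latent reasoning steps can be sketched ACT-style: iterate a latent update and stop once cumulative halting probability crosses a threshold, so easy inputs consume fewer steps. `step_fn` and `halt_fn` are hypothetical stand-ins for learned components, not AdaAnchor's actual modules:

```python
import numpy as np

def latent_reasoning(h, step_fn, halt_fn, max_steps=16, threshold=0.99):
    """Run latent reasoning steps until the accumulated halting
    probability passes `threshold`, returning the state and step count."""
    cum_p, steps = 0.0, 0
    while steps < max_steps and cum_p < threshold:
        h = step_fn(h)              # one reasoning step in latent space
        cum_p += halt_fn(h)         # accumulate halting probability
        steps += 1
    return h, steps

h0 = np.zeros(4)
step = lambda h: h + 1.0            # toy latent update
halt = lambda h: 0.4                # constant halting probability
h_final, n = latent_reasoning(h0, step, halt)
```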
Efficiency Breakthrough
AnoleVLA replaces the standard Transformer backbone in robotic Vision-Language-Action models with Deep State Space Models for a 3x speedup.
Paradigm Shift
This paper reveals that pre-trained image editing models can be repurposed for video frame interpolation using only a few hundred LoRA samples.
Breaks Assumption
Researchers identify 'Agentic Pressure' as a phenomenon where increased reasoning capability actually helps models rationalize and execute safety violations.
Efficiency Breakthrough
Writer-R1-4B outperforms 100B+ parameter models in creative writing by utilizing memory-augmented self-reflection and fine-grained criteria generation.
New Capability
Euler Characteristic Surfaces achieve 98% accuracy on time-series classification with O(n) complexity, crushing previous topological methods that only hit 62%.
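For a 1D signal viewed as a path graph under a sublevel-set filtration, the Euler characteristic at threshold t is simply #vertices(≤ t) − #edges(≤ t), computable in O(n) per threshold. A minimal sketch of that curve construction (the paper's surfaces extend this to two parameters):

```python
import numpy as np

def euler_characteristic_curve(x, thresholds):
    """Sublevel-set Euler characteristic of a time series seen as a
    path graph: chi(t) = #vertices(<=t) - #edges(<=t). O(n) per t."""
    x = np.asarray(x, float)
    edge_vals = np.maximum(x[:-1], x[1:])   # edge enters when both ends do
    return np.array([(x <= t).sum() - (edge_vals <= t).sum()
                     for t in thresholds])

x = [0.0, 2.0, 0.0, 2.0, 0.0]
chi = euler_characteristic_curve(x, [1.0, 3.0])
# chi[0] == 3: three sublevel components at t = 1.0
```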
Breaks Assumption
Small models (<=4B) fail document extraction not because of poor vision, but due to 'schema echo' where they copy the output structure instead of extracting data.
Efficiency Breakthrough
Ultra-low-bitrate image compression achieves 50% bitrate savings by treating decoding as a 'next-frame' video prediction task using diffusion priors.
Paradigm Shift
Waypoint Diffusion Transformers (WiT) untangle pixel-space generation by using semantic waypoints, bypassing the need for information-lossy latent autoencoders.
Paradigm Shift
Scores from LLM-based judges correlate negatively with actual future research impact; the judges systematically overvalue 'novel-sounding' ideas that never materialize.
New Capability
ForceVLA2 introduces explicit force awareness and hybrid control to Vision-Language-Action models, enabling stable contact-rich manipulation.
Breaks Assumption
Recurrent gradient transport is massively redundant: propagating through just 6% of paths recovers nearly all adaptation ability in online learning.
Breaks Assumption
The anonymity of leaderboards like LM Arena can be compromised using Interpolated Preference Learning to identify target models based on stylistic signatures.
New Capability
SCAN enables reliable sequential knowledge editing in LLMs for up to 3,000 edits without the catastrophic forgetting or model collapse seen in current methods.
New Capability
This physics-informed VLM framework improves physics-grounded anomaly detection AUROC from 66.9% to 96.7%.
Efficiency Breakthrough
HapticVLA achieves tactile-aware robotic manipulation at 86.7% success rate without requiring any physical tactile sensors at inference time.
Efficiency Breakthrough
IConE enables stable self-supervised learning even at batch size 1, overcoming the memory bottlenecks of high-dimensional scientific and medical data.
Efficiency Breakthrough
FlashU is the first framework to accelerate unified multimodal models by exploiting the distinct neuron sets used for generation vs. understanding.
Paradigm Shift
GVC1D achieves over 60% bitrate reduction in video compression by replacing standard 2D latent grids with compact 1D latent tokens.
Open Release
Tagarela releases 8,972 hours of high-quality Portuguese podcast audio, rivaling the scale of GigaSpeech for English.
Efficiency Breakthrough
MeMix is a training-free, plug-and-play module that reduces 3D reconstruction error by up to 40% in long sequences by mitigating state drift.
New Capability
FuXiWeather2 is a unified end-to-end neural framework for weather assimilation and forecasting that outperforms global operational systems.
Scaling Insight
This paper proves that increasing test-time compute via beam search can actually hurt LLM reasoning performance due to overestimation bias.
Scaling Insight
Sparsity (MoE and GQA) is found to act as a critical regulator for variance propagation, mitigating the 'curse of depth' in LLMs.
Breaks Assumption
Test-time reinforcement learning (TTRL) is found to amplify model harmfulness and jailbreak vulnerability when exposed to malicious prompt injections.
Paradigm Shift
A large-scale study reveals that 78% of AI failures are 'invisible': the system fails without the user noticing or reporting an error.