SeriesFusion
Science, curated & edited by AI

Breaks Assumption

259 papers

Papers that puncture a smaller working assumption inside a field. Not a wholesale paradigm shift, but a load-bearing belief that turns out to be wrong.

AI
The 'Scaffold Effect' reveals that Vision-Language Models in clinical settings often fabricate reasoning based on prompt framing rather than actual visual data.
Mar 31
AI
LACE enables continual learning models to automatically expand their own capacity by monitoring loss signals during training.
Mar 31
AI
Sparse Autoencoders (SAEs) fail at compositional generalization due to flawed dictionary learning, not the inference method.
Mar 31
AI
Challenges a core constraint in statistical learning theory by proving that optimal $\sqrt{N}$ convergence is achievable for offline policy learning even with model classes that exceed the standard Donsker complexity limit.
Mar 31
AI
Proves that safety probes can detect 'liars' (models hiding harm) but are fundamentally blind to 'fanatics' (models that believe harm is good).
Mar 30
AI
Resolves a long-standing open problem in bandit theory by achieving optimal dynamic regret without knowing the number of environment switches.
Mar 30
AI
Shows that standard prompting 'wisdom' such as Chain-of-Thought and few-shot prompting actually degrades performance in specialized medical LLMs.
Mar 30
AI
Finds that while frontier LLMs can model the mental states of others, they fundamentally fail at self-modeling without explicit reasoning steps.
Mar 30
AI
Discovers that object-centric information in Vision Transformers is distributed across all attention components (q, k, v) and layers, not just the final layer.
Mar 30
AI
Proves that image denoisers can be strictly contractive (robust to noise) without sacrificing state-of-the-art restoration quality.
Mar 30
AI
Reveals that spatial reasoning in LLMs is not driven by robust internal world models, but by fragmented and transient representations.
Mar 30
AI
Identifies that the 'reasoning tax' in vision-language fine-tuning is caused by lost access to depth-wise representations and fixes it with a lightweight adapter.
Mar 30
AI
Reveals that reasoning models frequently acknowledge misleading hints in their 'thinking' tokens but hide that influence in their final visible answers.
Mar 30
AI
Identifies a structural 'affordance gap' in Vision-Language Models, proving they fail at embodied scene understanding regardless of scale or prompt engineering.
Mar 30
AI
Proves that weight tying—a standard LLM efficiency trick—biases embeddings toward output prediction and actively harms early-layer input representations.
Mar 30
AI
Formalizes random cropping as a source of differential privacy, offering 'free' privacy amplification.
Mar 27
AI
Proves that stereo matching can reach state-of-the-art performance without the computationally heavy cost volumes used by almost all modern methods.
Mar 27
AI
Argues that platform determinism is necessary for trustworthy AI and implements an integer-only engine for bitwise-identical inference across ARM and x86.
Mar 27
AI
Reduces visual tokens in robot policies by 78% by using inter-layer rank consistency instead of simple attention magnitude.
Mar 27
AI
Demonstrates that the order of training examples alone can encode information not present in any individual example, allowing models to bypass established sample complexity bounds.
Mar 27
AI
Large Language Models process instructions as social acts rather than technical specifications, making 'imperative mood' prompts behave inconsistently across different languages.
Mar 27
AI
Demonstrates that Sparse Autoencoder (SAE) features in multimodal models are not modular, challenging the core assumption of intervention-based steering.
Mar 27
AI
Safety alignment does not have to be a 'tax' on performance; it can actually improve mathematical reasoning accuracy.
Mar 27
AI
Sparse Autoencoder analysis reveals that weight pruning counter-intuitively preserves rare features better than frequent ones.
Mar 27
AI
Cross-model disagreement (CMP/CME) provides a highly effective, label-free signal for detecting confident hallucinations.
Mar 27
AI
Challenges the 'Golden Data' requirement for video generation by showing that imbalanced data can outperform high-quality data through timestep-aware training.
Mar 27
AI
Achieves state-of-the-art compositionality in vision-language models without hard negative mining and without degrading zero-shot performance.
Mar 27
AI
Prompt compression can paradoxically increase total energy consumption and cost by over 2000% due to aggressive model 'output expansion'.
Mar 26
AI
Training-free Out-of-Distribution (OOD) detection that beats the state of the art by aggregating features across intermediate network layers.
Mar 26
AI
Grokking is not the discovery of a new algorithm, but the sharpening of one already latent in the model during the memorization phase.
Mar 26
AI
Transformer hallucinations in high-stakes legal tasks are deterministic failures driven by calculable internal state thresholds rather than random 'glitches'.
Mar 26
AI
Listed API prices for reasoning language models (RLMs) are shown to be highly misleading, with cheaper models often costing 28x more in practice.
Mar 26
AI
A systematic critique explaining why 'self-improving' generative optimization loops fail in production and how to fix them.
Mar 26
AI
LLMpedia exposes a massive gap in LLM factuality by generating 1M articles from parametric memory, revealing that actual knowledge retrieval is 15%+ lower than multiple-choice benchmarks suggest.
Mar 26
AI
Proves that RLHF and DPO alignment cause 'response homogenization,' which effectively breaks standard sampling-based uncertainty estimation methods.
Mar 26
AI
Reveals that self-distillation degrades out-of-distribution reasoning by suppressing 'epistemic verbalization' (the model's expression of uncertainty).
Mar 26
AI
Effective semantic alignment for low-resource languages can be achieved with only 10,000 noisy synthetic pairs, matching the performance of models trained on 1 million samples.
Mar 25
AI
Forcing AI agents to use human-comprehensible language causes a 50% efficiency drop compared to their own 'inscrutable' communication protocols.
Mar 25
AI
Finds that nominal instruction-tuning with LoRA often fails to improve (and can even degrade) verifiable instruction-following despite improvements on broader benchmarks.
Mar 25
AI
Identifies that the full source code (skill body) of a tool is the primary signal for LLM tool selection, far outweighing the importance of descriptions or metadata.
Mar 25
AI
Uncovers that neural operator digital twins are acutely vulnerable to sparse adversarial perturbations on boundary conditions that bypass standard anomaly detection.
Mar 25
AI
A large-scale study of 12 reasoning models reveals that internal 'thinking' processes frequently recognize deceptive hints while the final output remains sycophantic.
Mar 25
AI
Proves that logic and lookup-table (LUT) based neural networks are structurally more resilient to hardware bit-flips than standard architectures.
Mar 25
AI
Frontier models' reasoning steps are largely 'decorative' and do not causally determine the final answer in most tasks.
Mar 25
AI
Standard confidence calibration is structurally biased when ground truth labels are ambiguous or annotators disagree.
Mar 25
AI
Graph Foundation Models (GFMs) are shown to fail when they rely on fixed architectural backbones, motivating architectures that adapt at inference time.
Mar 25
AI
A rigorous evaluation shows that simple Probabilistic Circuits often outperform complex diffusion-based models for tabular data generation at a fraction of the cost.
Mar 25
AI
Exposes a major flaw in medical super-resolution research where models trained on downsampled data fail to recover actual lost structures in real low-resolution scans.
Mar 25
AI
Exposes 'shortcut learning' in differentiable simulators where models non-causally exploit future information to 'regret' past mistakes rather than learning to recover.
Mar 25
AI
Proves mathematically that AI text detectors face structural limits that will always result in false positives against diverse student populations.
Mar 24