SeriesFusion
Science, curated & edited by AI

AI & Machine Learning

2,557 papers

Machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI papers where the core contribution is computational intelligence.

Efficiency Breakthrough
Reaches 60% accuracy on ImageNet-1K through dataset distillation, using only a handful of synthetic images.
Apr 2
Efficiency Breakthrough
Enables 'Elastic Inference' where a single trained model can be converted to multiple lower-precision formats on-the-fly without retraining.
Apr 2
New Capability
Proposes a parameter-efficient LLM adaptation method that enables rapid specialization on non-stationary streams while preventing catastrophic forgetting.
Apr 2
Paradigm Shift
Replaces manual rubric-tuning for synthetic data with an automated gradient-guided optimization framework based on influence estimation.
Apr 2
New Capability
Rebuilds the Agent-Computer Interaction (ACI) stack for scientific discovery, addressing the fragility of JSON tool-calling and execution sandboxes.
Apr 2
Efficiency Breakthrough
Scales imitation learning data efficiency by generating synthetic 'multi-view' demonstrations from a single expert trajectory.
Apr 2
New Capability
Introduces SIGN, a framework capable of discovering governing symbolic equations for networked systems with over 100,000 nodes.
Apr 2
Breaks Assumption
Discovers 'Quality Corruption,' an adversarial failure mode in which accuracy collapses while detection counts remain stable, showing that robustness is substrate-dependent.
Apr 2
Efficiency Breakthrough
Proposes Physical Imitation Learning (PIL) to offload up to 87% of a control policy's mechanical power to passive robotic joints.
Apr 2
Open Release
OmniVoice is an open-source TTS model scaling to over 600 languages using a novel diffusion language model architecture.
Apr 2
New Capability
TTA-Vid enables video reasoning models to adapt to new domains at test-time using label-free reinforcement learning on a single sample.
Apr 2
Paradigm Shift
Introduces HiLL, a framework that jointly trains a 'hinter' and 'reasoner' to prevent advantage collapse in reinforcement learning for hard tasks.
Apr 2
Scaling Insight
Establishes a three-dimensional scaling law for RAG-pretraining, modeling the optimal allocation of the data budget across model parameters, tokens, and retrieval store size.
Apr 2
Efficiency Breakthrough
CircuitProbe identifies reasoning circuits in Transformers 1000x faster than brute-force methods and predicts the efficacy of layer duplication.
Apr 2
Paradigm Shift
LangMARL introduces agent-level credit assignment and policy gradient evolution directly in the natural language space for multi-agent coordination.
Apr 2
Breaks Assumption
Provides the first controlled study of Silent Data Corruption (SDC) in GPUs and its catastrophic impact on LLM pretraining stability.
Apr 2
Efficiency Breakthrough
Spectral Compact Training (SCT) enables training 70B-parameter architectures on consumer hardware like the Steam Deck (8GB RAM) via permanent SVD factors.
Apr 2
Paradigm Shift
Stochastic Attention achieves a global receptive field in O(log n) layers by using randomized routing inspired by the fruit fly connectome.
Apr 2
New Capability
ThoughtSteer demonstrates the first successful backdoor attack on continuous latent reasoning models, which leave no token-based audit trail.
Apr 2
Breaks Assumption
Mechanistic analysis reveals that LLMs fail at character counting not because they lack the information, but because 'negative circuits' in the final layers actively suppress the correct answer.
Apr 2
Efficiency Breakthrough
This paper achieves O(1) complexity for multimillion-class classification by leveraging predefined vector systems in the latent space.
Apr 2
Paradigm Shift
Routing-Free MoE replaces centralized routing with individual expert-level activation, eliminating the need for Softmax and Top-K load balancing.
Apr 2
Efficiency Breakthrough
Molecular Memory allows MoE systems to recover previously learned domain expertise 9-11x faster by utilizing cost-penalized fitness metrics that preserve dormant experts.
Apr 2
Efficiency Breakthrough
OBD-LLM uses second-order Hessian information to achieve 20-40% better low-rank decomposition accuracy than the current state-of-the-art SVD-LLM.
Apr 2
Paradigm Shift
Policy Improvement Reinforcement Learning (PIRL) shifts the training objective from reward maximization to explicit maximization of policy progress across iterations.
Apr 2
Efficiency Breakthrough
PixelPrune identifies and removes pixel-level redundancy before the Vision Transformer encoder, delivering up to 4.2x inference speedup for high-resolution VLM tasks.
Apr 2
New Capability
An autonomous research pipeline discovered a lifelong multimodal memory framework by diagnosing and fixing its own architectural bugs and data pipeline issues.
Apr 2
Efficiency Breakthrough
EmbedPart achieves a 100x speedup over Metis for graph partitioning by clustering node embeddings rather than operating on raw graph structures.
Apr 2
Efficiency Breakthrough
A lightweight probing method predicts LLM downstream task performance from internal representations during training, reducing evaluation latency from one hour to three minutes.
Apr 2
Efficiency Breakthrough
Canonical Correlation Analysis (CCA) can reduce image representation dimensionality by 75% while improving downstream performance through cross-model agreement.
Apr 2
New Capability
WARP provides provably guaranteed repairs for inner layers of Transformers, overcoming the limitation of previous methods restricted to the final layer.
Apr 2
Paradigm Shift
Proposes dense point trajectories as universal 'visual tokens' for behavior that generalize across different species and non-rigid objects.
Apr 2
Open Release
Releases the GPT-NL Public Corpus, the largest permissively licensed (CC-BY) Dutch-first dataset for LLM pre-training.
Apr 2
Efficiency Breakthrough
Decouples weather forecasting from spatial resolution by using Flow Matching to super-resolve coarse trajectories as a post-processing step.
Apr 2
New Capability
Solves intractable (#P-hard) multi-objective optimization problems with tight approximation guarantees using a novel SAT-oracle approach.
Apr 2
New Capability
Demonstrates that covert collusion between multi-agent LLM systems can be detected zero-shot using internal model activations.
Apr 2
Paradigm Shift
Achieves 'zero forgetting' in continual learning by stacking frozen domain-specific MoE-LoRA adapters with a meta-router.
Apr 2
New Capability
Presents the first humanoid robot system to achieve consecutive ping-pong strikes using only onboard egocentric vision and whole-body coordination.
Apr 2
Breaks Assumption
Reveals a 'Reasoning Shift' in which increased context length silently causes models to skip self-verification and shorten their reasoning traces by up to 50%.
Apr 2
Efficiency Breakthrough
Introduces S0 tuning for hybrid RNN-attention models, outperforming LoRA by 10.8% with zero inference overhead.
Apr 2
Efficiency Breakthrough
Reduces the compute cost of LLM test-time scaling by up to 67% using conformal prediction to calibrate reasoning paths.
Apr 2
Paradigm Shift
Replaces standard relative Softmax attention with 'Multiscreening' to allow absolute query-key relevance, yielding 3.2x faster inference at 100K context.
Apr 2
Scaling Insight
Simple Self-Distillation (SSD) improves LLM code generation (e.g., Qwen3-30B) by 13% Pass@1 without any external verifiers or teacher models.
Apr 2
Breaks Assumption
Provides causal evidence that reasoning models often decide on an action (like a tool call) before they even start generating their 'Chain-of-Thought'.
Apr 2
Efficiency Breakthrough
Combines the YOCO architecture with recursive computation to scale representational depth without inflating the KV cache.
Apr 2
Efficiency Breakthrough
Solves the long-standing trade-off in low-rank matrix recovery by achieving both optimal sample complexity and fast convergence.
Apr 2
Breaks Assumption
Provides a theoretical explanation for why Transformers often underperform linear models in financial time series forecasting.
Apr 2
Efficiency Breakthrough
Enables Gaussian Processes to scale on modern parallel hardware by removing the need for Cholesky decompositions.
Apr 2
New Capability
Introduces 'deconfounding scores' to enable reliable causal effect estimation even when treatment and control groups have very little overlap.
Apr 2
Open Release
Delivers a state-of-the-art universal phone recognition model covering 100+ languages, with a full open-source release.
Apr 2