SeriesFusion
Science, curated & edited by AI

Efficiency Breakthrough

AI
Photon enables efficient 3D medical volume understanding through adaptive token scheduling and a novel 'gradient restoration' backpropagation rule.
Mar 27
AI
Pruning low-utility prompts before RL rollouts allows for 10x more efficient training of large reasoning models.
Mar 27
AI
Simple image sharpening serves as a surrogate-free, zero-cost preemptive defense against adversarial attacks.
Mar 27
AI
A new tokenization architecture reduces the 'Token Tax' for complex non-Latin scripts by over 60%.
Mar 27
AI
GlowQ introduces group-shared low-rank approximations to speed up quantized LLM inference by up to 37%.
Mar 27
AI
Reduces LLM inference energy by 40%, and in some cases by up to 81%, using a distillation-based router that skips unnecessary reasoning steps.
Mar 27
AI
Unlocks full-body musculoskeletal humanoid training by achieving order-of-magnitude speedups via massively parallel GPU simulation.
Mar 27
AI
Achieves 45% performance gains in robotics using 5-10x fewer real-world demonstrations through high-dimensional factorization.
Mar 27
AI
Achieves up to 4.7x speedup for Diffusion LLMs using a training-free self-speculative decoding framework.
Mar 27
AI
Generates 2-minute 480p videos on a single H200 GPU through a hierarchical KV-cache strategy that compresses context by 32x.
Mar 27
AI
Enables 4K novel view synthesis in a feed-forward manner by decoupling geometric complexity from rendering resolution.
Mar 27
AI
Demonstrates that general-purpose coding agents can achieve 20x speedups in hardware design optimization without domain-specific training.
Mar 27
AI
A training-free enhancement that unlocks multi-scale synergies in Vision Foundation Models (VFMs) to boost performance across various tasks.
Mar 27
AI
Memory Sparse Attention (MSA) enables LLMs to scale to 100 million tokens with linear complexity and less than 9% precision degradation.
Mar 26
AI
The first sorting-free stochastic formulation for 3D Gaussian Splatting that matches rasterization speed while enabling full ray-traced effects.
Mar 26
AI
The cost of AI agent benchmarks can be cut by ~50% by evaluating only on tasks with intermediate historical pass rates.
Mar 26
AI
Hybrid Distillation Policy Optimization (HDPO) overcomes the 'vanishing gradient' problem for hard mathematical prompts that RL agents cannot solve.
Mar 26
AI
A self-distillation method for Multi-Token Prediction (MTP) that yields a 220% inference speedup with minimal training cost.
Mar 26
AI
AttentionPack achieves up to 8x memory efficiency during decoding for large vision-language models (VLMs).
Mar 26
AI
SLAT-Phys predicts spatially varying material property fields directly from single RGB images with a 120x speedup.
Mar 26
AI
Reduces Text-to-SQL input tokens by 99% by internalizing the database schema into the model weights through a two-phase fine-tuning approach.
Mar 26
AI
MoE-Sieve reduces Mixture-of-Experts LoRA fine-tuning parameters and training time by ~70% by adapting only the most frequently activated 'hot' experts.
Mar 26
AI
Achieves up to 400x speedup and 64x memory reduction for open-vocabulary 3D scene understanding compared to current Gaussian Splatting methods.
Mar 26
AI
Enables 1000x faster on-chip training for Weightless Neural Networks (WNNs) on FPGAs with drastically lower power consumption.
Mar 26
AI
A 5M-parameter OCR model that rivals billion-parameter vision-language models, proving data-centric curation can beat raw parameter scale.
Mar 26
AI
Achieves high-fidelity sub-seasonal weather forecasting with a 276M-parameter model that matches 1.6B-parameter baselines in accuracy and speed.
Mar 26
AI
Agentic Variation Operators (AVO) replace fixed evolutionary heuristics with coding agents to discover GPU kernels that outperform FlashAttention-4 by 10.5%.
Mar 26
AI
DreamerAD accelerates imagination-based training for autonomous driving by 80x, compressing 100-step diffusion sampling down to a single step.
Mar 26
AI
The Multilevel Euler-Maruyama (ML-EM) method allows diffusion models to perform sampling at the computational cost of a single model evaluation.
Mar 26
AI
Sparse Feature Attention (SFA) cuts attention cost, normally quadratic in sequence length and linear in feature dimension, to a fraction determined by feature sparsity, enabling 2.5x speedups.
Mar 25
AI
Standard quantization destroys the small parameter 'deltas' that encode post-training knowledge; Delta-Aware Quantization (DAQ) fixes this by optimizing for sign preservation.
Mar 25
AI
Hybrid Associative Memory (HAM) layers allow the KV cache to grow dynamically based only on information that an internal RNN cannot predict.
Mar 25
AI
Proposes an agentic architecture that achieves O(1) token complexity relative to dataset size by strictly separating intent parsing from deterministic data execution.
Mar 25
AI
Achieves high-fidelity diffusion generation in just 3 steps by distilling layer-wise time embeddings from reference trajectories.
Mar 25
AI
Introduces a verifier that operates directly on the latent hidden states of Diffusion Transformers, avoiding the need for costly pixel-space decoding during inference-time scaling.
Mar 25
AI
A 0.26M-parameter model using continuous dynamics outperforms 27M-parameter recursive models on complex logic tasks like Sudoku-Extreme.
Mar 25
AI
Agile-VLA enables high-frequency robot control on edge devices by decoupling perception from action through implicit affordance anchoring.
Mar 25
AI
EchoKV introduces a reversible KV cache compression scheme that allows LLMs to switch back to full-precision inference on demand.
Mar 25
AI
ForestPrune achieves up to 90% token reduction in video MLLMs with minimal accuracy loss using a training-free spatial-temporal forest modeling approach.
Mar 25
AI
Optimizing autoregressive image models with Group Relative Policy Optimization (GRPO) achieves competitive quality without the 2x inference cost of Classifier-Free Guidance.
Mar 25
AI
DILLO enables 14x faster safety-critical agent steering by predicting action consequences from latent states instead of heavy visual simulations.
Mar 25
AI
ImplicitRM enables unbiased reward modeling from 'messy' implicit feedback (clicks/copies), drastically reducing the cost of RLHF data collection.
Mar 25
AI
Introduces custom CUDA kernels and a sparse packing format that enable Transformers to maintain performance with over 99% feedforward sparsity.
Mar 25
AI
Upgrades video Diffusion Transformers to ultra-high-resolution synthesis using a two-stage 'Relay LoRA' adaptation on pure images.
Mar 25
AI
Challenges the dominance of on-policy RL for LLMs by introducing a practical off-policy value-based framework that enables data reuse.
Mar 25
AI
An online length-aware scheduling strategy that eliminates training 'bubbles' during the rollout phase of LLM reinforcement learning.
Mar 25
AI
Leverages human gaze tracking to assign non-uniform token density in diffusion models, creating perceptually perfect images with significantly less compute.
Mar 25
AI
Replaces visual token compression with sparse, dynamically selected vision-language interactions in VLLMs.
Mar 25
AI
Introduces on-the-fly quantization that calibrates to individual prompts during inference, solving the 'domain shift' problem where standard quantization fails on unseen data.
Mar 25
AI
Achieves over 10x faster sampling for diffusion language models by shifting the process into continuous semantic space.
Mar 24