Open-sources a high-fidelity foundation model that jointly generates synchronized video and audio using a unified single-stream Transformer.
Open Release arxiv | Mar 24
Introduces a learnable bridge between GELU and ReLU activations to enable deployment-friendly piecewise-linear networks.
Efficiency Breakthrough arxiv | Mar 24
Achieves a 75x parameter reduction in 3D medical image segmentation by hybridizing Mamba and Transformer modules.
Efficiency Breakthrough arxiv | Mar 24
Decouples high-level reasoning from low-level motor control in robotics using a visual prompting interface.
Paradigm Shift arxiv | Mar 24
Releases the first large-scale family of learned sparse retrieval (LSR) models specialized for code (up to 8B parameters).
Open Release arxiv | Mar 24
Introduces a streaming detection head that stops Large Reasoning Models (LRMs) from 'overthinking' redundant steps.
Efficiency Breakthrough arxiv | Mar 24
Proposes a test-time scaling paradigm for image restoration that allows compute-to-quality trade-offs during inference.
Paradigm Shift arxiv | Mar 24
Releases the hardware design and training environment for MEVIUS2, an open-source, Spot-scale quadruped robot.
Open Release arxiv | Mar 24
Proves that 'topic-matched' contrast pairs are ineffective for extracting refusal directions in LLM abliteration research.
Breaks Assumption arxiv | Mar 24
Provides a strictly controlled comparison of autoregressive vs. masked diffusion language models on identical compute budgets.
Scaling Insight arxiv | Mar 24
Ensures safe Vision-Language Model generation without over-refusal by steering activations within the null-space of benign inputs.
New Capability arxiv | Mar 24
Identifies that the direction of log-probability change is more critical than magnitude for improving LLM reasoning via RL.
Paradigm Shift arxiv | Mar 24
Integrates LLMs as closed-loop tuning experts for manufacturing robots to achieve 0% failure in complex 3D printing tasks.
New Capability arxiv | Mar 24
Reduces the token count of Stable Diffusion 3.5 by 4x for high-resolution generation with minimal fine-tuning.
Efficiency Breakthrough arxiv | Mar 24
Provides causal evidence that LLMs use internal confidence signals to drive behavioral decisions like abstention, rather than just as a side-effect of output generation.
Breaks Assumption arxiv | Mar 24
Identifies 'Visual Anchor Collapse' in DPO-aligned VLMs and introduces an asymmetric constraint to prevent models from ignoring visual evidence in favor of language priors.
Paradigm Shift arxiv | Mar 24
A predictive scheduling system for multi-agent workflows that optimizes serving across heterogeneous LLM clusters (mixing large and small models).
Efficiency Breakthrough arxiv | Mar 24
Introduces 'Noise Titration' to prove that current time-series foundation models often fail at structural inference, behaving instead as 'context parrots' during non-stationary shifts.
Breaks Assumption arxiv | Mar 24
Integrates auction bids and monetization logic directly into generative recommender systems (like TIGER) via bid-aware decoding.
New Capability arxiv | Mar 24
MemDLM embeds a simulated denoising process into training to create 'Parametric Memory,' narrowing the train-inference gap for Diffusion Language Models.
New Capability arxiv | Mar 24
An open foundation suite for universal dexterous robot control trained on over 50k trajectories across eight different robotic hand architectures.
Open Release arxiv | Mar 24
Bypasses Reinforcement Learning during the exploration phase by using uncertainty-guided tree search to discover informative data.
Paradigm Shift arxiv | Mar 24
Enables high-rank (r=384) DoRA training on single GPUs through factored norms and fused Triton kernels.
Efficiency Breakthrough arxiv | Mar 24
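For context, the standard DoRA reparameterization that such work builds on can be sketched in a few lines (a generic sketch of DoRA's magnitude/direction decomposition, not this paper's factored-norm or fused-kernel implementation; the shapes are illustrative):

```python
import numpy as np

# DoRA decomposes a weight update into a trainable per-column magnitude m
# and a direction given by the frozen base weight plus a low-rank update.
rng = np.random.default_rng(0)
d_out, d_in, r = 64, 32, 8                        # the paper uses r=384; small here

W0 = rng.standard_normal((d_out, d_in))           # frozen base weight
B = np.zeros((d_out, r))                          # trainable low-rank factors
A = rng.standard_normal((r, d_in)) * 0.01
m = np.linalg.norm(W0, axis=0, keepdims=True)     # trainable magnitude vector

def dora_weight(W0, m, B, A):
    """W' = m * (W0 + B @ A) / ||W0 + B @ A||_column."""
    V = W0 + B @ A
    return m * (V / np.linalg.norm(V, axis=0, keepdims=True))

# With B initialized to zero, the reparameterized weight equals the base weight,
# so training starts from the pretrained model exactly.
W = dora_weight(W0, m, B, A)
print(np.allclose(W, W0))  # True
```

The memory pressure at high rank comes from materializing the column norms of `W0 + B @ A` in the backward pass, which is what factored norms and fused kernels address.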
Introduces a parallel reasoning mechanism for Vision-Language-Action (VLA) models that eliminates the latency bottleneck of autoregressive Chain-of-Thought.
Efficiency Breakthrough arxiv | Mar 24
UNITE enables single-stage joint training of the tokenizer and the diffusion model from scratch, removing the need for frozen VAEs.
Paradigm Shift arxiv | Mar 24
A training-free feature caching framework that achieves 2.3x speedup for video world models while maintaining 99.4% quality.
Efficiency Breakthrough arxiv | Mar 24
A transformer-based meta-amortized framework that allows simulation-based inference to remain valid across different model structures without retraining.
New Capability arxiv | Mar 24
LassoFlexNet matches or beats leading tree-based models on tabular data while maintaining Lasso-like interpretability through per-feature embeddings and a group Lasso mechanism.
Paradigm Shift arxiv | Mar 24
Proves that rotation-invariant algorithms like standard Gradient Descent are fundamentally suboptimal for sparse targets when trained on hard labels.
Breaks Assumption arxiv | Mar 24
A grid-free probabilistic framework for nonrigid registration of high-dimensional vector-valued functions on irregular manifolds.
New Capability arxiv | Mar 24
A unified discrete diffusion framework that outperforms autoregressive models on large-scale discrete generation tasks for the first time.
Efficiency Breakthrough arxiv | Mar 24
The math we've used for 50 years to figure out how fast the internet should be is actually missing a giant piece of the puzzle.
Paradigm Challenge arxiv | Mar 23
You can get a whole crowd to agree on something even if everyone only knows what the person right next to them is thinking.
Nature Is Weird arxiv | Mar 23
Over 10% of new medical papers are being written by AI now—three years ago, that number was zero.
Nature Is Weird arxiv | Mar 23
We can now spot Alzheimer's early by looking at the brain like a building that’s literally buckling under the weight of toxic sludge.
Practical Magic arxiv | Mar 23
Massive wealth gaps might just be a math problem: if you always pick the better of two random options, inequality is basically guaranteed.
Nature Is Weird arxiv | Mar 23
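The dynamic described here, always keeping the better of two random draws, can be illustrated with a toy simulation (a generic sketch of the stated mechanism, not the paper's actual model; the agent count, noise scale, and round count are assumptions):

```python
import random
import math

def gini(w):
    """Gini coefficient of a list of non-negative wealths (0 = perfect equality)."""
    w = sorted(w)
    n = len(w)
    cum = sum(i * x for i, x in enumerate(w, 1))
    return (2 * cum) / (n * sum(w)) - (n + 1) / n

random.seed(0)
n_agents, n_rounds = 1000, 200
wealth = [1.0] * n_agents           # everyone starts identical

for _ in range(n_rounds):
    for i in range(n_agents):
        # Each agent sees two random multiplicative returns
        # and greedily takes the better one.
        a = math.exp(random.gauss(0.0, 0.1))
        b = math.exp(random.gauss(0.0, 0.1))
        wealth[i] *= max(a, b)

print(gini(wealth))  # well above 0: inequality emerges from identical rules
```

Even though every agent follows the same rule with the same odds, the variance of log-wealth grows with each round, so a heavy-tailed distribution (and a large Gini coefficient) is the generic outcome.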
Introduces a statistical alternative to the standard frequency-based BPE tokenization used in nearly all modern LLMs.
Paradigm Shift arxiv | Mar 23
Discovers a multiplicative scaling law governing how LLMs revise their beliefs during iterative reasoning (CoT, reflection).
Scaling Insight arxiv | Mar 23
Achieves state-of-the-art LLM distillation using 10-25% of the data required by standard fine-tuning.
Efficiency Breakthrough arxiv | Mar 23
Formally proves that a causal Transformer is mathematically equivalent to a stateless Differentiable Neural Computer.
Paradigm Shift arxiv | Mar 23
Accelerates MoE inference by speculating future experts to overlap CPU-GPU memory transfers with computation.
Efficiency Breakthrough arxiv | Mar 23
A self-improvement framework (MIPO) that improves LLM personalization and reasoning with zero additional data or human labels.
New Capability arxiv | Mar 23
Achieves 97% of Oracle reward performance using only 20% of the training labels for complex LLM reasoning.
Efficiency Breakthrough arxiv | Mar 23
The first Joint Embedding Predictive Architecture (JEPA) to train stably end-to-end from raw pixels with massive planning speedups.
Efficiency Breakthrough arxiv | Mar 23
Solves the compositional generalization failure of neural networks (0% to 100% accuracy) by embedding algebraic semiring constraints.
Paradigm Shift arxiv | Mar 23
A massive controlled study reveals that post-training algorithm rankings (DPO, SimPO, etc.) completely invert as models scale.
Scaling Insight arxiv | Mar 23
DAPA speeds up GELU computation by 16x and reduces hardware DSP utilization by 16x for on-device Transformer deployment.
Efficiency Breakthrough arxiv | Mar 23
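The general idea behind cheap on-device GELU, replacing the transcendental function with a small piecewise-linear lookup table, can be sketched as follows (a hypothetical illustration of the hardware-friendly approach, not DAPA's actual approximation; the table size and range are assumptions):

```python
import math

def gelu(x):
    """Exact GELU via the Gaussian CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Sample exact GELU on a small grid, then interpolate linearly between
# breakpoints -- a classic trick for avoiding DSP-heavy transcendentals.
LO, HI, N = -6.0, 6.0, 64
STEP = (HI - LO) / N
TABLE = [gelu(LO + i * STEP) for i in range(N + 1)]

def gelu_pwl(x):
    if x <= LO:
        return 0.0                  # GELU(x) ~ 0 for very negative x
    if x >= HI:
        return x                    # GELU(x) ~ x for large positive x
    i = min(int((x - LO) / STEP), N - 1)
    t = (x - (LO + i * STEP)) / STEP
    return TABLE[i] * (1 - t) + TABLE[i + 1] * t

max_err = max(abs(gelu(k / 100) - gelu_pwl(k / 100)) for k in range(-600, 601))
print(max_err)  # small: a 64-segment table is already accurate
```

A lookup plus one multiply-add per activation maps directly onto fixed-function hardware, which is where the DSP-utilization savings come from.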
Spectral Tempering achieves near-oracle embedding compression for dense retrieval without requiring any labeled data or grid searching.
Efficiency Breakthrough arxiv | Mar 23
Challenges the 80-year-old assumption that neurons must use weighted summation as their primary aggregation mechanism.
Paradigm Shift arxiv | Mar 23
Empirically proves that most Transformer layers are redundant, enabling a 54% training cost reduction through non-uniform budget allocation.
Efficiency Breakthrough arxiv | Mar 23
Warm-Start Flow Matching provides a guaranteed speedup for image/text generation by using lightweight models as initial priors.
Efficiency Breakthrough arxiv | Mar 23
VAMPO optimizes visual dynamics in video models using policy gradients to fix precision-critical errors in robotic manipulation.
New Capability arxiv | Mar 23
Debunks recent 'evaluation awareness' findings in LLMs by showing that linear probes are actually just tracking formatting artifacts.
Breaks Assumption arxiv | Mar 23
Introduces Hyperagents: self-referential systems where the meta-level modification logic is itself an editable program.
Paradigm Shift arxiv | Mar 23
Adaptive Layerwise Perturbation (ALP) solves the training-inference mismatch and importance ratio blowup in LLM reinforcement learning.
Efficiency Breakthrough arxiv | Mar 23
Fine-tunes Large Vision Language Models for medical tasks using only image-description pairs, bypassing the need for expensive expert-curated instructions.
Paradigm Shift arxiv | Mar 23
Introduces Any-Subgroup Equivariant Networks (ASEN), a single model that can adapt to multiple different symmetry groups via input modulation.
New Capability arxiv | Mar 23
ICLAD enables unified, in-context anomaly detection for tabular data across unsupervised, semi-supervised, and one-class regimes without weight updates.
New Capability arxiv | Mar 23
Expands formal reasoning beyond proof construction to the generation and formal verification of counterexamples in Lean 4.
New Capability arxiv | Mar 23
EvidenceRL uses reinforcement learning (GRPO) to explicitly optimize for evidence adherence, reducing hallucinations in high-stakes RAG pipelines.
Efficiency Breakthrough arxiv | Mar 23
MoCA3D predicts 3D bounding boxes from monocular images without requiring any camera intrinsics at inference time.
Breaks Assumption arxiv | Mar 23
Reveals that complex reasoning strategies like Chain-of-Thought (CoT) and Tree-of-Thought (ToT) provide negligible or even negative gains for text classification tasks.
Breaks Assumption arxiv | Mar 23
Formalizes the 'Neural Uncertainty Principle,' linking adversarial vulnerability in vision and hallucinations in LLMs to a shared geometric and information-theoretic origin.
Paradigm Shift arxiv | Mar 23
Accelerates diffusion-based image decoders by an order of magnitude using multi-scale sampling and one-step distillation.
Efficiency Breakthrough arxiv | Mar 23
CurveStream implements a curvature-aware hierarchical memory to handle streaming video in MLLMs without Out-of-Memory (OOM) errors.
New Capability arxiv | Mar 23
Proves the Key-Value (KV) cache is entirely redundant and can be bit-identically recomputed from the residual stream.
Breaks Assumption arxiv | Mar 23
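The core observation is easy to verify in miniature: keys and values are deterministic linear maps of the residual stream, so caching the stream is enough to rebuild the KV cache on demand (a toy single-layer sketch under assumed shapes, with layer norm omitted; not the paper's full pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head, seq_len = 16, 4, 10
Wk = rng.standard_normal((d_model, d_head))   # key projection
Wv = rng.standard_normal((d_model, d_head))   # value projection
X = rng.standard_normal((seq_len, d_model))   # residual-stream activations

# Conventional KV cache, stored during generation.
K_cache = X @ Wk
V_cache = X @ Wv

# Recomputed later from the cached residual stream alone:
# the same deterministic matmul yields bit-identical results.
K_recomputed = X @ Wk
print(np.array_equal(K_cache, K_recomputed))  # True
```

The trade-off is compute for memory: the residual stream is one `d_model`-sized vector per token, versus a key and value per head per layer.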
Reduces covariance tracking error by 30x by reformulating the problem as rigid-body motion on Lie groups.
Efficiency Breakthrough arxiv | Mar 23
A massive field study (9,000+ users) proves that algorithmic shifts can reduce affective polarization without sacrificing user engagement.
Paradigm Shift arxiv | Mar 23
Achieves a 19x reduction in inference cost and 16x in latency for agentic workflows by evolving hybrid LLM-and-code pipelines.
Efficiency Breakthrough arxiv | Mar 23
Reduces long-context inference latency by 26.4x using a training-free, structure-aware prompt compression framework.
Efficiency Breakthrough arxiv | Mar 23
Boosts open-model agent performance on web navigation tasks from 6.4% to 43%, surpassing proprietary models like GPT-4o.
New Capability arxiv | Mar 23
Proves that intuitive task similarity is a poor predictor of training data value for MLLMs and offers a highly accurate training-free alternative.
Breaks Assumption arxiv | Mar 23
Enables zero-shot humanoid robot interaction by generating robot-centric 'dream' videos instead of relying on human-to-robot motion retargeting.
Paradigm Shift arxiv | Mar 23
Introduces the first reinforcement learning framework to compress implicit reasoning steps in looped language models.
Efficiency Breakthrough arxiv | Mar 23
Replaces fixed context compression ratios with a performance-floor constraint to ensure reliable LLM deployment.
Paradigm Shift arxiv | Mar 23
Achieves O(1) time complexity for dense component attribution in SwiGLU Transformers using a single forward-backward pass.
Efficiency Breakthrough arxiv | Mar 23
First unified pipeline to reconstruct complete geometry, materials, and lighting from sparse views in under one second.
New Capability arxiv | Mar 23
Introduces the first inherently scalable primitive for radiance fields, allowing real-time Level-of-Detail (LOD) rendering by simply truncating Fourier coefficients.
New Capability arxiv | Mar 23
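Why truncating Fourier coefficients gives a natural level-of-detail control can be seen in a 1-D analogy (an assumed illustration of the general principle, not the paper's radiance-field primitive):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
signal = np.cumsum(rng.standard_normal(n))   # a rough 1-D "scene"
coeffs = np.fft.rfft(signal)                 # stored representation

def render(k):
    """Reconstruct using only the lowest-frequency k coefficients."""
    kept = np.zeros_like(coeffs)
    kept[:k] = coeffs[:k]
    return np.fft.irfft(kept, n=n)

full = render(len(coeffs))                   # all coefficients: exact
coarse = render(16)                          # truncated: smooth, low detail
err_full = np.abs(full - signal).max()
err_coarse = np.sqrt(np.mean((coarse - signal) ** 2))
```

By Parseval's theorem, reconstruction error shrinks monotonically as more coefficients are kept, so a renderer can pick any quality/speed point by choosing where to cut the coefficient list, with no retraining or extra data structures.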
FIPO overcomes reasoning length stagnation in LLMs by using Future-KL divergence to create dense rewards, extending Chain-of-Thought lengths to over 10,000 tokens.
Paradigm Shift arxiv | Mar 23
A training-free method to fix intra-modal misalignment in CLIP by decomposing projectors into an isotropic aligned subspace.
Efficiency Breakthrough arxiv | Mar 23
NASimJax provides a 100x throughput increase for autonomous penetration testing simulators by reimplementing the environment in JAX.
Efficiency Breakthrough arxiv | Mar 23
SCRL introduces the first negative supervision mechanism for Test-Time Reinforcement Learning, preventing LLMs from reinforcing 'consensus lies'.
New Capability arxiv | Mar 23
SAGE achieves state-of-the-art translation for low-resource languages while reducing training data requirements by 97.1% via RL-guided curation.
Efficiency Breakthrough arxiv | Mar 23
Memori reduces agent token costs by 20x by replacing raw conversation history with a persistent layer of semantic triples and summaries.
Efficiency Breakthrough arxiv | Mar 23
2K Retrofit enables 2K-resolution inference for any 3D geometric foundation model without modifying or retraining the backbone.
Efficiency Breakthrough arxiv | Mar 23
X-World is a controllable, action-conditioned multi-camera world model that simulates realistic future video observations for end-to-end driving.
New Capability arxiv | Mar 23
Breaking the 'capability ceiling' in LLM post-training by replacing full-history dependencies with explicit Markov states.
Paradigm Shift arxiv | Mar 23
A k-means variant that is up to 7x faster than FAISS and Scikit-Learn on CPUs and 4x faster than cuVS on GPUs.
Efficiency Breakthrough arxiv | Mar 23
Reduces the computational cost of Neural Architecture Search for ensembles from O(M) to O(1).
Efficiency Breakthrough arxiv | Mar 23
Enables LLMs to explore beyond their current distribution during RL by treating failed trajectories as hindsight guidance.
New Capability arxiv | Mar 23
Identifies 'critical times' in diffusion generation where targeted guidance pulses significantly improve image control.
Paradigm Shift arxiv | Mar 23
Exposes fundamental flaws in using LLM-based agents to evaluate automated interpretability and model circuits.
Breaks Assumption arxiv | Mar 23
Replaces unstable free-form recursive LLM code with a typed functional runtime grounded in lambda-calculus.
New Capability arxiv | Mar 23
Derives a variational ELBO for the Joint-Embedding Predictive Architecture (JEPA), unifying it with generative modeling.
Paradigm Shift arxiv | Mar 23
Enables zero-shot, directed protein generation by applying a simple scalar bias to stochastic attention samplers.
New Capability arxiv | Mar 23
Demonstrates that LLM reasoning capabilities drop sharply when tasks are framed within multi-turn dialogues vs isolated benchmarks.
Breaks Assumption arxiv | Mar 23
A comprehensive end-to-end workflow for humanoid loco-manipulation that standardizes sim-to-real transfer.
New Capability arxiv | Mar 23
Quantifies LLM uncertainty in a single generation pass without auxiliary models or repeated sampling.
Efficiency Breakthrough arxiv | Mar 23
Demonstrates that current 'faithfulness' metrics for Chain-of-Thought reasoning are highly subjective and vary wildly depending on the choice of classifier.
Breaks Assumption arxiv | Mar 23
Introduces a long-horizon video agent that uses 93% fewer frames than GPT-5/standalone LMMs while achieving higher accuracy.
Efficiency Breakthrough arxiv | Mar 23