You can now use a banana or a teddy bear as a digital puppet to make professional 3D animations.
Practical Magic arxiv | Mar 19
A study of 300,000 gym sets shows the old formulas for predicting max strength are completely wrong.
Paradigm Challenge arxiv | Mar 19
The first dedicated foundation model for electrodermal activity (EDA) data, released alongside the largest public dataset for physiological signal modeling.
Open Release arxiv | Mar 19
Introduces Capability-Priced Micro-Markets (CPMM), a micro-economic framework for autonomous AI agent transactions over HTTP 402.
Paradigm Shift arxiv | Mar 19
HoloByte is a tokenizer-free framework that projects byte sequences into a continuous hyperspherical manifold to bypass the morphological limits of discrete tokens.
Efficiency Breakthrough arxiv | Mar 19
Proposes Modulated Hazard-aware Policy Optimization (MHPO) to solve the instability and mode collapse common in GRPO-based reinforcement learning.
Paradigm Shift arxiv | Mar 19
AwaRes enables low-resolution Vision-Language Models to retrieve only the high-resolution image crops needed for a specific query via tool-calling.
Efficiency Breakthrough arxiv | Mar 19
Minimum-Action Learning achieves a 10,000x reduction in noise variance for symbolic physical law identification from observational data.
New Capability arxiv | Mar 19
Learns task-specific dense reward functions directly from images using vision foundation models, without requiring privileged simulator states.
New Capability arxiv | Mar 19
Uses SMT solvers to formally verify the physical consistency of tree-based ML models across their entire input domain.
Breaks Assumption arxiv | Mar 19
Provides a systematic profiling of VLM inference bottlenecks and releases 'recipes' that cut time-to-first-token by up to 93%.
Efficiency Breakthrough arxiv | Mar 19
Provides a formal proof and empirical evidence that Transformers can learn symbolic rules entirely absent from the training data, debunking the 'stochastic parrot' interpolation-only hypothesis.
Breaks Assumption arxiv | Mar 19
Introduces HopChain, a framework for synthesizing multi-hop vision-language reasoning data that yields generalizable gains across 20+ diverse benchmarks.
New Capability arxiv | Mar 19
Identifies a fundamental conflict in Direct Preference Optimization (DPO) for unified models, where image generation quality resists alignment while understanding improves.
Breaks Assumption arxiv | Mar 19
Mathematically proves that the Transformer architecture is functionally equivalent to a Bayesian Network performing loopy belief propagation.
Paradigm Shift arxiv | Mar 19
Democratizes dexterous robot data collection by enabling high-fidelity 21-DoF teleoperation using only a standard smartphone.
Open Release arxiv | Mar 19
Reveals that cross-lingual knowledge failure in large reasoning models is primarily a script-translation barrier rather than a linguistic or reasoning deficit.
Breaks Assumption arxiv | Mar 19
Shows that 'Mid-Training' on high-quality reasoning data is the primary driver of model capability, whereas RL only succeeds as a sparse refinement step.
Scaling Insight arxiv | Mar 19
Leverages cross-lingual inconsistencies to pinpoint exactly which experts in a Mixture-of-Experts (MoE) model store specific factual knowledge.
New Capability arxiv | Mar 19
Exposes 'hidden clones' in VLM ensembles, where models from the same family share correlated errors that naive voting mechanisms fail to detect.
Breaks Assumption arxiv | Mar 19
Proposes REAL, a Reinforcement Learning framework tailored for regression and ordinal scoring rather than simple binary accuracy.
New Capability arxiv | Mar 19
Introduces a framework for LLM agents to autonomously evolve their policies and skill libraries during system idle time without retraining downtime.
New Capability arxiv | Mar 19
A backbone-agnostic denoising objective that allows small GNNs to outperform large models pretrained on much larger supervised datasets in physical sciences.
Efficiency Breakthrough arxiv | Mar 19
Achieves high-performance online continual learning without the massive memory overhead of traditional experience replay buffers.
Paradigm Shift arxiv | Mar 19
Internal activation probing detects LLM 'rationalization' more reliably than monitoring the model's own Chain-of-Thought (CoT).
Breaks Assumption arxiv | Mar 19
A dynamic data pruning framework that cuts dense retriever training time by 50% while actually improving retrieval accuracy.
Efficiency Breakthrough arxiv | Mar 19
Automates the generation of synthetic machine learning challenges to train agents that genuinely learn research skills by doing.
New Capability arxiv | Mar 19
Alignment processes induce a 'normative bias' that makes LLMs worse at predicting real human behavior in strategic scenarios.
Breaks Assumption arxiv | Mar 19
Enables reliable, training-free emotion steering in speech-generative audio models via direct manipulation of specific emotion-sensitive neurons.
New Capability arxiv | Mar 19
A formal, graph-native memory architecture that treats agent memory as a versioned asset, dramatically outperforming Gemini 2.5 Pro on complex recall.
Paradigm Shift arxiv | Mar 19
A framework to quantify and fix 'task steerability,' the common failure of robots to respond to new instructions mid-task.
New Capability arxiv | Mar 19
Achieves up to a 1,000x gain in RLHF data efficiency by using information-directed exploration and epistemic neural networks.
Efficiency Breakthrough arxiv | Mar 19
Introduces a reward framework that reduces LLM reasoning verbosity by optimizing for 'Information Density' via entropy reduction per step.
Efficiency Breakthrough arxiv | Mar 19
Shifts retrieval from static contrastive vector alignment to dynamic reasoning trajectories using a generative model (T1) and GRPO.
Paradigm Shift arxiv | Mar 19
Identifies that reasoning-induced safety failures occur *during* Chain-of-Thought and proposes a shift to 'decide-then-reason' architectures.
Breaks Assumption arxiv | Mar 19
Generates 9 million grid points of 3D spatiotemporal physical fields in seconds, a 10,000x speedup over traditional physics simulations.
Efficiency Breakthrough arxiv | Mar 19
Proposes a world model that jointly generates appearance and binocular geometry using an epipolar-aware attention mechanism.
New Capability arxiv | Mar 19
Introduces FineViT and a 450M local caption dataset to solve the 'coarse perception' bottleneck in current CLIP-based encoders.
Open Release arxiv | Mar 19
Provides a sheaf-theoretic proof that local causal consistency in generative models does not guarantee global counterfactual coherence.
Paradigm Shift arxiv | Mar 19
Replaces quadratic self-attention with $O(N \log N)$ phase-native coupling for time-series, enabling massive context windows.
Efficiency Breakthrough arxiv | Mar 19
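The headline does not specify what 'phase-native coupling' looks like, but the $O(N \log N)$ claim is characteristic of spectral token mixing: replace the pairwise attention matrix with pointwise multiplication in the frequency domain. A minimal, generic sketch of that idea (not the paper's method; the `filt` filter stands in for what would be a learned parameter):

```python
import numpy as np

def spectral_mixing(x: np.ndarray, filt: np.ndarray) -> np.ndarray:
    """Mix information across all N timesteps in O(N log N) via the FFT,
    instead of forming an O(N^2) attention matrix.

    x:    (n_steps, d_model) real-valued sequence
    filt: (n_steps // 2 + 1,) complex frequency-domain filter
          (a learnable global filter in a real model)
    """
    X = np.fft.rfft(x, axis=0)                       # (n_steps//2 + 1, d_model)
    return np.fft.irfft(X * filt[:, None], n=x.shape[0], axis=0)

n_steps, d_model = 4096, 8
x = np.random.default_rng(2).normal(size=(n_steps, d_model))
filt = np.ones(n_steps // 2 + 1, dtype=complex)      # identity filter: y == x
y = spectral_mixing(x, filt)
```

Because the only sequence-length-dependent cost is the FFT, context windows can grow far beyond what quadratic attention affords.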
Introduces a paradigm for vision-language navigation that uses ubiquitously available semantic floor plans as global spatial priors.
New Capability arxiv | Mar 19
Embeds invisible, agent-specific 'watermarks' into token distributions to enable forensic attribution and topology reconstruction in multi-agent systems.
New Capability arxiv | Mar 19
Achieves an 80% reduction in Chain-of-Thought (CoT) tokens while slightly increasing reasoning accuracy.
Efficiency Breakthrough arxiv | Mar 19
Extends LLM context from 32K to 128K by teaching models to selectively skip global attention for ~80% of tokens.
Efficiency Breakthrough arxiv | Mar 19
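Skipping global attention for ~80% of tokens amounts to a hybrid sparse mask: most rows attend only within a local window, while a minority keep full attention. The paper presumably learns which tokens to skip; the sketch below uses a fixed stride purely to illustrate the mask shape and the resulting sparsity budget:

```python
import numpy as np

def hybrid_attention_mask(n_tokens: int, window: int, global_every: int) -> np.ndarray:
    """Boolean attention mask: every token attends locally within `window`;
    roughly one token in `global_every` keeps full (global) attention."""
    mask = np.zeros((n_tokens, n_tokens), dtype=bool)
    idx = np.arange(n_tokens)
    # Local band: token i may attend to j whenever |i - j| <= window.
    mask |= np.abs(idx[:, None] - idx[None, :]) <= window
    # Global rows: a strided subset of tokens attends everywhere.
    mask[idx % global_every == 0, :] = True
    return mask

m = hybrid_attention_mask(1024, window=64, global_every=5)
frac_global = np.mean(m.all(axis=1))   # fraction of fully-global rows, ~0.2
```

With `global_every=5`, about 20% of tokens pay the full attention cost and the rest pay only the local-window cost, which is the 80/20 split the headline describes.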
Reduces hallucinations by teaching models 'epistemological humility'—the ability to admit they don't know something—using synthetic non-existent terms.
New Capability arxiv | Mar 19
Develops a zero-watermarking framework that survives AI editing by leveraging invariant relations between image patches.
Breaks Assumption arxiv | Mar 19
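Zero-watermarking derives a signature from the image itself rather than embedding anything into it, so the signature survives edits as long as the chosen relations are invariant. A toy sketch of the idea (the actual invariants in the paper are unspecified): encode the relative ordering of patch mean intensities, which is unchanged by any positive affine brightness edit.

```python
import numpy as np

def patch_order_signature(img: np.ndarray, grid: int = 4) -> np.ndarray:
    """Zero-watermark bitstring from the relative ordering of patch mean
    intensities; invariant under positive affine brightness changes."""
    h, w = img.shape
    ph, pw = h // grid, w // grid
    means = np.array([img[i*ph:(i+1)*ph, j*pw:(j+1)*pw].mean()
                      for i in range(grid) for j in range(grid)])
    # One bit per adjacent patch pair: is the earlier patch brighter?
    return (means[:-1] > means[1:]).astype(np.uint8)

rng = np.random.default_rng(3)
img = rng.uniform(0.0, 1.0, size=(64, 64))
sig = patch_order_signature(img)
edited = img * 1.3 + 0.05              # affine brightness edit: sig unchanged
```

Real schemes would use invariants robust to a much broader class of AI edits, but the contract is the same: verification compares recomputed signatures, so there is nothing in the pixels for an editor to destroy.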
Unifies large-scale search, recommendation, and reasoning into a single self-contained LLM by treating item IDs as a distinct modality.
Paradigm Shift arxiv | Mar 19
Video fine-tuning consistently degrades static image understanding in multimodal LLMs, revealing a zero-sum trade-off between spatial and temporal capabilities.
Scaling Insight arxiv | Mar 19
Introduces a Prompt-Free Universal Region Proposal Network (PF-RPN) that identifies objects in any domain without needing text or image exemplars.
New Capability arxiv | Mar 19
FrescoDiffusion enables coherent, 4K image-to-video generation using a training-free, tiled diffusion method with precomputed latent priors.
New Capability arxiv | Mar 19
Knowledge-Aware Active Learning (KA2L) uses latent space probing to identify what an LLM doesn't know and generates targeted synthetic questions.
Efficiency Breakthrough arxiv | Mar 19
Dense retrieval architectures fundamentally fail to detect negation and contradiction due to 'Semantic Collapse' in vector space.
Breaks Assumption arxiv | Mar 19
Edit-As-Act reframes 3D scene editing as a goal-regressive planning problem using symbolic action languages rather than purely generative pixel manipulation.
Paradigm Shift arxiv | Mar 19
ARES demonstrates high-fidelity data reconstruction from large Federated Learning batches without requiring any architectural modifications to the model.
Breaks Assumption arxiv | Mar 19
Mechanistic probing reveals a directional asymmetry in how LLMs encode hierarchy: hypernymy is redundant and resilient, while hyponymy is fragile and compact.
Scaling Insight arxiv | Mar 19
S-VGGT introduces structure-aware subscene decomposition to break the quadratic scaling bottleneck of 3D foundation models.
Efficiency Breakthrough arxiv | Mar 19
Introduces a framework to generate complex, non-linear environments with mathematically guaranteed ground-truth optimal policies for RL benchmarking.
New Capability arxiv | Mar 19
DSS-GAN is the first generative adversarial network to use a Mamba (State Space Model) backbone for high-quality image synthesis.
Efficiency Breakthrough arxiv | Mar 19
VectorWorld enables stable, real-time 1km+ closed-loop world model rollouts for autonomous driving using diffusion flow on vector graphs.
New Capability arxiv | Mar 19
REAL achieves extreme quadruped parkour agility that is robust even to a 1-meter visual blind zone.
New Capability arxiv | Mar 19
FINER discovers that MLLMs are highly prone to hallucination when images contain fine-grained mismatches co-occurring with real elements.
Breaks Assumption arxiv | Mar 19
Synthetic videos of simple geometric shapes are more effective than massive real-world datasets for teaching video-language models fundamental temporal reasoning.
Efficiency Breakthrough arxiv | Mar 19
Lifting 2D features into a volumetric representation for robot manipulation policies yields a 14.8% success rate improvement by solving the 2D-3D spatial reasoning mismatch.
New Capability arxiv | Mar 19
A new self-refining surrogate framework enables neural models to simulate complex dynamical systems over arbitrarily long horizons without the standard failure mode of compounding error.
Paradigm Shift arxiv | Mar 19
Massive activation outliers in Transformers are an adaptive response to 'gradient sinks' during training, rather than just an inference-time quirk.
Breaks Assumption arxiv | Mar 19
The 'consensus trap' in label-free RL—where models reinforce their own systematic errors—can be broken by co-evolving the model in alternating generator and verifier roles.
Paradigm Shift arxiv | Mar 19
In-context memory for LLMs is fundamentally unreliable due to compaction loss and goal drift, but structured 'Knowledge Objects' provide a 252x cheaper and 100% accurate alternative.
Breaks Assumption arxiv | Mar 19
Anomaly detection can be performed directly using a primary model's internal neuron output ranges, eliminating the need for expensive external AD models.
Efficiency Breakthrough arxiv | Mar 19
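The core trick is that the primary model's own activation statistics can serve as the detector. A minimal sketch of one common variant of this idea (the paper's exact scoring rule is not given): record each neuron's observed [min, max] range on normal data, then flag inputs that drive any neuron out of range.

```python
import numpy as np

class ActivationRangeDetector:
    """Flags inputs whose hidden activations fall outside the per-neuron
    [min, max] ranges observed on normal data; no external AD model."""

    def fit(self, activations: np.ndarray) -> "ActivationRangeDetector":
        # activations: (n_samples, n_neurons) taken from the primary model.
        self.lo = activations.min(axis=0)
        self.hi = activations.max(axis=0)
        return self

    def score(self, activations: np.ndarray) -> np.ndarray:
        # Per-sample count of out-of-range neurons; 0 means in-distribution.
        out = (activations < self.lo) | (activations > self.hi)
        return out.sum(axis=1)

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(1000, 32))      # stand-in for real activations
det = ActivationRangeDetector().fit(normal)
anomalous = rng.normal(0.0, 5.0, size=(10, 32))     # much wider spread
```

Since the ranges are collected from activations the model computes anyway, the marginal inference cost is a handful of elementwise comparisons.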
Truncated backpropagation for video decoding reduces the memory cost of fine-tuning video diffusion models from linear to constant.
Efficiency Breakthrough arxiv | Mar 19
Concept erasure in text-to-image models is largely a facade that can be bypassed using text-free inversion attacks.
Breaks Assumption arxiv | Mar 19
LLMs compute and cache confidence scores automatically during answer generation, well before they are prompted to verbalize them.
Paradigm Shift arxiv | Mar 19
ProbeFlow achieves 14.8x faster action decoding in Vision-Language-Action (VLA) models without any retraining.
Efficiency Breakthrough arxiv | Mar 19
DebugLM allows developers to trace an LLM's specific behaviors back to individual training data sources.
New Capability arxiv | Mar 19
The distance between human languages can now be measured quantitatively using the attention mechanisms of multilingual transformers.
Paradigm Shift arxiv | Mar 19
Large Language Models can maintain performance with only 16-64 unique weight values per matrix, as only the relative rank of weights matters.
Breaks Assumption arxiv | Mar 19
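If only the relative rank of weights matters, a matrix can be collapsed onto a handful of shared values by any monotone mapping. A generic sketch of that claim (not the paper's procedure): quantile-bucket each matrix into k shared values, which preserves the ordering of all entries while leaving at most k distinct weights.

```python
import numpy as np

def rank_preserving_quantize(W: np.ndarray, k: int = 16) -> np.ndarray:
    """Map a weight matrix onto at most k shared values while keeping the
    relative ordering of entries intact (a monotone, many-to-one mapping)."""
    flat = W.ravel()
    # Bucket boundaries at evenly spaced quantiles of the weight distribution.
    edges = np.quantile(flat, np.linspace(0.0, 1.0, k + 1))
    buckets = np.clip(np.searchsorted(edges, flat, side="right") - 1, 0, k - 1)
    # Represent each bucket by the mean of its members.
    values = np.array([flat[buckets == b].mean() for b in range(k)])
    return values[buckets].reshape(W.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))          # stand-in for a trained weight matrix
Wq = rank_preserving_quantize(W, k=16)
```

A 16-value codebook needs only 4 bits per weight plus the shared values, which is where the compression payoff of the headline's claim would come from.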
Parallel multi-token prediction can be achieved in standard LLMs without training auxiliary models or modifying weights.
Efficiency Breakthrough arxiv | Mar 19
CARE provides a recipe for converting standard GQA models into high-efficiency Multi-head Latent Attention (MLA) architectures.
Efficiency Breakthrough arxiv | Mar 19
VideoAtlas enables navigation and reasoning over long-form video using compute that scales only logarithmically with video length.
Efficiency Breakthrough arxiv | Mar 19
Enforces formal safety and Signal Temporal Logic (STL) constraints on robotics foundation models without retraining.
New Capability arxiv | Mar 19
MUD provides a faster, lower-overhead alternative to Muon for transformer training, achieving up to 2.6x higher throughput.
Efficiency Breakthrough arxiv | Mar 19
LoST introduces a semantic-first 3D tokenizer that reduces the token count for 3D shape generation by up to 99.9%.
Efficiency Breakthrough arxiv | Mar 19
AgentFactory shifts agent evolution from unreliable textual 'reflections' to a library of verifiable, executable Python subagents.
Paradigm Shift arxiv | Mar 19
SkeletonLLM allows frozen Multimodal LLMs to reason about human motion by rendering skeleton sequences into their native visual modality.
New Capability arxiv | Mar 19
DAPS++ reinterprets diffusion inverse problems as a decoupled EM-style initialization, significantly increasing restoration speed and stability.
Paradigm Shift arxiv | Mar 19
Motion-MLLM integrates IMU egomotion data into Video-LLMs to solve the fundamental scale and spatial reasoning ambiguities of purely visual models.
New Capability arxiv | Mar 19
Provides the first theoretical proof that Graph Transformers structurally prevent the 'oversmoothing' failure mode inherent to deep GCNs.
Scaling Insight arxiv | Mar 19
Imagine an AI virus that doesn't just sit there—it copies itself and jumps from one AI to the next all on its own.
First Ever arxiv | Mar 18
A new VR headset uses mirrors to kill the lag that makes you want to puke.
Practical Magic arxiv | Mar 18
These tiny sliding antennas are hacking the laws of physics to give you a perfect signal where your phone usually dies.
Nature Is Weird arxiv | Mar 18
New AI can peer into a computer chip's microscopic guts to find "spy tech" hidden by sketchy manufacturers.
Practical Magic arxiv | Mar 18
Researchers built a "ghost mode" for robots that calculates the exact path to sneak around without being seen.
Practical Magic arxiv | Mar 18
Turns out the long lines at airport security were secretly keeping the whole U.S. flight network from crashing for the last decade.
Paradigm Challenge arxiv | Mar 18
RSM achieves 20x faster training for recursive reasoning models and enables test-time scaling for up to 20,000 refinement steps.
Efficiency Breakthrough arxiv | Mar 18
A factorial study on EHR foundation models reveals that joint encoding of code-attribute pairs (local binding) is the primary driver of performance and efficiency.
Scaling Insight arxiv | Mar 18
Alternating Reinforcement Learning with Rubric Rewards (ARL-RR) replaces brittle scalar reward aggregation with a semantic meta-class optimization framework.
Paradigm Shift arxiv | Mar 18
Self-reflective program search matches or outperforms recursive language models for long-context tasks, suggesting recursion itself is not the primary driver of performance.
Breaks Assumption arxiv | Mar 18
Dynamic Representational Circuit Breaking (DRCB) introduces an architectural defense against steganographic collusion in multi-agent RL by monitoring and shuffling latent communication bottlenecks.
New Capability arxiv | Mar 18
Theoretical and empirical evidence suggests that the 'Key' mechanism in Attention may be redundant, motivating a 'QV' paradigm that simplifies Transformer architectures.
Breaks Assumption arxiv | Mar 18
Atlas introduces 'Compiled Memory,' which rewrites an agent's system prompt with distilled task experience rather than using RAG or fine-tuning.
Paradigm Shift arxiv | Mar 18
Latent Posterior Factors (LPF) bridge neural representations with structured probabilistic reasoning by converting VAE posteriors into factors for Sum-Product Networks.
New Capability arxiv | Mar 18