AI & ML

You can now use a banana or a teddy bear as a digital puppet to make professional 3D animations.

Practical Magic arxiv | Mar 19

A study of 300,000 gym sets shows the old formulas for predicting max strength are completely wrong.

Paradigm Challenge arxiv | Mar 19

The first dedicated foundation model for electrodermal activity (EDA) data, released alongside the largest public dataset for physiological signal modeling.

Open Release arxiv | Mar 19

Introduces Capability-Priced Micro-Markets (CPMM), a micro-economic framework for autonomous AI agent transactions over HTTP 402.

Paradigm Shift arxiv | Mar 19

HoloByte is a tokenizer-free framework that projects byte sequences into a continuous hyperspherical manifold to bypass the morphological limits of discrete tokens.

Efficiency Breakthrough arxiv | Mar 19

Proposes Modulated Hazard-aware Policy Optimization (MHPO) to solve the instability and mode collapse common in GRPO-based reinforcement learning.

Paradigm Shift arxiv | Mar 19

AwaRes enables low-resolution Vision-Language Models to retrieve only the high-resolution image crops needed for a specific query via tool-calling.

Efficiency Breakthrough arxiv | Mar 19

Minimum-Action Learning achieves a 10,000x reduction in noise variance for symbolic physical law identification from observational data.

New Capability arxiv | Mar 19

Learns task-specific dense reward functions directly from images using vision foundation models, without requiring privileged simulator states.

New Capability arxiv | Mar 19

Uses SMT solvers to formally verify the physical consistency of tree-based ML models across their entire input domain.

Breaks Assumption arxiv | Mar 19

Provides a systematic profiling of VLM inference bottlenecks and releases 'recipes' that cut time-to-first-token by up to 93%.

Efficiency Breakthrough arxiv | Mar 19

Provides a formal proof and empirical evidence that Transformers can learn symbolic rules entirely absent from training, debunking the 'stochastic parrot' interpolation-only hypothesis.

Breaks Assumption arxiv | Mar 19

Introduces HopChain, a framework for synthesizing multi-hop vision-language reasoning data that yields generalizable gains across 20+ diverse benchmarks.

New Capability arxiv | Mar 19

Identifies a fundamental conflict in Direct Preference Optimization (DPO) for unified models, where image generation quality resists alignment while understanding improves.

Breaks Assumption arxiv | Mar 19

Mathematically proves that the Transformer architecture is functionally equivalent to a Bayesian Network performing loopy belief propagation.

Paradigm Shift arxiv | Mar 19

Democratizes dexterous robot data collection by enabling high-fidelity 21-DoF teleoperation using only a standard smartphone.

Open Release arxiv | Mar 19

Reveals that cross-lingual knowledge failure in large reasoning models is primarily a script-translation barrier rather than a linguistic or reasoning deficit.

Breaks Assumption arxiv | Mar 19

Shows that 'Mid-Training' on high-quality reasoning data is the primary driver of model capability, whereas RL only succeeds as a sparse refinement step.

Scaling Insight arxiv | Mar 19

Leverages cross-lingual inconsistencies to pinpoint exactly which experts in a Mixture-of-Experts (MoE) model store specific factual knowledge.

New Capability arxiv | Mar 19

Exposes 'hidden clones' in VLM ensembles, where models from the same family share correlated errors that naive voting mechanisms fail to detect.

Breaks Assumption arxiv | Mar 19

Proposes REAL, a reinforcement learning framework tailored for regression and ordinal scoring rather than simple binary accuracy.

New Capability arxiv | Mar 19

Introduces a framework for LLM agents to autonomously evolve their policies and skill libraries during system idle time without retraining downtime.

New Capability arxiv | Mar 19

A backbone-agnostic denoising objective that allows small GNNs to outperform large models pretrained on much larger supervised datasets in physical sciences.

Efficiency Breakthrough arxiv | Mar 19

Achieves high-performance online continual learning without the massive memory overhead of traditional experience replay buffers.

Paradigm Shift arxiv | Mar 19

Internal activation probing detects LLM 'rationalization' more reliably than monitoring the model's own Chain-of-Thought (CoT).

Breaks Assumption arxiv | Mar 19

A dynamic data pruning framework that cuts dense retriever training time by 50% while actually improving retrieval accuracy.

Efficiency Breakthrough arxiv | Mar 19

Automates the generation of synthetic machine learning challenges to train agents that can genuinely learn research skills from doing.

New Capability arxiv | Mar 19

Alignment processes induce a 'normative bias' that makes LLMs worse at predicting real human behavior in strategic scenarios.

Breaks Assumption arxiv | Mar 19

Enables reliable, training-free emotion steering in speech-generative audio models via direct manipulation of specific emotion-sensitive neurons.

New Capability arxiv | Mar 19

A formal, graph-native memory architecture that treats agent memory as a versioned asset, dramatically outperforming Gemini 2.5 Pro on complex recall.

Paradigm Shift arxiv | Mar 19

A framework to quantify and fix 'task steerability,' the common failure of robots to respond to new instructions while mid-task.

New Capability arxiv | Mar 19

Achieves up to a 1,000x gain in RLHF data efficiency by using information-directed exploration and epistemic neural networks.

Efficiency Breakthrough arxiv | Mar 19

Introduces a reward framework that reduces LLM reasoning verbosity by optimizing for 'Information Density' via entropy reduction per step.

Efficiency Breakthrough arxiv | Mar 19

Shifts retrieval from static contrastive vector alignment to dynamic reasoning trajectories using a generative model (T1) and GRPO.

Paradigm Shift arxiv | Mar 19

Identifies that reasoning-induced safety failures occur *during* Chain-of-Thought and proposes a shift to 'decide-then-reason' architectures.

Breaks Assumption arxiv | Mar 19

Generates 9 million grid points of 3D spatiotemporal physical fields in seconds, a 10,000x speedup over traditional physics simulations.

Efficiency Breakthrough arxiv | Mar 19

Proposes a world model that jointly generates appearance and binocular geometry using an epipolar-aware attention mechanism.

New Capability arxiv | Mar 19

Introduces FineViT and a 450M local caption dataset to solve the 'coarse perception' bottleneck in current CLIP-based encoders.

Open Release arxiv | Mar 19

Provides a sheaf-theoretic proof that local causal consistency in generative models does not guarantee global counterfactual coherence.

Paradigm Shift arxiv | Mar 19

Replaces quadratic self-attention with $O(N \log N)$ phase-native coupling for time-series, enabling massive context windows.

Efficiency Breakthrough arxiv | Mar 19

Introduces a paradigm for vision-language navigation that uses ubiquitously available semantic floor plans as global spatial priors.

New Capability arxiv | Mar 19

Embeds invisible, agent-specific 'watermarks' into token distributions to enable forensic attribution and topology reconstruction in multi-agent systems.

New Capability arxiv | Mar 19

Achieves an 80% reduction in Chain-of-Thought (CoT) tokens while slightly increasing reasoning accuracy.

Efficiency Breakthrough arxiv | Mar 19

Extends LLM context from 32K to 128K by teaching models to selectively skip global attention for ~80% of tokens.

Efficiency Breakthrough arxiv | Mar 19

Reduces hallucinations by teaching models 'epistemological humility' (the ability to admit they don't know something) using synthetic non-existent terms.

New Capability arxiv | Mar 19

Develops a zero-watermarking framework that survives AI editing by leveraging invariant relations between image patches.

Breaks Assumption arxiv | Mar 19

Unifies large-scale search, recommendation, and reasoning into a single self-contained LLM by treating item IDs as a distinct modality.

Paradigm Shift arxiv | Mar 19

Video fine-tuning consistently degrades static image understanding in multimodal LLMs, revealing a zero-sum trade-off between spatial and temporal capabilities.

Scaling Insight arxiv | Mar 19

Introduces a Prompt-Free Universal Region Proposal Network (PF-RPN) that identifies objects in any domain without needing text or image exemplars.

New Capability arxiv | Mar 19

FrescoDiffusion enables coherent, 4K image-to-video generation using a training-free, tiled diffusion method with precomputed latent priors.

New Capability arxiv | Mar 19

Knowledge-Aware Active Learning (KA2L) uses latent space probing to identify what an LLM doesn't know and generates targeted synthetic questions.

Efficiency Breakthrough arxiv | Mar 19

Dense retrieval architectures are fundamentally flawed at detecting negation and contradictions due to 'Semantic Collapse' in vector space.

Breaks Assumption arxiv | Mar 19

Edit-As-Act reframes 3D scene editing as a goal-regressive planning problem using symbolic action languages rather than purely generative pixel manipulation.

Paradigm Shift arxiv | Mar 19

ARES demonstrates high-fidelity data reconstruction from large Federated Learning batches without requiring any architectural modifications to the model.

Breaks Assumption arxiv | Mar 19

Mechanistic probing reveals a directional asymmetry in how LLMs encode hierarchy: hypernymy is redundant and resilient, while hyponymy is fragile and compact.

Scaling Insight arxiv | Mar 19

S-VGGT introduces structure-aware subscene decomposition to break the quadratic scaling bottleneck of 3D foundation models.

Efficiency Breakthrough arxiv | Mar 19

Introduces a framework to generate complex, non-linear environments with mathematically guaranteed ground-truth optimal policies for RL benchmarking.

New Capability arxiv | Mar 19

DSS-GAN is the first generative adversarial network to use a Mamba (State Space Model) backbone for high-quality image synthesis.

Efficiency Breakthrough arxiv | Mar 19

VectorWorld enables stable, real-time 1km+ closed-loop world model rollouts for autonomous driving using diffusion flow on vector graphs.

New Capability arxiv | Mar 19

REAL achieves extreme quadruped parkour agility that is robust even to a 1-meter visual blind zone.

New Capability arxiv | Mar 19

FINER discovers that MLLMs are highly prone to hallucination when images contain fine-grained mismatches co-occurring with real elements.

Breaks Assumption arxiv | Mar 19

Synthetic videos of simple geometric shapes are more effective than massive real-world datasets for teaching video-language models fundamental temporal reasoning.

Efficiency Breakthrough arxiv | Mar 19

Lifting 2D features into a volumetric representation for robot manipulation policies yields a 14.8% success rate improvement by solving the 2D-3D spatial reasoning mismatch.

New Capability arxiv | Mar 19

A new self-refining surrogate framework enables neural models to simulate complex dynamical systems over arbitrarily long horizons without the standard failure mode of compounding error.

Paradigm Shift arxiv | Mar 19

Massive activation outliers in Transformers are an adaptive response to 'gradient sinks' during training, rather than just an inference-time quirk.

Breaks Assumption arxiv | Mar 19

The 'consensus trap' in label-free RL—where models reinforce their own systematic errors—can be broken by co-evolving the model in alternating generator and verifier roles.

Paradigm Shift arxiv | Mar 19

In-context memory for LLMs is fundamentally unreliable due to compaction loss and goal drift, but structured 'Knowledge Objects' provide a 252x cheaper and 100% accurate alternative.

Breaks Assumption arxiv | Mar 19

Anomaly detection can be performed directly using a primary model's internal neuron output ranges, eliminating the need for expensive external AD models.

Efficiency Breakthrough arxiv | Mar 19

Truncated backpropagation for video decoding reduces the memory cost of fine-tuning video diffusion models from linear to constant.

Efficiency Breakthrough arxiv | Mar 19

Concept erasure in text-to-image models is largely a facade that can be bypassed using text-free inversion attacks.

Breaks Assumption arxiv | Mar 19

LLMs compute and cache confidence scores automatically during answer generation, well before they are prompted to verbalize them.

Paradigm Shift arxiv | Mar 19

ProbeFlow achieves 14.8x faster action decoding in Vision-Language-Action (VLA) models without any retraining.

Efficiency Breakthrough arxiv | Mar 19

DebugLM allows developers to trace an LLM's specific behaviors back to individual training data sources.

New Capability arxiv | Mar 19

Measuring the distance between human languages can now be done quantitatively using the attention mechanisms of multilingual transformers.

Paradigm Shift arxiv | Mar 19

Large Language Models can maintain performance with only 16-64 unique weight values per matrix, as only the relative rank of weights matters.

Breaks Assumption arxiv | Mar 19

Parallel multi-token prediction can be achieved in standard LLMs without training auxiliary models or modifying weights.

Efficiency Breakthrough arxiv | Mar 19

CARE provides a recipe for converting standard GQA models into high-efficiency Multi-head Latent Attention (MLA) architectures.

Efficiency Breakthrough arxiv | Mar 19

VideoAtlas enables navigation and reasoning over long-form video using compute that scales only logarithmically with video length.

Efficiency Breakthrough arxiv | Mar 19

Enforces formal safety and Signal Temporal Logic (STL) constraints on robotics foundation models without retraining.

New Capability arxiv | Mar 19

MUD provides a faster, lower-overhead alternative to Muon for transformer training, achieving up to 2.6x higher throughput.

Efficiency Breakthrough arxiv | Mar 19

LoST introduces a semantic-first 3D tokenizer that reduces the token count for 3D shape generation by up to 99.9%.

Efficiency Breakthrough arxiv | Mar 19

AgentFactory shifts agent evolution from unreliable textual 'reflections' to a library of verifiable, executable Python subagents.

Paradigm Shift arxiv | Mar 19

SkeletonLLM allows frozen Multimodal LLMs to reason about human motion by rendering skeleton sequences into their native visual modality.

New Capability arxiv | Mar 19

DAPS++ reinterprets diffusion inverse problems as a decoupled EM-style initialization, significantly increasing restoration speed and stability.

Paradigm Shift arxiv | Mar 19

Motion-MLLM integrates IMU egomotion data into Video-LLMs to solve the fundamental scale and spatial reasoning ambiguities of purely visual models.

New Capability arxiv | Mar 19

Provides the first theoretical proof that Graph Transformers structurally prevent the 'oversmoothing' failure mode inherent to deep GCNs.

Scaling Insight arxiv | Mar 19

Imagine an AI virus that doesn't just sit there: it copies itself and jumps from one AI to the next all on its own.

First Ever arxiv | Mar 18

A new VR headset uses mirrors to kill the lag that makes you want to puke.

Practical Magic arxiv | Mar 18

These tiny sliding antennas are hacking the laws of physics to give you a perfect signal where your phone usually dies.

Nature Is Weird arxiv | Mar 18

New AI can peer into a computer chip's microscopic guts to find "spy tech" hidden by sketchy manufacturers.

Practical Magic arxiv | Mar 18

Researchers built a "ghost mode" for robots that calculates the exact path to sneak around without being seen.

Practical Magic arxiv | Mar 18

Turns out the long lines at airport security were secretly keeping the whole U.S. flight network from crashing for the last decade.

Paradigm Challenge arxiv | Mar 18

RSM achieves 20x faster training for recursive reasoning models and enables test-time scaling for up to 20,000 refinement steps.

Efficiency Breakthrough arxiv | Mar 18

A factorial study on EHR foundation models reveals that joint encoding of code-attribute pairs (local binding) is the primary driver of performance and efficiency.

Scaling Insight arxiv | Mar 18

Alternating Reinforcement Learning with Rubric Rewards (ARL-RR) replaces brittle scalar reward aggregation with a semantic meta-class optimization framework.

Paradigm Shift arxiv | Mar 18

Self-reflective program search matches or outperforms recursive language models for long-context tasks, suggesting recursion itself is not the primary driver of performance.

Breaks Assumption arxiv | Mar 18

Dynamic Representational Circuit Breaking (DRCB) introduces an architectural defense against steganographic collusion in multi-agent RL by monitoring and shuffling latent communication bottlenecks.

New Capability arxiv | Mar 18

Theoretical and empirical evidence suggests that the 'Key' mechanism in Attention may be redundant, proposing a 'QV' paradigm that simplifies Transformer architectures.

Breaks Assumption arxiv | Mar 18

Atlas introduces 'Compiled Memory,' which rewrites an agent's system prompt with distilled task experience rather than using RAG or fine-tuning.

Paradigm Shift arxiv | Mar 18

Latent Posterior Factors (LPF) bridge neural representations with structured probabilistic reasoning by converting VAE posteriors into factors for Sum-Product Networks.

New Capability arxiv | Mar 18