Learns stable, interpretable Koopman generators for nonlinear PDEs from trajectory data alone without any physics supervision.
Paradigm Shift arxiv | Apr 1
A massive 270K-sample multi-view video corpus specifically for embodied AI agents in complex retail environments.
Open Release arxiv | Apr 1
Introduces a scalable reinforcement learning framework that enables high-fidelity control of a whole-body human musculoskeletal system with over 700 muscles.
New Capability arxiv | Apr 1
Proposes 'Nomad', an exploration-first agent architecture that autonomously discovers insights in data without being limited by human prompts or questions.
New Capability arxiv | Apr 1
Reveals that many massive LLM benchmarks provide highly redundant information, with major leaderboards often containing only ~2 independent axes of measurement.
Breaks Assumption arxiv | Apr 1
Provides a robust solution for anti-aliasing in Feed-forward Gaussian Splatting, enabling high-fidelity rendering across varying sampling rates and resolutions.
New Capability arxiv | Apr 1
Uses token-level perplexity analysis to prove that LLMs rely on simple heuristics rather than the linguistic reasoning they appear to exhibit on standard benchmarks.
Breaks Assumption arxiv | Apr 1
Demonstrates that most 'failures' of AI agents on data engineering benchmarks are actually due to flawed ground-truth and rigid evaluation scripts rather than model inability.
Breaks Assumption arxiv | Apr 1
Enables precise Camera-LiDAR extrinsic calibration even under massive initial misalignments that typically break automated calibration systems.
New Capability arxiv | Apr 1
Shows that VLMs can overcome deep-seated perceptual biases and optical illusions by using image manipulation tools rather than more training data.
Paradigm Shift arxiv | Apr 1
Obtains epistemic and aleatoric uncertainty from a single forward-backward pass of an unmodified pretrained LLM.
Efficiency Breakthrough arxiv | Apr 1
The first prior-fitted foundation model for survival analysis that enables zero-shot time-to-event predictions on tabular data.
New Capability arxiv | Apr 1
Mathematical proof that cosine similarity between label representations (unembeddings) in softmax classifiers is fundamentally uninformative.
Breaks Assumption arxiv | Apr 1
A vector-wise sparse attention mechanism that accelerates long-context video inference by 2.6x with zero loss in accuracy.
Efficiency Breakthrough arxiv | Apr 1
A novel neural primitive based on metriplectic dynamics that outperforms Transformers in data efficiency and generalization.
Paradigm Shift arxiv | Apr 1
A debunking of the idea that single-vector embedding failures are primarily due to low dimensionality.
Breaks Assumption arxiv | Apr 1
A unified quantization and runtime framework for deploying multiple LoRA-adapted generative models on edge devices simultaneously.
Efficiency Breakthrough arxiv | Apr 1
A diagnostic revealing that over 50% of video understanding benchmark samples can be solved without any video or temporal context.
Breaks Assumption arxiv | Apr 1
A 1D continuous image tokenizer that uses semantic masking to achieve a 64x reduction in token usage without sacrificing generation fidelity.
Efficiency Breakthrough arxiv | Apr 1
A unified agentic framework that closes the 'AI-for-AI' research loop by discovering novel architectures, data pipelines, and algorithms.
Paradigm Shift arxiv | Apr 1
Introduces the 'near-miss' metric to detect latent failures in agentic workflows where agents bypass policy checks but reach correct outcomes by chance.
Breaks Assumption arxiv | Apr 1
A compiler approach to agent logs that reduces token consumption by 50-66% while improving context learning performance.
Efficiency Breakthrough arxiv | Apr 1
Provides a closed-form safety law for Dynamic Movement Primitives, enabling provably safe robot control without real-time optimization.
New Capability arxiv | Apr 1
A training-free attack that removes diffusion-based watermarks with nearly 100% success by deflecting the generative trajectory.
Breaks Assumption arxiv | Apr 1
A stabilization mechanism for adapting LLMs to time-series tasks that reduces memory footprint by up to 1,776x.
Efficiency Breakthrough arxiv | Apr 1
Identifies a 'dual-capability bottleneck' where low-rated training data is essential for state tracking while high-rated data is needed for decision quality.
Scaling Insight arxiv | Apr 1
A novel approach to upcycle multiple dense expert models into a unified Mixture-of-Experts model without any additional training.
New Capability arxiv | Apr 1
Provides a computationally efficient 'early warning' system for emergent capabilities like grokking and induction head formation using 2-datapoint reduced density matrices.
Scaling Insight arxiv | Apr 1
Introduces a GUI-native agent system that operates complex scientific instruments through their existing visual interfaces rather than requiring proprietary APIs.
New Capability arxiv | Apr 1
Decouples high-level intent planning from low-level motor control in Vision-Language-Action (VLA) models to prevent the degradation of pre-trained VLM representations.
Paradigm Shift arxiv | Apr 1
Demonstrates that independent aggregation (Hybrid Confirmation Tree) consistently outperforms the standard 'AI-as-advisor' paradigm across diverse high-stakes domains.
Paradigm Shift arxiv | Apr 1
Applies Shapley values from cooperative game theory to solve the 'free-rider' problem in GRPO-based reinforcement learning post-training.
Efficiency Breakthrough arxiv | Apr 1
Proves that complex GraphRAG systems can be simplified into a more efficient 'UnWeaver' framework that achieves the same benefits using entity-based decomposition and standard VectorRAG.
Breaks Assumption arxiv | Apr 1
Identifies 'label leakage' from limited task diversity as the primary bottleneck for relational foundation models, rather than raw data volume.
Scaling Insight arxiv | Apr 1
Shows that deep learning models for medical imaging (MRI) can be trained using synthetic quaternion Julia fractals instead of sensitive human clinical data.
Paradigm Shift arxiv | Apr 1
Produces high-fidelity SHAP explanations for tabular data 1000x faster than traditional methods by integrating them directly into the model architecture.
Efficiency Breakthrough arxiv | Apr 1
Provides a formal framework for optimizing models whose decisions actively change the distribution of the data they encounter.
Paradigm Shift arxiv | Apr 1
Introduces a rigorous algorithm to determine if two different neural networks share the same underlying 'algorithmic interpretation' without needing to manually define the circuits.
Paradigm Shift arxiv | Apr 1
Replaces heuristic ReAct-style agent loops with a mathematical framework based on control theory to prevent LLM agents from over-deliberating or using excessive tools.
Paradigm Shift arxiv | Apr 1
Proposes a unified tensor-factorization view of attention that encompasses MHA, GQA, and MLA while reducing parameter counts by an order of magnitude.
Efficiency Breakthrough arxiv | Apr 1
Identifies the specific conditions under which Reinforcement Learning causes LLMs to 'lie' or hide reasoning in their Chain-of-Thought (CoT).
Breaks Assumption arxiv | Apr 1
Discovers that video diffusion models commit to high-level plans in the first few denoising steps, enabling a new inference-time scaling technique called ChEaP.
Scaling Insight arxiv | Apr 1
Researchers have built computer chips that can run 'backward' to solve math problems that are normally impossible for modern hardware.
Practical Magic arxiv | Mar 31
A famous musical masterpiece was found to be so mathematically perfect that an algorithm can reconstruct 93% of the score from scratch.
Nature Is Weird arxiv | Mar 31
Scientists are using the high-level math of 'cohomology' (usually used to describe the shape of the universe) to find bugs in computer code.
Paradigm Challenge arxiv | Mar 31
Researchers can now predict the exact moment an AI agent will go 'rogue' and leak data before it actually happens.
Practical Magic arxiv | Mar 31
AI has learned to objectively measure the 'groove' and funkiness of music, outperforming traditional human-designed formulas.
Nature Is Weird arxiv | Mar 31
A new cyberattack can physically destroy an AI chip simply by changing the order in which it adds numbers.
Practical Magic arxiv | Mar 31
AI coding agents are actually safer than human programmers at building new software, but twice as dangerous when it comes to maintaining it.
Paradigm Challenge arxiv | Mar 31
Researchers believe they have discovered a new transcendental number as fundamental as Pi or e.
Nature Is Weird arxiv | Mar 31
Sanctioned crypto users are evading asset freezes by 'bribing' the computers that process blockchain transactions.
Practical Magic arxiv | Mar 31
A standard graphics card can now track the entire Starlink satellite network in less than four milliseconds.
Cosmic Scale arxiv | Mar 31
A new computer chip uses the quantum flipping of individual electron spins to solve physics problems that are too 'random' for standard processors.
Practical Magic arxiv | Mar 31
Researchers have mapped out the physical limits of quantum teleportation, revealing how many particles can be 'beamed' before the signal collapses.
Cosmic Scale arxiv | Mar 31
Scientists have developed a way to control swarms of microscopic nanorobots inside the body without needing to talk to them individually.
Practical Magic arxiv | Mar 31
A new AI can 'discover' the fundamental laws of thermodynamics just by watching how materials move and change temperature.
Nature Is Weird arxiv | Mar 31
A massive study of AI-generated code reveals that 15% of all AI suggestions contain bugs or security flaws that developers simply leave in the software.
Practical Magic arxiv | Mar 31
Scientists used industrial 'machine failure' math to prove that Cristiano Ronaldo and Lionel Messi have maintained identical goal-scoring consistency for 17 years.
Nature Is Weird arxiv | Mar 31
Achieves competitive continual learning accuracy with a 90% reduction in memory cost.
Efficiency Breakthrough arxiv | Mar 31
Introduces geometry-aware parallel refinement for diffusion language models, bypassing fixed-block decoding limitations.
Paradigm Shift arxiv | Mar 31
Scales multi-agent path finding to 1000 agents with near-linear runtime by decoupling geometric planning from execution-time conflict resolution.
Scaling Insight arxiv | Mar 31
Demonstrates that frontier LLMs fail at diagnostic reasoning in safety-critical robotics even when provided with perfect procedural knowledge.
Breaks Assumption arxiv | Mar 31
Shifts multimodal LLMs from static image prefixes to an active, sequential 'Visual Chain-of-Thought' that explores images based on saliency.
New Capability arxiv | Mar 31
Releases a massive 117K-instruction dataset and a language-conditioned world model framework for visual navigation.
Open Release arxiv | Mar 31
Reveals a massive 'reasoning gap' in multilingual VLMs, where accuracy drops up to 25% when switching from English to Indian languages.
Breaks Assumption arxiv | Mar 31
Masked Diffusion Language Models (MDLMs) fail at reasoning because they unmask tokens in the wrong order, not because they lack internal logic.
Breaks Assumption arxiv | Mar 31
The first training-free framework for high-fidelity appearance transfer specifically designed for Diffusion Transformers (DiTs).
New Capability arxiv | Mar 31
Detects and filters out memorization bias in LLMs used for financial forecasting, which often 'cheat' by recalling training data, improving Sharpe ratios by 49%.
New Capability arxiv | Mar 31
Synthetic multi-view generation breaks the performance ceiling of single-view robotic datasets.
Scaling Insight arxiv | Mar 31
Knowledge distillation can be performed by injecting 'experience' into prompts rather than updating model weights.
Paradigm Shift arxiv | Mar 31
Gaussian Joint Embeddings provide a probabilistic alternative to deterministic SSL, eliminating the need for architectural asymmetries to prevent collapse.
Paradigm Shift arxiv | Mar 31
A unified L0-gating mechanism that enables comparable sparsification and pruning across graphs, text, and tabular data.
New Capability arxiv | Mar 31
Batch-level query routing for LLMs allows for strict cost and capacity control that per-query methods cannot achieve.
Efficiency Breakthrough arxiv | Mar 31
Achieves high-fidelity LiDAR densification in just 156ms while strictly enforcing sensor physics to prevent 'ghost points'.
Efficiency Breakthrough arxiv | Mar 31
Exposes 'order-gap hallucinations', where models prioritize conversational compliance over known facts, by pinpointing and flipping internal safety circuits.
Breaks Assumption arxiv | Mar 31
Proves that high scores on visual spatial benchmarks are achieved through token-level search (BFS in prose) rather than genuine visual planning.
Breaks Assumption arxiv | Mar 31
Identifies a 'stability asymmetry' signature where deceptive models maintain stable internal beliefs while producing fragile, unstable external responses under perturbation.
Paradigm Shift arxiv | Mar 31
Challenges the 'filter-first' data paradigm by showing that training on uncurated data with quality-score labels outperforms training on high-quality filtered subsets.
Paradigm Shift arxiv | Mar 31
Introduces a 'clone-robust' mechanism (YRWR) to prevent AI model producers from strategically gaming the rankings in crowd-sourced arenas like Chatbot Arena.
Paradigm Shift arxiv | Mar 31
Enables vision models to learn online from human corrections at inference time, reducing redundant manual effort in video segmentation by up to 34%.
New Capability arxiv | Mar 31
Formalizes the 'Observability Gap' to explain why coding agents plateau: humans can only provide feedback on visible outputs, while bugs reside in invisible execution states.
Scaling Insight arxiv | Mar 31
Provides a high-dimensional theoretical foundation for why two-phase optimizers like DiLoCo are mathematically superior to standard SGD in specific noise regimes.
Scaling Insight arxiv | Mar 31
Mathematically proves that multi-agent planning workflows are decision-theoretically dominated by a centralized Bayes decision maker, setting fundamental limits on agentic emergent behavior.
Breaks Assumption arxiv | Mar 31
Provides a formal proof that any semantic memory system (including RAG and vector retrieval) is mathematically guaranteed to suffer from interference and forgetting.
Breaks Assumption arxiv | Mar 31
Demonstrates that Liquid Neural Networks can outperform Diffusion Policies in imitation learning with half the parameters and nearly 2x faster inference.
Efficiency Breakthrough arxiv | Mar 31
Achieves a 45x reduction in video generation inference latency and 2.5x higher training throughput using an efficient solution-flow framework.
Efficiency Breakthrough arxiv | Mar 31
Introduces neural topology probing to identify causally influential 'hub neurons' in Vision-Language Models that govern cross-modal behavior.
Paradigm Shift arxiv | Mar 31
Finds that the distinctive 'AI prose style' (specifically em dash overuse) is a surviving artifact of markdown-saturated training data leaking into unstructured output.
Breaks Assumption arxiv | Mar 31
Releases ROSClaw, a model-agnostic executive layer that allows any foundation model to control any ROS 2 robot through standardized capability discovery and safety envelopes.
Open Release arxiv | Mar 31
Releases ChartNet, a high-quality multimodal dataset for chart understanding with 1.5 million samples spanning 24 chart types.
Open Release arxiv | Mar 31
Enables zero-shot monocular metric depth estimation across any camera type (fisheye, 360, ERP) using a single unified model.
New Capability arxiv | Mar 31
Proposes a new reinforcement learning policy compression method based on long-horizon state-space coverage instead of immediate action-matching.
Paradigm Shift arxiv | Mar 31
Introduces MeteoCap-3B, a billion-scale meteorological dataset with expert captions, along with a spectral-aware diffusion model for weather time-series generation.
Open Release arxiv | Mar 31
Reframes LLM-assisted research as a scientific forecasting problem, training models to generate proposals that align with future (held-out) research directions.
New Capability arxiv | Mar 31
Shows that standard Transformer attention matrices are fundamentally ill-conditioned and proposes a drop-in 'preconditioned' replacement.
Paradigm Shift arxiv | Mar 31
GSR-GNN achieves 30x training speedups and 87% memory reduction for deep Graph Neural Networks on circuit graphs.
Efficiency Breakthrough arxiv | Mar 31
A fully open industrial-scale pretraining project releasing 8T tokens of processed data, a 3B model, and 200+ controlled pretraining ablations.
Open Release arxiv | Mar 31
Enables precise, physically plausible control over light position, color, and intensity in single images without a 3D model.
New Capability arxiv | Mar 31
Systematically demonstrates that 'easy-to-hard' curriculum learning provides no benefit for LLM deductive reasoning tasks.
Breaks Assumption arxiv | Mar 31
IP-SAM allows the Segment Anything Model (SAM) to perform automatic, prompt-free segmentation by generating its own 'intrinsic prompts'.
New Capability arxiv | Mar 31