New Capability

333 papers · Page 4 of 7

Papers where something becomes possible that previously was not. New techniques, new instruments, new model behaviors, new measurements at a frontier.

Develops a differentially private RLHF pipeline that decouples private reward learning from policy optimization, achieving strong alignment on Gemma-2B-IT with privacy guarantees.

Composes pre-trained unimanual robotic policies into complex bimanual tasks without requiring bimanual demonstration data.

Sets a new state-of-the-art for intracortical speech decoding with 14.3% phoneme error rate using a multitask Transformer.

InjectFlow is a training-free method that fixes semantic degradation and bias in Flow Matching models by injecting orthogonal semantics into the velocity field.

BubbleRAG enables high-precision retrieval-augmented generation over black-box Knowledge Graphs where the schema and structure are unknown.

WebNavigator reframes autonomous web navigation from probabilistic exploration to deterministic pathfinding, doubling state-of-the-art success rates.

ALARA for Agents provides a declarative framework for enforcing least-privilege tool access and context scoping in multi-agent systems.

Claude Opus 4.6 combined with a formal proof assistant autonomously solved 10 of 12 Putnam 2025 math problems.

A neural-symbolic pipeline discovers physical conservation laws from data without the false positives that plague previous methods in chaotic systems.

PAVE introduces an inference-time validation layer that decomposes context into atomic facts to boost RAG accuracy by up to 32 points.

Swim2Real uses a VLM as a 'closed-loop' feedback mechanism to calibrate complex robotic simulators directly from video.

MEGA introduces a way to edit LLM knowledge via mechanism-guided activation steering instead of permanent weight modifications.

BenchBench shifts the focus from model performance to model 'designer' capability by benchmarking automated benchmark generation.

Contrastive Association Learning (CAL) successfully recovers functional gene associations from expression data where standard similarity metrics fail.

Dream Diffusion Policy enables robots to survive severe out-of-distribution (OOD) disturbances by detecting reality-imagination discrepancies and switching to an internal world model.

Cortical Policy introduces a dual-stream view transformer inspired by the human brain's dorsal and ventral pathways to solve complex robotic manipulation.

LiFR-Seg achieves high-frame-rate semantic segmentation using low-frame-rate cameras by propagating features through asynchronous event streams.

ORACLE uses symbolic reasoning engines to verify intermediate reasoning steps in synthetic data generation, moving beyond simple answer-correctness filtering.

AlphaAdj uses a VLM to dynamically adjust Control Barrier Function parameters in real time for safe and efficient robotic navigation.

SPECTRE-G2 is a unified anomaly detector that uses eight complementary signals to detect 'unknown unknown' structural anomalies.

A training-free system for 3D scene reconstruction and editing from sparse RGB images that uses 3D-aware diffusion models to fill geometric gaps.

Introduces Reward Sharpness-Aware Fine-Tuning (RSA-FT) to mitigate reward hacking in diffusion models without retraining reward models.

GIDE enables precise, training-free image editing for discrete Diffusion LLMs by introducing a novel Discrete Noise Inversion mechanism.

Enables multimodal models to self-evolve their reasoning without human labels or external reward models.

DRTriton uses large-scale synthetic data and curriculum RL to automatically generate highly optimized Triton kernels, significantly outperforming top-tier LLMs.

Introduces git-inspired primitives to enable truly asynchronous and non-interfering multi-agent software engineering collaboration.

Solves the 'recursive drift' problem in self-improving LLMs by using symbolic verification to gate training data quality.

Transitions MLLMs from reactive planning to 'mental navigation' by forcing the construction of hierarchical cognitive maps from egocentric video.

HumanOmni-Speaker achieves end-to-end speaker diarization and lip-reading by compressing high-frequency motion residuals into just 6 tokens per frame.

Achieves zero-shot, zero-training collaborative navigation between humanoid and quadruped robots.

Introduces a training-free method to visualize and validate the invariances of any feature extractor using diffusion priors.

Reveals that frozen LLMs contain person-specific 'neural signatures' that can predict individual brain activity.

Uses the chronological visitation order of medical scans as a self-supervised signal for disease progression modeling.

Ensures safe Vision-Language Model generation without over-refusal by steering activations within the null-space of benign inputs.

Integrates LLMs as closed-loop tuning experts for manufacturing robots to achieve a 0% failure rate in complex 3D printing tasks.

Integrates auction bids and monetization logic directly into generative recommender systems (such as TIGER) via bid-aware decoding.

MemDLM embeds a simulated denoising process into training to create 'Parametric Memory,' narrowing the train-inference gap for Diffusion Language Models.

A transformer-based meta-amortized framework that allows simulation-based inference to remain valid across different model structures without retraining.

A grid-free probabilistic framework for nonrigid registration of high-dimensional vector-valued functions on irregular manifolds.

A self-improvement framework (MIPO) that strengthens LLM personalization and reasoning with zero additional data or human labels.

VAMPO optimizes visual dynamics in video models using policy gradients to fix precision-critical errors in robotic manipulation.

Introduces Any-Subgroup Equivariant Networks (ASEN), a single model that can adapt to multiple different symmetry groups via input modulation.

ICLAD enables unified, in-context anomaly detection for tabular data across unsupervised, semi-supervised, and one-class regimes without weight updates.

Expands formal reasoning beyond proof construction to the generation and formal verification of counterexamples in Lean 4.

CurveStream implements a curvature-aware hierarchical memory to handle streaming video in MLLMs without out-of-memory (OOM) errors.

Boosts open-model agent performance on web navigation tasks from 6.4% to 43%, surpassing proprietary models like GPT-4o.

First unified pipeline to reconstruct complete geometry, materials, and lighting from sparse views in under one second.

Introduces the first inherently scalable primitive for radiance fields, allowing real-time level-of-detail (LOD) rendering by truncating Fourier coefficients.

SCRL introduces the first negative supervision mechanism for Test-Time Reinforcement Learning, preventing LLMs from reinforcing 'consensus lies'.

X-World is a controllable, action-conditioned multi-camera world model that simulates realistic future video observations for end-to-end driving.