Papers in machine learning, AI systems, alignment, interpretability, agents, foundation models, and applied AI whose core contribution is computational intelligence.
Breaks Assumption
Frontier models like GPT-5.2 and Claude 4.5 suffer from 'Internal Safety Collapse' where safety alignment fails completely if a task's success necessitates harmful output.
Open Release
Berta is an open-source, production-proven AI clinical scribe that reduces operating costs by up to 95% compared to commercial alternatives.
Efficiency Breakthrough
Memory Sparse Attention (MSA) enables LLMs to scale to 100 million tokens with linear complexity and less than 9% precision degradation.
Breaks Assumption
Prompt compression can paradoxically increase total energy consumption and cost by over 2000%, because compressed prompts trigger aggressive 'output expansion' by the model.
Scaling Insight
Synthetic Mixed Training allows an 8B model to finally outperform RAG on long-document comprehension by combining synthetic QAs with rewritten documents.
Paradigm Shift
Logical reasoning in LLMs is causally linked to 'algebraic divergence' in the residual stream, and failure to achieve this geometry explains sycophancy.
Paradigm Shift
Environment Maps nearly double the success rate of long-horizon agents by replacing session-bound context with a persistent, structured graph representation.
Paradigm Shift
A statistical physics framework that predicts the fundamental limits of agentic self-improvement and nested LLM architectures.
New Capability
Inference-time 'steering' of Code LLMs allows for precise control over programming languages and libraries without prompting or fine-tuning.
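The headline's "steering" idea in miniature: add a fixed direction to a hidden state at inference time to bias generation (e.g. toward one programming language), with no prompting or weight updates. This toy forward pass, the steering direction, and the scaling factor are illustrative assumptions, not the paper's actual intervention.

```python
# Minimal activation-steering sketch (hypothetical; not the paper's method).
import numpy as np

def steered_hidden(h, steer_vec, alpha=2.0):
    """Shift a hidden state along a precomputed steering direction."""
    return h + alpha * steer_vec

rng = np.random.default_rng(1)
hidden = rng.normal(size=8)  # hidden state at some intermediate layer
# A steering direction might be computed as, e.g.,
# mean(activations on Python code) - mean(activations on Java code).
direction = np.array([1, 0, 0, 0, 0, 0, 0, 0], dtype=float)

out = steered_hidden(hidden, direction)
print(round(float(out[0] - hidden[0]), 1))  # → 2.0
```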
Efficiency Breakthrough
The first sorting-free stochastic formulation for 3D Gaussian Splatting that matches rasterization speed while enabling full ray-traced effects.
Paradigm Shift
Bio-inspired visual servoing that achieves low-latency robotic control by processing event-stream flux directly, bypassing traditional state estimation.
Breaks Assumption
Training-free Out-of-Distribution (OOD) detection that beats state-of-the-art by aggregating features across intermediate network layers.
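A minimal sketch of the idea of aggregating OOD evidence across intermediate layers rather than relying on the final layer alone. The k-NN distance score, the simple averaging rule, and the synthetic features are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical layer-aggregated OOD scoring sketch.
import numpy as np

def layer_knn_score(feat, bank, k=5):
    """Mean distance from one feature vector to its k nearest in-distribution neighbors."""
    d = np.linalg.norm(bank - feat, axis=1)
    return np.sort(d)[:k].mean()

def ood_score(features_per_layer, banks_per_layer):
    """Average per-layer novelty scores; higher = more out-of-distribution."""
    return float(np.mean([layer_knn_score(f, b)
                          for f, b in zip(features_per_layer, banks_per_layer)]))

rng = np.random.default_rng(0)
banks = [rng.normal(0, 1, (200, 16)) for _ in range(3)]  # ID feature banks, 3 layers
in_dist = [rng.normal(0, 1, 16) for _ in range(3)]       # sample from the same distribution
far_ood = [rng.normal(6, 1, 16) for _ in range(3)]       # sample from a shifted distribution

print(ood_score(in_dist, banks) < ood_score(far_ood, banks))  # → True
```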
Scaling Insight
Newer LLM architectures like MoE and SSMs are making 'early-exit' decoding significantly less effective than in previous generations.
Efficiency Breakthrough
AI agent benchmark costs can be cut by ~50% by evaluating only on tasks with intermediate historical pass rates.
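The selection rule behind this headline can be sketched in a few lines: keep only tasks whose historical pass rate is intermediate, on the assumption that near-always-solved and near-never-solved tasks carry little discriminative signal. The task names, pass rates, and the 0.2/0.8 thresholds are illustrative, not from the paper.

```python
# Hypothetical benchmark-pruning sketch: keep only "intermediate difficulty" tasks.
def select_informative_tasks(pass_rates, low=0.2, high=0.8):
    """Keep tasks whose historical pass rate falls strictly inside (low, high)."""
    return [task for task, rate in pass_rates.items() if low < rate < high]

historical = {
    "web_nav_easy": 0.97,      # almost every agent passes -> little signal
    "code_repair": 0.55,       # intermediate -> discriminative
    "multi_hop_qa": 0.41,      # intermediate -> discriminative
    "adversarial_math": 0.03,  # almost no agent passes -> little signal
}

subset = select_informative_tasks(historical)
print(subset)  # → ['code_repair', 'multi_hop_qa']
```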
New Capability
A universal 'one-shot' medical anomaly detector that outperforms specialized models across nine different datasets.
Breaks Assumption
Grokking is not the discovery of a new algorithm, but the sharpening of one already latent in the model during the memorization phase.
Scaling Insight
Diffusion models can be proven to generalize by capturing manifold geometry long before they achieve density estimation or memorization.
New Capability
Sparse Autoencoders (SAEs) can successfully decompose opaque medical vision foundation model embeddings into human-interpretable clinical concepts.
Paradigm Shift
A massive empirical study of 177,000 tools reveals a rapid shift in the AI agent ecosystem from 'perception' to 'action' (27% to 65% usage).
Paradigm Shift
A simple perturbation method reveals that representations are not just activation patterns, but conduits that determine how learning 'infects' similar examples.
Paradigm Shift
LLMs can solve planning problems with state spaces as large as 10^165 by acting as program generators rather than direct planners.
New Capability
Symbolic-KANs bridge the gap between scalable deep learning and interpretable symbolic regression by embedding discrete library primitives directly into the network.
Breaks Assumption
Transformer hallucinations in high-stakes legal tasks are deterministic failures driven by calculable internal state thresholds rather than random 'glitches'.
New Capability
An 'invariant compiler' uses LLMs to translate physics requirements into Neural ODE architectures that satisfy conservation laws by construction.
Efficiency Breakthrough
Hybrid Distillation Policy Optimization (HDPO) overcomes the 'vanishing gradient' problem for hard mathematical prompts that RL agents cannot solve.
Open Release
BioVITA releases a massive multimodal biological dataset of 3.6M image-audio-text samples covering 14,000 species.
Efficiency Breakthrough
A self-distillation method for Multi-Token Prediction (MTP) that yields a 220% inference speedup with minimal training cost.
Efficiency Breakthrough
AttentionPack achieves up to 8x memory efficiency during decoding for large vision-language models (VLMs).
New Capability
POISE demonstrates the first autonomous, evidence-driven discovery of improved policy optimization algorithms for LLMs.
Breaks Assumption
Listed API prices for reasoning models (RLMs) are shown to be highly misleading, with cheaper models often costing 28x more in practice.
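The arithmetic behind why listed prices mislead: a model that is cheap per token but emits many more reasoning tokens can cost more per solved task than an expensive, terse one. The prices and token counts below are made up for illustration and are not the paper's measurements.

```python
# Illustrative effective-cost calculation (hypothetical numbers).
def cost_per_task(price_per_mtok, tokens_per_task):
    """Effective dollar cost of one task given a per-million-token price."""
    return price_per_mtok * tokens_per_task / 1_000_000

cheap  = cost_per_task(price_per_mtok=0.5, tokens_per_task=400_000)  # verbose reasoner
pricey = cost_per_task(price_per_mtok=10.0, tokens_per_task=5_000)   # terse reasoner

print(cheap > pricey, round(cheap / pricey, 1))  # → True 4.0
```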
Efficiency Breakthrough
SLAT-Phys predicts spatially varying material property fields directly from single RGB images with a 120x speedup.
Paradigm Shift
LLM-generated summaries can produce patient embeddings that are more 'portable' and robust to hospital distribution shifts than specialized clinical models.
Breaks Assumption
A systematic critique explaining why 'self-improving' generative optimization loops fail in production and how to fix them.
New Capability
SDZE enables the training of 10-million-dimensional Physics-Informed Neural Networks (PINNs) on a single GPU.
Efficiency Breakthrough
Reduces Text-to-SQL input tokens by 99% by internalizing the database schema into the model weights through a two-phase fine-tuning approach.
New Capability
Solves the 'vanishing gradient' problem in 3D Gaussian Splatting (3DGS) tracking by optimizing in the frequency domain using spectral moments.
New Capability
Restores editable, semantically layered structures from flattened vector graphics (SVGs/icons) by using generative completion to recover occluded geometries.
Efficiency Breakthrough
MoE-Sieve reduces Mixture-of-Experts LoRA fine-tuning parameters and training time by ~70% by only adapting the most-frequently activated 'hot' experts.
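The "hot expert" idea sketched: profile how often each MoE expert is routed to on a calibration set, then attach LoRA adapters only to the most frequently activated experts and leave the rest frozen. The routing trace, the keep fraction, and the top-k selection rule are illustrative assumptions; MoE-Sieve's actual criterion may differ.

```python
# Hypothetical hot-expert selection for selective MoE LoRA fine-tuning.
from collections import Counter

def select_hot_experts(routing_trace, keep_fraction=0.3):
    """Return the expert ids covering the top `keep_fraction` of experts by usage."""
    counts = Counter(routing_trace)
    n_keep = max(1, round(len(counts) * keep_fraction))
    return [expert for expert, _ in counts.most_common(n_keep)]

# Simulated router decisions over a calibration batch (one expert id per token).
trace = [0] * 50 + [3] * 30 + [1] * 12 + [2] * 5 + [4] * 3
hot = select_hot_experts(trace, keep_fraction=0.4)  # keep 2 of the 5 experts
print(hot)  # → [0, 3]
```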
New Capability
Identifies that 'attention imbalance' across modalities and tokens drives object hallucinations and proposes a decoding-time rectification (AIR) to fix it.
New Capability
SOMA provides a plug-and-play memory and orchestration system that increases Vision-Language-Action (VLA) robot success rates by over 50% without fine-tuning.
Breaks Assumption
LLMpedia exposes a massive gap in LLM factuality by generating 1M articles from parametric memory, revealing that actual knowledge retrieval is 15%+ lower than multiple-choice benchmarks suggest.
Breaks Assumption
Proves that RLHF and DPO alignment cause 'response homogenization,' which effectively breaks standard sampling-based uncertainty estimation methods.
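Why homogenization breaks sampling-based uncertainty, in miniature: these methods read the spread of repeated samples as a confidence signal, so if alignment collapses the answer distribution onto one phrasing, the empirical entropy goes to zero regardless of whether the model is right. The toy samples below are illustrative, not from the paper.

```python
# Illustrative sketch: empirical entropy over sampled answers as an uncertainty proxy.
from collections import Counter
import math

def sample_entropy(answers):
    """Shannon entropy (bits) of the empirical distribution over sampled answers."""
    counts = Counter(answers)
    n = len(answers)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

base_model = ["Paris", "Lyon", "Paris", "Marseille"]   # diverse samples -> usable signal
aligned    = ["Paris.", "Paris.", "Paris.", "Paris."]  # homogenized -> signal destroyed

print(sample_entropy(base_model), sample_entropy(aligned))  # → 1.5 0.0
```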
Paradigm Shift
Formalizes 'likelihood hacking,' a failure mode where RL-trained models learn to generate unnormalized probabilistic programs to artificially inflate rewards.
Efficiency Breakthrough
Achieves up to 400x speedup and 64x memory reduction for open-vocabulary 3D scene understanding compared to current Gaussian Splatting methods.
Efficiency Breakthrough
Enables 1000x faster on-chip training for Weightless Neural Networks (WNNs) on FPGAs with drastically lower power consumption.
Scaling Insight
Provides a systematic blueprint for scaling Reinforcement Learning (RL) in LLMs using multi-turn synthetic data generation and difficulty-based curricula.
Paradigm Shift
A model-agnostic framework to boost time-series forecasting by aligning internal representations with those of pretrained foundation models.
New Capability
Breaks the resolution and aspect ratio barriers of image diffusion models, enabling the generation of consistent 32K resolution images.
Paradigm Shift
Unifies input and predicted meshes under a shared topological framework to enable high-fidelity 3D reconstruction with sharp features.
Open Release
Releases a high-quality, 92K-sentence parallel dataset for Hindi-Sanskrit translation focusing on contemporary and spoken language.