SeriesFusion
Science, curated & edited by AI

A new gradient-free optimizer can train non-differentiable neural networks to near-perfect accuracy without resorting to surrogate ("fake") gradients.

Some of the most energy-efficient AI hardware, such as spiking neural networks, is notoriously hard to train because standard calculus-based backpropagation does not apply. The PolyStep optimizer uses optimal-transport mathematics to train these otherwise untrainable networks directly, reaching accuracy levels previously thought out of reach for non-differentiable systems. That opens the door to AI running on specialized, ultra-low-power chips instead of the power-hungry GPUs we rely on today, and it is a major step toward making AI hardware as efficient as the human brain.

Original Paper

Training Non-Differentiable Networks via Optimal Transport

An T. Le

arXiv  ·  2605.01928

Neural networks increasingly embed non-differentiable components (spiking neurons, quantized layers, discrete routing, black-box simulators, etc.) where backpropagation is inapplicable and surrogate gradients introduce bias. We present PolyStep, a gradient-free optimizer that updates parameters using only forward passes. Each step evaluates the loss at structured polytope vertices in a compressed subspace, computes softmax-weighted assignments over the resulting cost matrix, and displaces particles.
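To make the abstract's update step concrete, here is a minimal sketch of a PolyStep-style iteration in numpy. This is an illustration of the general idea only, not the paper's algorithm: the function name `polystep_sketch`, the choice of random directions as the "compressed subspace," the ±radius vertices as the "structured polytope," and all hyperparameter values are assumptions made for the example.

```python
import numpy as np

def polystep_sketch(loss_fn, theta, n_dirs=8, radius=0.1, temp=0.1,
                    steps=100, seed=0):
    """Illustrative gradient-free update loosely inspired by the PolyStep
    description: evaluate the loss at structured vertices in a random
    low-dimensional subspace, softmax-weight the vertices by their cost,
    and displace the parameters toward the weighted combination.
    This is a hypothetical sketch, not the authors' implementation."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(steps):
        # Random compressed subspace: n_dirs unit directions.
        D = rng.standard_normal((n_dirs, theta.size))
        D /= np.linalg.norm(D, axis=1, keepdims=True)
        # Polytope vertices: +/- radius along each direction.
        V = np.vstack([theta + radius * D, theta - radius * D])
        # Forward passes only -- no gradients anywhere.
        costs = np.array([loss_fn(v) for v in V])
        # Softmax weights: lower cost -> larger weight.
        w = np.exp(-(costs - costs.min()) / temp)
        w /= w.sum()
        # Displace the parameter "particle" toward the weighted vertices.
        theta = w @ V
    return theta
```

On a smooth toy objective such as a quadratic bowl, this moves the parameters toward lower loss using nothing but loss evaluations, which is the property the abstract emphasizes; the paper's actual vertex structure and transport-based assignment are more sophisticated than this uniform softmax.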