We swapped out a piece of an AI’s digital brain for actual light, which lets it think at the literal speed of light.
April 13, 2026
Original Paper
Integrated electro-optic attention nonlinearities for transformers
arXiv · 2604.09512
The Takeaway
By moving the Softmax function from digital chips to analog lithium niobate modulators, this research could slash the inference latency of AI models. It’s a major step toward hardware that computes as fast as light can travel through it.
From the abstract
Transformers have emerged as the dominant neural-network architecture, achieving state-of-the-art performance in language processing and computer vision. At the core of these models lies the attention mechanism, which requires a nonlinear, non-negative mapping using the Softmax function. However, although Softmax operations account for less than 1% of the total operation count, they can disproportionately bottleneck overall inference latency. Here, we use thin-film lithium niobate (TFLN) Mach-Zehnder modulators […]
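
To make the bottleneck concrete, here is a minimal NumPy sketch of scaled dot-product attention. This is not the paper's implementation; the function names and toy shapes are illustrative. The comments mark the one nonlinear stage, the Softmax, that the paper proposes to offload to analog electro-optic hardware.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: the nonlinear, non-negative mapping
    # that the paper implements with TFLN Mach-Zehnder modulators.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention. The matrix products are highly
    # parallel linear algebra; the softmax in the middle is the
    # nonlinear step that can dominate inference latency despite being
    # under 1% of the operation count.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # linear
    weights = softmax(scores)                # nonlinear: the optical target
    return weights @ V                       # linear

# Toy example: 4 tokens, 8-dimensional heads.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The point of the sketch: everything except `softmax` maps naturally onto fast, parallel hardware, so replacing that one digital step with an analog optical nonlinearity removes the serial choke point rather than speeding up the bulk of the math.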