Traditional Spiking Neural Network (SNN) sparsity is a performance 'illusion' on GPUs; temporal aggregation is required for actual 13x speedups.
March 17, 2026
Original Paper
Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration
arXiv · 2603.13810
The Takeaway
Demonstrates that SIMD architectures cannot exploit the fine-grained, unstructured sparsity of binary spikes for speed. Introduces Temporal Aggregated Convolution (TAC), which achieves significant hardware-agnostic speedups by collapsing or preserving the temporal dimension depending on the data.
From the abstract
Spike sparsity is widely believed to enable efficient spiking neural network (SNN) inference on GPU hardware. We demonstrate this is an illusion: five distinct sparse computation strategies on Apple M3 Max all fail to outperform dense convolution, because SIMD architectures cannot exploit the fine-grained, unstructured sparsity of i.i.d. binary spikes. Instead, we propose Temporal Aggregated Convolution (TAC), which exploits convolution linearity to pre-aggregate $K$ spike frames before a single […]
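The identity TAC relies on is plain convolution linearity: convolving the sum of $K$ spike frames with a shared kernel gives the same result as summing $K$ per-frame convolutions, so the collapsed path needs only one dense pass. A minimal 1D numpy sketch of that identity (the frame count `K`, lengths, and random data here are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4                      # number of spike frames to aggregate (illustrative)
T_len, k_len = 32, 5       # signal and kernel lengths (illustrative)
kernel = rng.normal(size=k_len)                              # shared conv weights
spikes = rng.integers(0, 2, size=(K, T_len)).astype(float)   # K binary spike frames

# Naive path: one dense convolution per spike frame, K passes total.
per_frame = np.stack([np.convolve(s, kernel, mode="valid") for s in spikes])
naive_sum = per_frame.sum(axis=0)

# TAC-style collapse: pre-aggregate the K frames (integer spike counts,
# no longer binary), then run a single dense convolution.
aggregated = spikes.sum(axis=0)
tac = np.convolve(aggregated, kernel, mode="valid")

# Linearity guarantees the two paths agree (up to float tolerance).
assert np.allclose(naive_sum, tac)
```

The collapse trades $K$ convolutions for one plus a cheap elementwise sum; the "preserve" branch the title refers to would skip the aggregation when per-timestep outputs are needed.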