AI & ML Practical Magic

AI has jumped the success rate of deciphering 3,000-year-old ancient script from under 3% to over 54%.

April 14, 2026

Original Paper

Decoding Ancient Oracle Bone Script via Generative Dictionary Retrieval

Yin Wu, Gangjian Zhang, Jiayu Chen, Chang Xu, Yuyu Luo, Nan Tang, Hui Xiong

arXiv · 2604.09668

The Takeaway

This generative dictionary retrieval system turns a 'black box' classification problem into a transparent archaeological tool. It allows researchers to unlock massive volumes of previously undecipherable Oracle Bone Script using modern VLM retrieval techniques.

From the abstract

Understanding humanity's earliest writing systems is crucial for reconstructing civilization's origins, yet many ancient scripts remain undeciphered. Oracle Bone Script (OBS) from China's Shang dynasty exemplifies this challenge: only approximately 1,500 of roughly 4,600 characters have been decoded, and a substantial portion of these 3,000-year-old inscriptions remains only partially understood. Limited by extreme data scarcity, existing computational methods achieve under 3% accuracy on unseen