AI has jumped the success rate of deciphering 3,000-year-old ancient script from under 3% to over 54%.
April 14, 2026
Original Paper
Decoding Ancient Oracle Bone Script via Generative Dictionary Retrieval
arXiv · 2604.09668
The Takeaway
This generative dictionary retrieval system turns a 'black box' classification problem into a transparent archaeological tool. It allows researchers to unlock massive volumes of previously undecipherable Oracle Bone Script using modern VLM retrieval techniques.
From the abstract
Understanding humanity's earliest writing systems is crucial for reconstructing civilization's origins, yet many ancient scripts remain undeciphered. Oracle Bone Script (OBS) from China's Shang dynasty exemplifies this challenge: only approximately 1,500 of roughly 4,600 characters have been decoded, and a substantial portion of these 3,000-year-old inscriptions remains only partially understood. Limited by extreme data scarcity, existing computational methods achieve under 3% accuracy on unseen