Race on to decipher oracle bone characters with AI
Competition to decode ancient scripts expected to advance research, stimulate public interest, spur technological development, Fang Aiqing reports.


Liu's laboratory has worked with Tencent and several academic institutions to develop an AI-assisted collaborative research platform equipped with a set of intelligent tools, with which some 500 duplicate rubbings have been identified.
The set of tools can also detect the presence of a specific glyph across different rubbings or facsimiles, and identify similar pictograms, sorting them by similarity.
Last year, Liu and his colleagues started using large language models to decode oracle bone characters. They aim to build a specialized large language model, trained on vast amounts of academic literature and vocabulary, that can propose well-grounded hypotheses and eliminate improper assumptions, thereby accelerating the process of deciphering.
He says that developing this specialized large language model requires the participation of more industry and research organizations. By holding the competition, they hope to involve more young people in the sector and stimulate interdisciplinary collaboration. They also hope that the oracle bone script will, in turn, contribute to the development of large language models.
According to its official website, the competition consists of three areas. In the invitational research area, paleographers and computer specialists will work together on studies to decode oracle bone characters and inscriptions.
They will analyze the evolution of glyphs, their calligraphic structures and grammar, before submitting a research report that includes AI algorithm models, the rationale behind these models and an interpretation of the academic value. Participating teams for this area must undergo a preliminary qualification review.
An algorithm challenge constitutes the second area, focusing on using AI to facilitate the collation of oracle bone script data within a given dataset. Participants will be required to develop code for one of two tasks. The first is the automatic rejoining of oracle bone fragments, which will be evaluated on accuracy rate and the proportion of new achievements. The second is the automatic identification of duplicate rubbings of the same oracle bones, which will be assessed on accuracy and the number of unpublished duplicate images found.

The last area invites members of the public to use AI tools to create images, animations or short videos that incorporate the pictograms or historical connotations of the oracle bone script.
Shu Zhan, head of Tencent's digital cultural lab, explains that, through AI technology, they are seeking to enhance research efficiency and raise public awareness of the oracle bone script.
He expects the competition to accelerate research progress on the script, inspire intelligent restoration and virtual exhibitions of oracle bone inscriptions and other categories of cultural heritage, and lower the threshold for public participation through AI-driven creation, thereby integrating the script into daily life.
Shu adds that in the future, the company will continue to enhance the text corpus for the specialized oracle bone script large language model and advance the virtual aggregation and exhibition of overseas collections of oracle bones.
