There are about 6,500–7,000 languages currently spoken in the world. But that’s less than a quarter of all the languages people have spoken over the course of human history, a total of around 31,000 according to some linguistic estimates. Every time a language is lost, a way of thinking and of relating to the world goes with it. The relationships and the poetry of life uniquely described through that language are lost too. But what if you could figure out how to read dead languages? Researchers from MIT and Google Brain created an AI-based system that can do just that.
While languages change, many of their symbols, and the way their words and characters are distributed, stay relatively constant over time. Because of that, you can attempt to decode a long-lost language if you understand its relationship to a known, related language. This insight allowed the team, which included Jiaming Luo and Regina Barzilay from MIT and Yuan Cao from Google’s AI lab, to use machine learning to decipher Linear B, a script from around 1400 BC that recorded an early form of Greek, and Ugaritic, a language related to early Hebrew that was written in a cuneiform script and is also over 3,000 years old.
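To make the idea of stable character distributions concrete, here is a toy sketch in Python. It is not the researchers' system, which relies on machine learning; it only pairs the symbols of an unknown text with the letters of a known text by frequency rank. The function names (`rank_symbols`, `guess_mapping`, `decode`) and the sample strings are invented for illustration.

```python
from collections import Counter


def rank_symbols(text):
    """Return the non-space symbols of `text`, most frequent first."""
    counts = Counter(ch for ch in text if not ch.isspace())
    # Sort by descending count; break ties by symbol so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symbol for symbol, _ in ranked]


def guess_mapping(lost_text, known_text):
    """Pair the n-th most frequent lost symbol with the n-th most frequent known letter.

    Assumption (toy setting): both texts share the same underlying
    character distribution, which real languages only approximate.
    """
    return dict(zip(rank_symbols(lost_text), rank_symbols(known_text)))


def decode(lost_text, mapping):
    """Rewrite `lost_text` with the guessed mapping; unknown symbols pass through."""
    return "".join(mapping.get(ch, ch) for ch in lost_text)


# Hypothetical example: the "lost" text is a symbol-for-letter rewrite of the known one.
known = "banana band"
lost = "%#@#@# %#@&"
mapping = guess_mapping(lost, known)
print(decode(lost, mapping))  # prints "banana band"
```

Frequency matching this simple only works when the two texts line up almost perfectly. Real decipherment has to cope with sound changes, missing word boundaries, and symbols that have no one-to-one counterpart, which is why a learned model is needed.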