An overview of our framework, which generates the lost texts from smaller units—from characters to tokens and from tokens to inscriptions. Character mappings are first performed on the phonetic alphabet of the known language. Based on these mappings, a token y in the known vocabulary is converted into a token x in the lost language according to the latent alignment variable a. Lastly, all generated tokens, together with characters in unmatched spans, are concatenated to form a lost inscription. Blue boxes display the corresponding linguistic properties associated with each level of modeling.
This site uses cookies. By continuing to use our website, you are agreeing to our privacy policy. No content on this site may be used to train artificial intelligence systems without permission in writing from the MIT Press.