We can extract near-flawless content from challenging sources spanning millennia, in any language and any typeface.
These handwritten Vatican reports, dating from the fifteenth century, contain discussions of important theological issues. Working with a private client researching the Vatican Archives, we developed custom handwritten text recognition (HTR) models to recognise idiosyncratic Latin and convert it into digestible, fluid prose, as in the model output shown in the image below.
One challenge during this project was eliminating background watermarks to improve machine-reading accuracy. Another was identifying archaic abbreviations in the text.
Osiris employs specialist palaeographers who work with our clients to transcribe documents more accurately and to create high-fidelity training data for handwritten text recognition models.