Challenge
CUST-3 comprises thousands of pages of historical customs and excise records held at the UK National Archives. These handwritten tables use archaic spellings, abbreviations, and irregular formatting, which makes them a challenge for both layout analysis and handwritten text recognition (HTR).
Solution
Osiris-AI built a custom system to:
- Identify and segment manuscript tables with a bespoke AI model, achieving ~90% accuracy
- Transcribe the handwriting with a bespoke HTR model, achieving ~95% accuracy
- Extract structured data from inconsistent handwriting and table layouts
- Highlight transcription errors
- Enable rapid human validation of questionable entries with our new cloud-based tool: Horus
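As a rough illustration of the last two steps, the sketch below shows one common way to route questionable entries to a human reviewer: each transcribed cell carries a confidence score, and cells scoring below a threshold are queued for validation, least certain first. It is not a description of how Osiris-AI's system or Horus works. The `Cell` fields, the `flag_for_review` helper, the 0.80 threshold, and the sample values are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One transcribed table cell (hypothetical structure for illustration)."""
    row: int
    col: int
    text: str
    confidence: float  # assumed recogniser score in the range [0, 1]


def flag_for_review(cells, threshold=0.80):
    """Return cells scoring below the threshold, least confident first."""
    flagged = [c for c in cells if c.confidence < threshold]
    return sorted(flagged, key=lambda c: c.confidence)


# Made-up sample data, not taken from the CUST-3 records.
cells = [
    Cell(0, 0, "Tobacco", 0.97),
    Cell(0, 1, "1,204 lb", 0.62),
    Cell(1, 0, "Brandy", 0.91),
    Cell(1, 1, "38 gall.", 0.74),
]

for cell in flag_for_review(cells):
    print(f"row {cell.row}, col {cell.col}: {cell.text!r} ({cell.confidence:.2f})")
```

Sorting by confidence puts the entries most likely to be wrong at the top of the reviewer's queue, so limited checking time goes where it matters most.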
Impact
This project produced structured datasets from raw, difficult-to-read scans, supporting economic historians and public sector analysis. The system now also forms part of our standard pipeline for tabulated manuscript data.