Digitising Customs Archives: The CUST-3 Project (1697–1780)
With the London School of Economics and the University of Oxford

Challenge

CUST-3 comprises thousands of pages of historical customs and excise records held at the UK National Archives. These handwritten tables use archaic spellings, abbreviations, and irregular formatting, which makes both layout analysis and handwritten text recognition (HTR) difficult.

Solution

Osiris-AI built a custom system that:

  • Identifies and segments manuscript tables with a bespoke AI model, achieving ~90% accuracy
  • Transcribes the handwriting with a bespoke HTR model, achieving ~95% accuracy
  • Extracts structured data from inconsistent handwriting and table layouts
  • Highlights likely transcription errors
  • Enables rapid human validation of questionable entries through our new cloud-based tool, Horus
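
The last two steps, highlighting likely errors and queuing questionable entries for human validation, can be illustrated with a minimal sketch. It assumes that each transcribed cell carries a confidence score between 0 and 1 and that a fixed threshold decides what goes to a reviewer. The `Cell` type, the `flag_for_review` function, the threshold value, and the sample entries are all hypothetical and are not taken from the CUST-3 pipeline or from Horus.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One transcribed table cell (hypothetical structure for illustration)."""
    row: int
    column: str
    text: str
    confidence: float  # assumed per-cell score in [0, 1]


def flag_for_review(cells, threshold=0.8):
    """Return cells scoring below the threshold, least confident first."""
    flagged = [c for c in cells if c.confidence < threshold]
    return sorted(flagged, key=lambda c: c.confidence)


# Made-up sample entries, not real CUST-3 data.
cells = [
    Cell(0, "commodity", "Callicoes", 0.97),
    Cell(0, "value", "1204", 0.62),
    Cell(1, "commodity", "Tobacco", 0.91),
    Cell(1, "value", "3l8", 0.41),
]

review_queue = flag_for_review(cells)
for cell in review_queue:
    print(f"row {cell.row}, {cell.column}: {cell.text!r} ({cell.confidence:.2f})")
```

Sorting the queue by ascending confidence puts the most doubtful readings in front of the reviewer first, which is one plausible way to make validation fast. A production system could equally rank by other signals, such as column totals that fail to add up.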

Impact

This project produced structured datasets from raw, difficult-to-read scans, supporting economic historians and public sector analysis. The system now also forms part of our standard pipeline for tabulated manuscript data.