Challenge
Capturing accurate data from over 70,000 pages of the English Port Books series (TNA E190) posed a major challenge. These handwritten records of shipping and trade, written in English and Latin, vary widely in style, layout, and legibility, which makes them extremely difficult to access or digitise at scale.
Solution
Osiris-AI delivered a complete image-to-data workflow:
- Supplied specialist archival camera kits for high-resolution capture and onsite training
- Built multilingual HTR models, achieving over 90% transcription accuracy
- Developed a custom tagging model combining AI and regex rules for data structuring
- Parsed and structured complex entries for research-ready export
- Delivered clean, tabular datasets ready for historical analysis
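To illustrate the regex side of the tagging and parsing steps above, here is a minimal Python sketch. It is not Osiris-AI's actual model: the entry wording, the `ENTRY_RE` pattern, the `Observation` fields, and the `parse_entry` helper are all hypothetical, and real Port Book entries vary far more than one pattern can capture. The idea is only that a rule can turn a transcribed line into named fields, while lines the rule cannot handle are left for a model or a human reviewer.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern for ONE invented entry layout; an assumption for
# illustration, not the format of the real E190 records.
ENTRY_RE = re.compile(
    r"^The (?P<ship>[A-Z][\w ]+?) of (?P<home_port>[A-Z]\w+), "
    r"(?P<master>[A-Z][\w ]+?) master, "
    r"(?P<direction>from|for) (?P<port>[A-Z]\w+): "
    r"(?P<cargo>.+)$"
)


@dataclass
class Observation:
    """One structured row extracted from a transcribed entry (hypothetical schema)."""
    ship: str
    home_port: str
    master: str
    direction: str  # "from" = arriving, "for" = departing
    port: str
    cargo: List[str]


def parse_entry(line: str) -> Optional[Observation]:
    """Return an Observation if the line fits the pattern, else None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        # No rule matched: leave the line for a model or manual review.
        return None
    # Cargo items are assumed to be separated by semicolons.
    cargo_text = match.group("cargo").rstrip(".")
    cargo = [item.strip() for item in cargo_text.split(";") if item.strip()]
    return Observation(
        ship=match.group("ship"),
        home_port=match.group("home_port"),
        master=match.group("master"),
        direction=match.group("direction"),
        port=match.group("port"),
        cargo=cargo,
    )


if __name__ == "__main__":
    sample = ("The Swallow of Bristol, John Smith master, "
              "from Bordeaux: 20 tuns wine; 4 cwt prunes.")
    print(parse_entry(sample))
```

Returning `None` for unmatched lines keeps the rule conservative: it never guesses at fields it cannot see, which suits a hybrid setup where rules cover the regular entries and a learned tagger covers the rest.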
Impact
Over 70,000 pages were digitised at The National Archives (UK), producing one of the most extensive datasets on early modern English trade. The final output comprised 43 million words and over 1 million structured observations tracking ships, cargoes, ports, and crew, unlocking vital data for maritime, economic, and social historians.