Arabic manuscript OCR pipeline
Rebuilt a historical Arabic handwriting recognition workflow after the cataloging team found the first vendor's output untrustworthy. Added per-line confidence scores and human-in-the-loop correction, and preserved the original diacritics.
- Collection size: ~120K folios
- Scripts: Naskh, Maghrebi, Ruqʿah
- Outcome: 3x cataloger throughput
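
As a rough illustration of how per-line confidence can drive human-in-the-loop correction, the sketch below routes low-confidence lines to a review queue and flags any processing step that drops diacritics. It is not the project's actual code: the `OcrLine` record, the 0.85 threshold, and all function names are hypothetical, and it assumes the recognizer reports a confidence between 0 and 1 for each line.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical cutoff; a real pipeline would tune this per script and per collection.
REVIEW_THRESHOLD = 0.85


@dataclass
class OcrLine:
    """One recognized line (hypothetical record shape)."""
    folio_id: str
    line_no: int
    text: str
    confidence: float  # assumed to be in [0.0, 1.0]


def split_for_review(lines, threshold=REVIEW_THRESHOLD):
    """Return (accepted, queued): lines below the threshold go to human review."""
    accepted = [ln for ln in lines if ln.confidence >= threshold]
    queued = [ln for ln in lines if ln.confidence < threshold]
    return accepted, queued


def count_diacritics(text):
    """Count nonspacing combining marks (Unicode category Mn).

    Arabic harakat such as fatha, kasra, damma, shadda, and sukun
    all fall in this category.
    """
    return sum(1 for ch in text if unicodedata.category(ch) == "Mn")


def dropped_diacritics(original, processed):
    """True if a processing step left fewer combining marks than it received."""
    return count_diacritics(processed) < count_diacritics(original)


if __name__ == "__main__":
    lines = [
        OcrLine("folio-0001", 1, "\u0643\u064E\u062A\u064E\u0628\u064E", 0.95),
        OcrLine("folio-0001", 2, "\u0643\u062A\u0628", 0.60),
    ]
    accepted, queued = split_for_review(lines)
    print(len(accepted), "accepted;", len(queued), "queued for review")
```

Keeping the diacritics check separate from the confidence routing means it can run after any normalization or export step, not only after recognition.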