Research

Publications from the practice.

We publish what the fieldwork teaches us — grounded in real collections, reviewed by practitioners.

In progress

Four papers in preparation.

Drafted internally, under review, not yet public.

Forthcoming

OCR & HTR

Error-tolerant OCR for historical Arabic archival documents

A case study on per-line confidence, cataloger review, and the cost of trust in archival OCR workflows. Draws on 120K folios across three historical scripts.

Arabic · HTR · Error tolerance

Forthcoming

Description

DACS-compliant archival description at scale via LLMs

How to generate scope notes, biographical sketches, and container lists that pass institutional review, and what fails silently when you don't design for that review.

DACS · LLM · Finding aids

Forthcoming

Discovery

Cross-modal archival search across text, audio, and video

A unified discovery layer over five institutional catalogs without schema migration. Translation happens at query time, not at ingest.

Search · Cross-modal · Multi-institutional

Forthcoming

Principles

The error-tolerance principle for archival AI

A design principle for AI systems that process records: assume the first pass is wrong in known ways, and keep its output revisable without losing provenance.

AI · Archival ethics · Error tolerance

Editorial

Why we publish slowly.

We don't publish for citation count. Every paper is grounded in real fieldwork, reviewed by practitioners who catalog records for a living, and held back until the claims survive that review.

Collaborate

Have a research question?

We take a small number of research collaborations per year. Tell us what you're working on.