Survos Ledger Bundle

Template-aware extraction for typed or printed archival records with handwritten values.

LedgerSpec describes the known structure of a document before AI extraction runs: page fields, fixed regions, repeated tables, column types, row counts, and normalization rules. The extraction task should fill a known structure from prior OCR/layout results rather than ask a model to invent a table.
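
As a rough illustration of what such a spec could look like, here is a minimal sketch of JSON-serializable DTOs. The class names other than LedgerSpec, and all property names and sample values, are assumptions made for this example and are not taken from the bundle's source; fixed regions are left out for brevity.

```php
<?php
declare(strict_types=1);

// Hypothetical DTOs for illustration only; the bundle's real classes may differ.
final class ColumnSpec implements JsonSerializable
{
    public function __construct(
        public readonly string $name,
        public readonly string $type,               // e.g. "string" or "integer"
        public readonly ?string $normalizer = null, // name of a normalization rule, if any
    ) {}

    public function jsonSerialize(): array
    {
        return ['name' => $this->name, 'type' => $this->type, 'normalizer' => $this->normalizer];
    }
}

final class TableSpec implements JsonSerializable
{
    /** @param ColumnSpec[] $columns */
    public function __construct(
        public readonly string $name,
        public readonly int $rowCount, // number of rows printed on the form
        public readonly array $columns,
    ) {}

    public function jsonSerialize(): array
    {
        return ['name' => $this->name, 'rowCount' => $this->rowCount, 'columns' => $this->columns];
    }
}

final class LedgerSpec implements JsonSerializable
{
    /**
     * @param string[]    $pageFields page-level fields such as a page number
     * @param TableSpec[] $tables     repeated tables on the page
     */
    public function __construct(
        public readonly string $code,
        public readonly array $pageFields,
        public readonly array $tables,
    ) {}

    public function jsonSerialize(): array
    {
        return ['code' => $this->code, 'pageFields' => $this->pageFields, 'tables' => $this->tables];
    }
}

// Sample values are made up and do not describe a real template.
$spec = new LedgerSpec(
    'example-ledger',
    ['page_number', 'enumeration_date'],
    [new TableSpec('rows', 40, [
        new ColumnSpec('name', 'string'),
        new ColumnSpec('age', 'integer', 'digits_only'),
    ])],
);

// Nested JsonSerializable objects are serialized recursively by json_encode().
echo json_encode($spec, JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping the spec as plain readonly objects means a template can be stored as JSON and rebuilt by a codec without any AI call being involved.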

Initial scope:

JSON-serializable PHP DTOs for LedgerSpec.
A template registry and codec.
A first draft of the us-census-1870-schedule-1 template.
An extract_ledger task for survos/ai-pipeline-bundle that consumes prior OCR/layout output.

Example pipeline entry:

{
  "url": "file:///path/to/sample.png",
  "title": "1870 census page",
  "pipeline": ["ocr_mistral", "layout", "extract_ledger"],
  "ledger_template": "us-census-1870-schedule-1"
}

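extract_ledger is intentionally prior-results-first: it works from the output of earlier OCR and layout tasks rather than from the page image. Whole-page vision fallback and crop-level verification of low-confidence values should be separate, later tasks.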
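
The prior-results-first idea can be sketched as a function that fills a table of known size from OCR cells and records which cells had no usable text. This is only a sketch under stated assumptions: the function name fillTable, the shape of the OCR input, and the digits-only normalization are all hypothetical and are not the task's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Fill a fixed-size table from prior OCR cells (hypothetical shapes).
 *
 * @param array<string, string>             $columns  column name => type ("string" or "integer")
 * @param array<int, array<string, string>> $ocrCells row index => [column name => raw text]
 * @return array{rows: array, missing: array}
 */
function fillTable(array $columns, int $rowCount, array $ocrCells): array
{
    $rows = [];
    $missing = [];
    for ($r = 0; $r < $rowCount; $r++) {
        $row = [];
        foreach ($columns as $name => $type) {
            $raw = trim($ocrCells[$r][$name] ?? '');
            if ($type === 'integer') {
                // Assumed normalization rule: keep digits only.
                $raw = (string) preg_replace('/\D+/', '', $raw);
            }
            if ($raw === '') {
                // No usable OCR text: leave the cell empty and record it.
                $row[$name] = null;
                $missing[] = "$r.$name";
                continue;
            }
            $row[$name] = $type === 'integer' ? (int) $raw : $raw;
        }
        $rows[] = $row;
    }
    return ['rows' => $rows, 'missing' => $missing];
}

// Made-up OCR output: row 1 has no age, row 2 was not detected at all.
$result = fillTable(
    ['name' => 'string', 'age' => 'integer'],
    3,
    [
        0 => ['name' => ' Doe, Jane ', 'age' => '34.'],
        1 => ['name' => 'Doe, John'],
    ],
);

echo json_encode($result, JSON_PRETTY_PRINT), PHP_EOL;
```

The output always has exactly the number of rows and columns the spec declares, so nothing is invented by the extraction step. The list of missing cells is the kind of information a later fallback or verification task could pick up.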

