Create a standardized JSON/CSV output schema for extracted metrics (species, count, stomach content, location, year, etc.), so future extraction work and database integration are consistent and debuggable.
This schema applies at both hand-offs in the pipeline:
Classifier → Extraction pipeline
Extraction pipeline → Storage (CSV/JSON/DB)
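As a starting point for discussion, one record of the schema could be sketched as a Python dataclass that serializes to both JSON and CSV. The metric fields come from the list above; the class name `ExtractionRecord`, the page-mapping field names `source_file` and `source_page`, and the choice of which fields are optional are assumptions for illustration, not an agreed format.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class ExtractionRecord:
    # Metrics named in the task description.
    species: str
    count: Optional[int] = None
    stomach_content: Optional[str] = None
    location: Optional[str] = None
    year: Optional[int] = None
    # Hypothetical page-mapping fields; they may stay empty at first.
    source_file: Optional[str] = None
    source_page: Optional[int] = None


def to_json(records: List[ExtractionRecord]) -> str:
    """Serialize records as a JSON array; unset fields become null."""
    return json.dumps([asdict(r) for r in records], indent=2)


def to_csv(records: List[ExtractionRecord]) -> str:
    """Serialize records as CSV with a fixed column order; unset fields are empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(ExtractionRecord)])
    writer.writeheader()
    for r in records:
        writer.writerow(asdict(r))
    return buf.getvalue()
```

Because both writers derive their columns from the same dataclass, the JSON and CSV outputs cannot drift apart, and a new field added to the class shows up in both formats.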
Acceptance Criteria
- All extraction outputs follow the agreed schema format
- Running the extraction on a real example PDF produces a valid structured output file
- Page-mapping fields exist in the schema, even if they are only partially populated at first
- Future extraction features plug in without changing downstream tooling
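The first and last criteria could be checked with a small validator along these lines. This is a minimal sketch under stated assumptions: the field names and types, and the rule that only `species` is required, are hypothetical placeholders until the schema is agreed. Unknown fields are deliberately ignored so that new extraction features can add data without breaking existing consumers.

```python
from typing import Any, Dict, List

# Hypothetical field specification; replace with the agreed schema.
REQUIRED_FIELDS = {"species": str}
OPTIONAL_FIELDS = {
    "count": int,
    "stomach_content": str,
    "location": str,
    "year": int,
    "source_file": str,
    "source_page": int,
}


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        value = record.get(name)
        if value is None:
            errors.append(f"missing required field: {name}")
        elif not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    for name, expected in OPTIONAL_FIELDS.items():
        value = record.get(name)
        # Optional fields may be absent or null (e.g. partial page mapping).
        if value is not None and not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    # Fields outside the specification are ignored for forward compatibility.
    return errors
```

Returning messages rather than raising on the first problem keeps the output debuggable: a single run over an extracted file can report every malformed record at once.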