core: add map-reduce ingestion engine #47

Open
y-71 wants to merge 2 commits into flat-map-ingestion from map-reduce-ingestion


y-71 commented Mar 1, 2026

Summary

  • Adds ReduceFn and MapReduceOptions types to core/types/ingest.ts
  • Implements mapReduceSections — bottom-up tree walk: leaves are mapped, parents reduce over child metadata
  • Large leaves exceeding maxSectionSize are chunked on paragraph boundaries before mapping
  • Concurrency semaphore acquires per-call (no deadlock at concurrency: 1)
  • 6 tests covering flat leaves, nested reduce, chunking, concurrency bounds, empty content, empty array
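
The bottom-up walk and the per-call semaphore described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `RawSection` fields, the `MapFn` type, the `Semaphore` class, and the exact shapes of `ReduceFn` and `MapReduceOptions` are assumptions made for the example.

```typescript
// Hypothetical shapes; the real types in core/types/ingest.ts may differ.
interface RawSection {
  title: string;
  content: string;
  children: RawSection[];
}

type MapFn<M> = (section: RawSection) => Promise<M>;
type ReduceFn<M> = (section: RawSection, childMeta: M[]) => Promise<M>;

interface MapReduceOptions<M> {
  map: MapFn<M>;
  reduce: ReduceFn<M>;
  concurrency: number;
}

// Minimal counting semaphore (illustrative, not the PR's implementation).
class Semaphore {
  private permits: number;
  private waiters: Array<() => void> = [];

  constructor(permits: number) {
    this.permits = permits;
  }

  private async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) next(); // hand the permit directly to the next waiter
    else this.permits++;
  }

  // Hold a permit only for the duration of a single call.
  async run<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await fn();
    } finally {
      this.release();
    }
  }
}

async function mapReduceSections<M>(
  sections: RawSection[],
  opts: MapReduceOptions<M>,
): Promise<M[]> {
  const sem = new Semaphore(opts.concurrency);

  const visit = async (s: RawSection): Promise<M> => {
    if (s.children.length === 0) {
      return sem.run(() => opts.map(s)); // leaf: map
    }
    // No permit is held while waiting on children, so a parent can never
    // starve its own subtree, even when concurrency is 1.
    const childMeta = await Promise.all(s.children.map(visit));
    return sem.run(() => opts.reduce(s, childMeta)); // parent: reduce
  };

  return Promise.all(sections.map(visit));
}
```

If the permit were instead held for the whole visit of a node, a parent at `concurrency: 1` would wait forever on children that can never acquire a permit. Wrapping only the individual `map` and `reduce` calls avoids that.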

Supersedes #44 (which targeted the stale refactor branch).
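
For reviewers, the paragraph-boundary chunking mentioned in the summary might look something like the sketch below. The helper name `chunkOnParagraphs` is hypothetical, and two behaviours are assumptions rather than facts about this PR: `maxSectionSize` is measured in characters, and a single paragraph longer than the limit is kept whole instead of being split mid-paragraph.

```typescript
// Hypothetical helper, not taken from the PR.
// Assumption: maxSectionSize counts characters.
function chunkOnParagraphs(content: string, maxSectionSize: number): string[] {
  if (content.length <= maxSectionSize) return [content];

  // Paragraphs are separated by one or more blank lines.
  const paragraphs = content
    .split(/\n{2,}/)
    .filter((p) => p.trim().length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const p of paragraphs) {
    const candidate = current === "" ? p : current + "\n\n" + p;
    if (current === "" || candidate.length <= maxSectionSize) {
      // Either the chunk is still empty (an oversized paragraph is kept
      // whole, by assumption) or the paragraph still fits.
      current = candidate;
    } else {
      chunks.push(current);
      current = p;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each resulting chunk would then be passed to the map function in place of the original oversized leaf.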

Test plan

  • bun run typecheck — clean
  • bun test — 12/12 tests pass (6 flat-map + 6 map-reduce)

y-71 and others added 2 commits March 1, 2026 01:34
Bottom-up map/reduce over section trees: leaves get mapped, parents
reduce over child metadata. Large leaves are chunked on paragraph
boundaries before mapping. Concurrency-bounded semaphore prevents
deadlocks at concurrency=1.

Parse markdown into a RawSection[] tree using mdast-util-from-markdown.
Handles preamble content, nested headings, level skips, and correctly
ignores headings inside fenced code blocks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
