Streaming markdown re-parses entire document on every update — block-level diffing would be a major perf win #391

@Mile-Away

Summary

When an app feeds a growing markdown string to <EnrichedMarkdownText markdown={…} streamingAnimation> (e.g. an LLM chat client), the native side appears to re-parse the full string on every prop update. For a multi-paragraph reply where only the trailing block is changing, each chunk costs O(N) in the full document length N, both in md4c work and in the diff/commit needed by the renderer. Summed over a whole reply that is roughly quadratic work, and it shows up as JS-thread-driven jank on lower-end devices once a message gets long.

Repro

  1. Render <EnrichedMarkdownText markdown={text} streamingAnimation />.
  2. Update text ~60 times/sec by appending characters to the last paragraph (typical LLM streaming cadence).
  3. Profile the native UI thread. Per-update parsing/layout time grows linearly with the full document length, not with the size of the appended delta.
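
To make step 2 deterministic, the sequence of `markdown` values can be precomputed instead of driven by a live stream. This is a minimal sketch; `growingPrefixes` is a hypothetical helper name, and feeding its output to the component's `markdown` prop at a fixed interval is left to the repro app.

```typescript
// Hypothetical repro helper: returns every intermediate value the `markdown`
// prop would take while `full` streams in, `chunkChars` characters at a time.
function growingPrefixes(full: string, chunkChars: number): string[] {
  const updates: string[] = [];
  for (let end = chunkChars; end < full.length; end += chunkChars) {
    updates.push(full.slice(0, end)); // each update is a strict append of the previous one
  }
  updates.push(full); // final update carries the complete message
  return updates;
}

const reproUpdates = growingPrefixes("# Title\n\nFirst paragraph.\n\nSecond one.", 4);
console.log(reproUpdates.length, "updates");
```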

The library's own streamingAnimation doc string says only the tail (new characters) is animated, but parsing itself isn't scoped to the tail.

Why it matters

Most LLM streaming UIs are append-only at the block level: earlier paragraphs, lists, and code fences are effectively immutable once the stream has moved past the blank line (\n\n) that closes them, and only the trailing block grows. Once the message is 5 KB+, re-parsing all 5 KB on every chunk becomes the dominant cost.
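
A rough back-of-the-envelope model of that cost, counting only characters handed to the parser. The numbers are illustrative assumptions rather than measurements: a 5000-character message, 10-character chunks, and a trailing block capped at 300 characters. Both function names are hypothetical.

```typescript
// Characters parsed when every update re-parses the whole string so far.
function charsParsedFullReparse(totalChars: number, chunkChars: number): number {
  let parsed = 0;
  for (let len = chunkChars; len <= totalChars; len += chunkChars) {
    parsed += len;
  }
  return parsed;
}

// Characters parsed when only the trailing block (at most `tailChars` long,
// an assumed cap) is re-parsed on each update.
function charsParsedTailOnly(totalChars: number, chunkChars: number, tailChars: number): number {
  let parsed = 0;
  for (let len = chunkChars; len <= totalChars; len += chunkChars) {
    parsed += Math.min(len, tailChars);
  }
  return parsed;
}

console.log(charsParsedFullReparse(5000, 10)); // 1252500
console.log(charsParsedTailOnly(5000, 10, 300)); // 145650
```

Under these assumptions the full re-parse touches about 8.6x more characters, and the gap widens as the message grows because the first total is quadratic in message length while the second is linear.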

JS-side workarounds are unappealing:

  • Splitting "stable head / streaming tail" at the last \n\n in JS requires reimplementing md4c's block boundary logic (lists, fenced code, blockquotes, tables, HTML blocks, link reference definitions). Brittle and easy to get wrong on partial syntax.
  • Using useDeferredValue on the JS side reduces commit frequency but doesn't change the per-commit cost.
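
To illustrate why the first workaround is brittle, here is a minimal sketch of the naive split and one input that breaks it. `naiveSplit` and `hasUnclosedFence` are hypothetical helpers; the fence check is itself a simplification (it ignores `~~~` fences, indentation, and fence length), which is part of the point.

```typescript
// Naive "stable head / streaming tail" split at the last blank line.
function naiveSplit(markdown: string): { head: string; tail: string } {
  const idx = markdown.lastIndexOf("\n\n");
  if (idx === -1) return { head: "", tail: markdown };
  return { head: markdown.slice(0, idx + 2), tail: markdown.slice(idx + 2) };
}

// Simplified check: an odd number of ``` lines means a fence is still open.
function hasUnclosedFence(fragment: string): boolean {
  const fenceLines = fragment.split("\n").filter((line) => line.startsWith("```"));
  return fenceLines.length % 2 === 1;
}

// The last blank line sits inside a code fence that is still streaming.
const streamed = "Intro paragraph.\n\n```ts\nconst a = 1;\n\nconst b = 2;";
const split = naiveSplit(streamed);

// The supposedly stable head ends inside an open fence, and the tail would be
// parsed as a plain paragraph instead of code.
console.log(hasUnclosedFence(split.head)); // true
console.log(JSON.stringify(split.tail)); // "const b = 2;"
```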

The library, on the other hand, already drives md4c and owns the resulting AST, so it can find block boundaries correctly.

Proposed shape

Two non-exclusive options:

  1. Implicit incremental parse: when markdown only grows (the new value is the previous value plus an appended suffix), keep the previous AST, find the last stable block boundary, and re-parse only from there.
  2. Explicit streaming API: streamingConfig={{ appendMode: true }} or a sibling component (<EnrichedMarkdownTextStreaming>) that exposes an append(delta: string) imperative method, so the consumer can promise "this is a strict append" and the lib can skip the prefix-equality check.
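
A minimal sketch of how option 1 could work, written in TypeScript for readability even though the real change would live on the native side. Everything here is hypothetical: `parseBlocks` is a stand-in that splits on blank lines only (a real implementation would take block boundaries from md4c), and `IncrementalParser` is an illustrative name, not an existing API.

```typescript
type Block = { start: number; end: number; source: string };

// Stand-in parser (assumption): top-level blocks are separated by blank lines.
function parseBlocks(markdown: string, offset = 0): Block[] {
  const blocks: Block[] = [];
  const separator = /\n{2,}/g;
  let start = 0;
  let match: RegExpExecArray | null;
  while ((match = separator.exec(markdown)) !== null) {
    if (match.index > start) {
      blocks.push({ start: offset + start, end: offset + match.index, source: markdown.slice(start, match.index) });
    }
    start = match.index + match[0].length;
  }
  if (start < markdown.length) {
    blocks.push({ start: offset + start, end: offset + markdown.length, source: markdown.slice(start) });
  }
  return blocks;
}

class IncrementalParser {
  private text = "";
  private blocks: Block[] = [];
  reparsedChars = 0; // bookkeeping for the demo only

  update(next: string): Block[] {
    if (this.blocks.length > 0 && next.startsWith(this.text)) {
      // Strict append: keep every block before the trailing one and
      // re-parse from the start of the trailing block onward.
      const last = this.blocks[this.blocks.length - 1];
      const tail = next.slice(last.start);
      this.blocks = [...this.blocks.slice(0, -1), ...parseBlocks(tail, last.start)];
      this.reparsedChars += tail.length;
    } else {
      // Anything else (edit, reset, first update): fall back to a full parse.
      this.blocks = parseBlocks(next);
      this.reparsedChars += next.length;
    }
    this.text = next;
    return this.blocks;
  }
}

const sample = "First paragraph.\n\nSecond paragraph.\n\nThird one grows";
const incremental = new IncrementalParser();
let latest: Block[] = [];
for (let i = 1; i <= sample.length; i++) latest = incremental.update(sample.slice(0, i));
console.log(latest.length, "blocks,", incremental.reparsedChars, "chars re-parsed");
```

The `startsWith` call is the prefix-equality check that option 2 would let the library skip. It is still O(N) per update, but it is a plain memory comparison rather than a parse plus diff.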

Option 1 is friendlier for adoption (drop-in, no API change) and covers the LLM streaming case. Option 2 is faster but pushes API design onto the consumer.

Environment

  • react-native-enriched-markdown 0.6.0
  • React Native 0.7x (New Architecture)
  • iOS 17/18 + Android API 33+

Happy to contribute a PR if there's interest in either of these approaches — would appreciate maintainer guidance on which direction (implicit vs explicit) you'd prefer before I start.
