
feat: renderComposition — Argo × Hyperframes crossover bridge #15

Merged
shreyaskarnik merged 9 commits into main from feat/render-composition on May 7, 2026

Conversation

@shreyaskarnik (Owner)

Summary

Adds renderComposition — a primitive that mounts a self-contained HTML composition (Hyperframes-shaped) as a scene in an Argo demo, alongside recorded scenes from the real running app. Establishes the strategic split between the two tools and ships everything needed to use them together.

The crossover thesis: Argo records the real running app (its USP); compositions frame it (Hyperframes' USP). This branch ships the bridge that lets a single demo mix both in one timeline.

What's new

Core primitive

  • `renderComposition(page, narration, htmlPath, opts)` — loads a self-contained composition, polls for `window.__timelines[scene]` registration (matches Hyperframes' shape exactly), marks the scene, plays the timeline, holds for `data-duration`. Auto-starts `narration.startRecording` if the demo hasn't already, so the recording clock anchors at the first animated frame instead of browser launch.
  • `readCompositionDuration(htmlPath)` helper for ahead-of-time duration lookups.
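The poll-for-registration step can be sketched as a small generic helper. This is illustrative only — the real logic lives in `src/composition.ts`, and the timeout and interval values here are assumptions:

```typescript
// Illustrative poll helper, not the actual src/composition.ts implementation.
// Resolves once a getter returns a value (e.g. window.__timelines[scene]
// appearing after a block's async onReady()), or throws on timeout.
async function pollFor<T>(
  get: () => T | undefined,
  { timeoutMs = 15_000, intervalMs = 50 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = get();
    if (value !== undefined) return value;
    if (Date.now() > deadline) throw new Error('timeline never registered');
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Polling before marking the scene is what keeps GLTF/DRACO warmup out of the recorded scene time.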

Composition runtime infrastructure

  • `src/composition-server.ts` — per-scene HTTP server rooted at the project, with `<base href="/">` injection so compositions resolve sibling assets (textures, GLTF, fonts) via project-relative URLs. Replaces `file://` (chromium blocks file:// → file:// fetches for some asset types).
  • Composition contract: `data-composition-id` + `data-duration` + paused GSAP master timeline at `window.__timelines[scene]` + optional `window.__compositionReady` Promise. Identical to Hyperframes' shape so a composition that runs in Hyperframes runs unchanged in Argo.
  • CanvasDrawElement plumbing: `video.experimentalCanvasDrawElement: true` + `video.browserChannel: 'chrome-canary'` for blocks that need Hyperframes' html-in-canvas API (3D / GLTF / WebGL).
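A minimal composition satisfying the contract above might look like the following sketch. The in-tree `compositions/intro.html` is the canonical sample; the dimensions and scene name here are illustrative, and GSAP is assumed to be loaded already:

```html
<!-- Sketch of the contract only; compositions/intro.html in this branch is
     the canonical sample. -->
<div data-composition-id="intro" data-width="1920" data-height="1080" data-duration="3.4">
  <h1 class="title">Argo</h1>
</div>
<script>
  // Paused master timeline registered under the scene name; renderComposition
  // polls for this, marks the scene, then plays it.
  window.__timelines = window.__timelines || {};
  const tl = gsap.timeline({ paused: true });
  tl.from(".title", { opacity: 0, y: 24, duration: 0.8 });
  window.__timelines["intro"] = tl;
  // Optional readiness gate.
  window.__compositionReady = document.fonts.ready;
</script>
```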

Audio sidecar

  • Auto-detects `<audio>` children in compositions, writes `.argo/<demo>/.composition-audio.jsonl`, the pipeline reads it, and ffmpeg amixes each track at scene start with the TTS narration. Required because CDP screencast captures video frames only, not browser audio output.
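A minimal reader for the sidecar's JSONL shape can be sketched as below. The entry fields follow the shape documented in this PR; the function itself is illustrative (the real reader is `readCompositionAudioSidecar()` in `src/pipeline.ts`):

```typescript
import { readFileSync } from 'node:fs';

// Entry shape per the sidecar description; one JSON object per line.
interface SidecarEntry {
  scene: string;
  src: string;       // absolute path to the audio file
  startMs: number;   // scene-mark wallclock offset
  durationMs: number;
}

// Illustrative reader, not the real src/pipeline.ts implementation.
function readSidecar(path: string): SidecarEntry[] {
  return readFileSync(path, 'utf8')
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as SidecarEntry);
}
```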

Skill guidance for AI agents

  • New `references/compositions.md` (279 lines): when to reach for a composition vs stay with recording, hand-roll vs import-from-catalog decision rule, contract details, mixed-demo patterns, audio sidecar, common pitfalls.
  • SKILL.md inline section "Mixing in Compositions" + `renderComposition` in the Core APIs table.
  • Decision rule: recording is the default, compositions frame it. Generic polish (logos, titles, lower-thirds) → import from Hyperframes catalog. Product-specific concepts (wedge thesis, custom comparisons) → hand-roll.

Demos / proof

  • `compositions/intro.html` — minimal hand-rolled composition following the contract.
  • `demos/composition-intro` — single-scene proof: 3.4s with GSAP-animated content correctly captured.
  • `demos/composition-trio` — chains `apple-money-count` (5s GSAP counter) + `blue-sweater-intro-video` (12s creator intro) imported via `npx hyperframes add`. 17.2s output with both blocks rendering correctly + audio mixed in via the sidecar.
  • `demos/argo-launch` (Plan A) — 15.1s crossover teaser: hand-rolled intro composition → recorded showcase hero (with spotlight + focus ring on the live page) → hand-rolled outro composition. Single `argo pipeline` invocation, h264 + aac + chapters.
  • `demos/composition-iphone` — Hyperframes' `vfx-iphone-device` (3D iPhone + MacBook with html-in-canvas screen content + morphing glass lens) renders correctly through Argo's pipeline on Chrome Canary with the CanvasDrawElement flag wired.

What's deliberately out of scope (follow-ups)

  • Plan B: the canonical Argo launch video — replace the existing showcase's hand-rolled hero + CTA with composition scenes, keep the recorded middle. Tracked as the next milestone; it will follow the patterns this branch ships.
  • Word-level STT for narration (project_word_level_stt.md memory) — Hyperframes' website-to-hyperframes pipeline runs whisper on TTS output for word-level alignment. Argo could adopt the same pattern in v0.38 for per-word captions and composition sync to specific words.
  • Wrap Argo recording inside a 3D Hyperframes block (e.g., recording-on-iPhone-screen via `__argoVideoSrc` + a fork of `vfx-iphone-device`). Needs a Hyperframes-side PR or an Argo-shipped adapter block.

Test plan

  • `npm test` — 649 tests pass (4 new tests for `readCompositionDuration`)
  • `npm run build` clean
  • `composition-intro` demo: 3.4s mp4, GSAP content visible at midpoint
  • `composition-trio` demo: 17.2s mp4 with audio mix from both blocks (verified the `sfx-production.wav` SFX plays in the final output)
  • `argo-launch` demo: 15.1s mp4 with intro composition + recorded showcase hero + outro composition + audio
  • `composition-iphone` demo: 16s mp4 with 3D iPhone + html-in-canvas content rendered correctly on Chrome Canary
  • Hyperframes-installed assets are gitignored; install commands documented in each demo's header
  • CI matrix on chromium/firefox/webkit (compositions-only paths run on chromium; hyperframes-CLI install only happens on Canary path)

Compatibility

No breaking changes. `renderComposition` is purely additive; existing recorded demos work identically. The new config options (`experimentalCanvasDrawElement`, `browserChannel`, `outputWidth`/`outputHeight`/`watermark.scale` from earlier merges) all default to current behavior when unset.

Branch state — 9 commits stacked

```
docs(skill): add hand-roll vs import-from-catalog decision rule
docs(skill): teach when to reach for a composition over a recording
demo(composition): argo-launch — Argo × Hyperframes crossover (Plan A)
feat(composition): audio sidecar — hyperframes SFX in final mp4
demo(composition): trio of stable-chromium hyperframes blocks
chore: gitignore hyperframes-installed assets
experiment(composition): HTTP server + timeline polling + WebGL — iPhone block renders
experiment(composition): wire CanvasDrawElement flags + import hyperframes block
feat(composition): renderComposition primitive (Argo × Hyperframes bridge)
```

feat(composition): renderComposition primitive (Argo × Hyperframes bridge)

Adds the embedding seam between Argo's recording pipeline and Hyperframes-
style compositional rendering. Argo demos can now mix recorded scenes
with self-contained HTML compositions in the same narration timeline.

Composition contract follows hyperframes' shape exactly so a composition
that runs in hyperframes runs unchanged in Argo:

  * Root: <div data-composition-id="X" data-width data-height data-duration>
  * Optional `window.__compositionReady` Promise gate
  * Optional `window.__timelines[scene]` paused GSAP master timeline that
    renderComposition resumes after marking the scene

Argo additions on top of hyperframes' contract:

  * `window.__argoVideoSrc` — set by renderComposition when `opts.videoSrc`
    is provided. Compositions consume this to embed an Argo recording as a
    `<video>` child of an html-in-canvas, then texture it onto a 3D device
    frame, apply WebGL effects, etc. (the bridge that the May 4 html-in-
    canvas probe identified — recording can't be the canvas-as-layout-
    subtree, but it can be a `<video>` inside one.)

Loads via file:// URL so relative asset paths in the composition (textures,
GLTF, fonts) resolve normally — `setContent` would break every external
reference. `readCompositionDuration(htmlPath)` lets demo scripts read the
declared duration without rendering, useful in `narration.durationFor()`.
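An ahead-of-time duration lookup along these lines is one way to read `data-duration` without rendering. The regex-based parse below is a sketch, not necessarily how `src/composition.ts` does it:

```typescript
import { readFileSync } from 'node:fs';

// Illustrative duration lookup; parses the declared data-duration attribute
// without launching a browser. The real helper lives in src/composition.ts.
function readDurationFromHtml(html: string): number {
  const match = html.match(/data-duration="([\d.]+)"/);
  if (!match) throw new Error('composition declares no data-duration');
  return Number(match[1]);
}

function readCompositionDurationSketch(htmlPath: string): number {
  return readDurationFromHtml(readFileSync(htmlPath, 'utf8'));
}
```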

Ships:

  * src/composition.ts + src/index.ts export
  * compositions/intro.html — minimal contract-conforming sample
    (gradient title, subtitle, GSAP master timeline, ready signal)
  * demos/composition-intro.demo.ts proves the primitive end-to-end
  * 4 tests covering readCompositionDuration

Smoke-verified: 3.4s mp4 output matches composition's data-duration,
midpoint frame shows GSAP-animated content correctly.

Roadmap: next step is a `compositions/iphone-frame.html` that wraps an
Argo recording inside a 3D iPhone via html-in-canvas + GLTF — the move
that turns this into a real argo×hyperframes crossover.
experiment(composition): wire CanvasDrawElement flags + import hyperframes block

Stacks on the renderComposition primitive. Adds the launch-flag plumbing
needed to render hyperframes blocks that depend on the WICG html-in-canvas
API, and probes the integration against a real imported block.

Surfaces:

* `video.experimentalCanvasDrawElement: true` — emits
  `--enable-features=CanvasDrawElement` to chromium launch args
* `video.browserChannel: 'chrome-canary'` — Playwright launches a
  system-installed channel instead of bundled chromium
* `record.allowFileAccessFromFiles` (auto-enabled when the canvas-draw
  flag is on) — emits `--allow-file-access-from-files` so file://
  pages can fetch sibling GLTF / textures

Imported the hyperframes `vfx-iphone-device` block via
`npx hyperframes add` (Node 22+ for the CLI; Argo runtime stays Node 20):
3D iPhone + MacBook GLTF models with html-in-canvas screen content +
camera choreography. Files committed under `compositions/` and `models/`.

Status: simple compositions render fully (tests/composition-intro pipeline
produces correct GSAP-animated output at 3.4s). The 3D iPhone block
renders to black frames — the block's API gate also checks for canvas
`layoutSubtree` attribute support (not just `drawElementImage`), and
likely also needs WebGL launch flags. Findings captured in the
project_html_in_canvas_watch.md memory note.

Next plumbing layer to make 3D blocks render: WebGL launch flags
(`--use-gl=angle --use-angle=swiftshader --enable-webgl
--ignore-gpu-blacklist`, matching Argo's existing shader-render), and
likely a temporary HTTP server for composition rendering instead of
relying on file:// URLs.
experiment(composition): HTTP server + timeline polling + WebGL — iPhone block renders

Layers atop the previous wire-up commit. Final result: hyperframes' real
`vfx-iphone-device` block (3D iPhone + MacBook GLTF + html-in-canvas screen
content + morphing glass lens) renders correctly through Argo's recording
pipeline.

Changes:

* `src/composition-server.ts` (new): per-scene HTTP server rooted at the
  project so compositions in `compositions/` resolve sibling assets in
  `models/`, `assets/`, etc. via path-relative URLs. Injects
  `<base href="/">` into the served HTML so the composition needs no
  modification. Replaces the prior `file://` URL approach (chromium
  blocks file:// fetches of sibling resources for some asset types).
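The `<base href="/">` injection amounts to a small transform over the served HTML. A sketch, assuming the tag is inserted immediately after `<head>` (the actual implementation is in `src/composition-server.ts`):

```typescript
// Illustrative <base href="/"> injection, not the real composition-server code.
// Inserting right after <head> ensures the base precedes any relative asset
// references (textures, GLTF, fonts) in the composition.
function injectBaseHref(html: string): string {
  return html.replace(/<head([^>]*)>/i, '<head$1><base href="/">');
}
```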

* `src/composition.ts`: poll for `window.__timelines[scene]` BEFORE
  marking the scene + playing. Hyperframes blocks register the timeline
  inside an async `onReady()` after GLTF/DRACO loads complete (5-12s).
  Marking before the timeline is registered burned that warmup as black
  scene time. Now the scene's first frame is the composition's first
  animated frame. `renderComposition` also calls `narration.startRecording`
  itself if the demo hasn't already, so the recording clock anchors at
  the readiness boundary instead of capturing browser launch.

* `src/narration.ts`: tiny `isRecording` getter so `renderComposition`
  can decide whether to call `startRecording` itself.

* `src/record.ts`: WebGL launch flags
  (`--use-gl=angle --use-angle=swiftshader --enable-webgl
  --ignore-gpu-blocklist`) auto-enabled together with
  `experimentalCanvasDrawElement`. Without these, headless Three.js
  in Canary doesn't initialize and the scene renders black.

Verified against the imported `vfx-iphone-device` block (installed via
`npx hyperframes add vfx-iphone-device` against Node 22's bundled CLI):
captured frames show the rotating 3D iPhone with HyperFrames UI on the
screen, floating morphing glass lens, dynamic lighting, floor reflection.

Remaining blank head (~5-7s after recording starts) is inside the
hyperframes block's own warmup — Three.js environment map + PMREM
cubemap prerender + initial GSAP frames before the camera reaches a
visible pose. Argo can't shave that further from outside; a clean fix
needs the block to expose `window.__compositionReady` resolving on
first-visible-frame instead of asset-load-complete. Worth a hyperframes
contribution.

Findings + corrected analysis captured in
project_html_in_canvas_watch.md.
chore: gitignore hyperframes-installed assets

Track the install command, not the artifacts. The hyperframes CLI is the
source of truth for these files and version-pins them; carrying our own
copy would fork the upstream.

* .gitignore now excludes models/, compositions/vfx-*, compositions/experimental-*
* demos/composition-iphone.demo.ts has the one-line install command in
  its header, including the Node 22 PATH override needed for the
  hyperframes CLI.
demo(composition): trio of stable-chromium hyperframes blocks

Validates `renderComposition` against three real catalog blocks that
don't need html-in-canvas / WebGL / Canary:

* `compositions/intro.html` (hand-rolled baseline, 3.4s) — already in tree
* `apple-money-count` (5s) — GSAP-only finance counter $0→$10K
* `blue-sweater-intro-video` (12s) — multi-stage AI creator intro card

Full crossover verified on stable chromium: each block renders correctly
through Argo's recording pipeline, durations match `data-duration`
exactly, both blocks chain back-to-back in a single demo (17.2s total).

Hyperframes-installed assets are gitignored:
  - models/ (GLTF + textures)
  - compositions/{apple-*,blue-sweater-*,vfx-*,experimental-*,components/}
  - assets/{joe-sai-avatar.png,sfx-production.wav,sfx/}

Install commands are in each demo's header. The hyperframes CLI versions
these assets with its own releases; pinning copies here would fork upstream.
feat(composition): audio sidecar — hyperframes SFX in final mp4

CDP screencast captures video frames only, not browser audio output, so
hyperframes blocks that play sound effects through `<audio>` children
(apple-money-count's sfx-production.wav, blue-sweater's intro music, etc.)
landed silently in Argo's exports.

Sidecar approach: `renderComposition` queries the loaded composition for
all `<audio>` elements, resolves their `src` attribute to absolute paths
via the composition server's serverRoot, and appends one entry per track
to `.argo/<demo>/.composition-audio.jsonl`:

  { scene, src: <abs path>, startMs: <scene mark wallclock>, durationMs }

The pipeline reads the sidecar after recording, builds an
`extraAudioTracks: Array<{src, startMs, volume?}>` list, and threads it
through to `exportVideo`. Inside `export.ts`, each extra track gets its
own `-i` input, an `adelay=startMs|startMs` per track, a `volume` filter,
and is amixed into the existing audio source (narration + music + extras
all mix together before loudnorm).
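The per-track filter generation can be sketched as follows. The `adelay`/`volume` filters and the 0.7 default volume follow this PR's description; the filter-graph labels and the function itself are illustrative (real wiring is in `src/export.ts`):

```typescript
interface ExtraAudioTrack { src: string; startMs: number; volume?: number; }

// Illustrative per-track ffmpeg filter generation, not the real src/export.ts.
function extraTrackFilters(tracks: ExtraAudioTrack[], firstInputIndex: number): string[] {
  return tracks.map((track, i) => {
    const input = firstInputIndex + i;
    const volume = track.volume ?? 0.7;
    // adelay takes one delay per channel, hence startMs twice for stereo.
    return `[${input}:a]adelay=${track.startMs}|${track.startMs},volume=${volume}[extra${i}]`;
  });
}
```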

* `src/composition.ts` — audio extraction + sidecar write
* `src/export.ts` — `extraAudioTracks` option + ffmpeg filter generation
* `src/pipeline.ts` — `readCompositionAudioSidecar()` + wiring
* `src/record.ts` — clean stale `.composition-audio.jsonl` between runs

Verified: composition-trio output now has h264 video + aac audio + chapters.
The blue-sweater intro music plays at 0ms, apple-money sting at 12063ms,
mixed with 0.7 default volume.

Also: `.gitignore` now covers `.agents/skills/`, `.claude/skills/`,
`skills/` (except argo-guide), `skills-lock.json`, and
`argo-showcase-captured/` since those are all hyperframes-CLI-managed
artifacts that don't belong in the repo.
demo(composition): argo-launch — Argo × Hyperframes crossover (Plan A)

Tight 15s launch teaser proving the renderComposition bridge works
end-to-end with real materials. Three-scene structure:

  1. compositions/argo-launch-intro.html (4s)
     Hand-rolled hyperframes-contract composition: gradient "Argo" title,
     "Now demoing" eyebrow, "Playwright in. Launch assets out." subtitle,
     paused GSAP master timeline at window.__timelines["intro"].
  2. Recorded Argo showcase hero (~8s)
     page.goto('/showcase.html'), spotlight + focusRing on the hero
     command, CDP-direct captures the live page through to fade out.
  3. compositions/argo-launch-outro.html (3s)
     Logo + "Demos as code." + `npx argo init` URL pill, expanding ring,
     fade-out.

Single `argo pipeline argo-launch --config demos/argo-launch.config.mjs`
invocation. Output: 15.1s mp4 with h264 video + aac audio + chapter
markers. Hyperframes contract followed exactly so a composition that
runs in hyperframes runs unchanged here.

Demonstrates the canonical bridge: composition scenes book-end a real
Playwright recording without authoring ever leaving Argo's pipeline. Next
milestone (Plan B) will replace the existing showcase intro/outro with
this same approach.
docs(skill): teach when to reach for a composition over a recording

Argo's strength is recording the real running app. But some scenes
communicate better as authored motion graphics — concepts the product
can't physically show on its own. The skill should make the agent
explicit about that choice instead of always reaching for an overlay
or extending a recording.

* Add `renderComposition` to the Core APIs table with a one-liner that
  points to the new section
* New top-level section "Mixing in Compositions (when recording isn't
  the right tool)" — when to reach for a composition (concepts,
  comparisons, brand moments, transitions) and when to stay with
  recording (real product UI, real interactions). Includes the mixed
  composition + recording pattern that bookends an Argo demo.
* New reference `references/compositions.md` — full contract, layout
  guidance (end-state-first), determinism rules, hyperframes catalog
  import, mixed-demo patterns, audio sidecar, common pitfalls.

Frames the strategic split honestly: the recording IS the demo, but
compositions frame it. Don't over-rotate to compositions; don't ignore
them either. Concrete examples: "the wedge thesis" (concept that
recording can't show), title cards / kinetic outros (recording would
be forced), side-by-side comparisons (recording shows one thing).

The example mixed demo in the skill matches Plan B's structure
(intro composition → recorded hero → optional intermediate composition
→ recorded preview → outro composition), so when a downstream agent
authors a launch demo it has a worked example to follow.
docs(skill): add hand-roll vs import-from-catalog decision rule

Extends the composition guidance with a sharper choice the agent should
make at each composition scene: is this generic polish (logo reveal,
kinetic title card, lower-third, 3D device frame) or specific to my
product (wedge thesis, custom comparison, schematic explainer of my
architecture)?

Generic polish → import from the Hyperframes catalog. Battle-tested
motion design, brand-tunable, version-pinned by their CLI. Hand-rolling
these is almost never worth it — the result will look worse at the same
effort budget.

Product-specific concepts → hand-roll a minimal composition. The catalog
can't have these; they're unique to your story.

The example most worth surfacing: instead of authoring an animated Argo
logo intro from scratch, `npx hyperframes add logo-outro` ships a polished
one in seconds. Demo authors should reach for this first for brand
moments.

Updates both the SKILL.md inline section and references/compositions.md
deep-dive so the rule is visible at both the summary and detailed levels.
shreyaskarnik merged commit 64eed98 into main on May 7, 2026
4 checks passed
