
feat: renderComposition — Argo × Hyperframes crossover bridge #15

Merged
shreyaskarnik merged 9 commits into main from feat/render-composition on May 7, 2026

Conversation

@shreyaskarnik (Owner)

Summary

Adds renderComposition — a primitive that mounts a self-contained HTML composition (Hyperframes-shaped) as a scene in an Argo demo, alongside recorded scenes from the real running app. Establishes the strategic split between the two tools and ships everything needed to use them together.

The crossover thesis: Argo records the real running app (its USP); compositions frame it (Hyperframes' USP). This branch ships the bridge that lets a single demo mix both in one timeline.

What's new

Core primitive

  • `renderComposition(page, narration, htmlPath, opts)` — loads a self-contained composition, polls for `window.__timelines[scene]` registration (matches Hyperframes' shape exactly), marks the scene, plays the timeline, holds for `data-duration`. Auto-starts `narration.startRecording` if the demo hasn't already, so the recording clock anchors at the first animated frame instead of browser launch.
  • `readCompositionDuration(htmlPath)` helper for ahead-of-time duration lookups.
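The poll-for-registration step can be sketched as a small generic helper. This is illustrative only — the real logic lives in `src/composition.ts`, and the timeout and interval values here are assumptions:

```typescript
// Illustrative poll helper, not the actual src/composition.ts implementation.
// Resolves once a getter returns a value (e.g. window.__timelines[scene]
// appearing after a block's async onReady()), or throws on timeout.
async function pollFor<T>(
  get: () => T | undefined,
  { timeoutMs = 15_000, intervalMs = 50 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = get();
    if (value !== undefined) return value;
    if (Date.now() > deadline) throw new Error('timeline never registered');
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Polling before marking the scene is what keeps GLTF/DRACO warmup out of the recorded scene time.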

Composition runtime infrastructure

  • `src/composition-server.ts` — per-scene HTTP server rooted at the project, with `<base href="/">` injection so compositions resolve sibling assets (textures, GLTF, fonts) via project-relative URLs. Replaces `file://` (chromium blocks file:// → file:// fetches for some asset types).
  • Composition contract: `data-composition-id` + `data-duration` + paused GSAP master timeline at `window.__timelines[scene]` + optional `window.__compositionReady` Promise. Identical to Hyperframes' shape so a composition that runs in Hyperframes runs unchanged in Argo.
  • CanvasDrawElement plumbing: `video.experimentalCanvasDrawElement: true` + `video.browserChannel: 'chrome-canary'` for blocks that need Hyperframes' html-in-canvas API (3D / GLTF / WebGL).
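A minimal composition satisfying the contract above might look like the following sketch. The in-tree `compositions/intro.html` is the canonical sample; the dimensions and scene name here are illustrative, and GSAP is assumed to be loaded already:

```html
<!-- Sketch of the contract only; compositions/intro.html in this branch is
     the canonical sample. -->
<div data-composition-id="intro" data-width="1920" data-height="1080" data-duration="3.4">
  <h1 class="title">Argo</h1>
</div>
<script>
  // Paused master timeline registered under the scene name; renderComposition
  // polls for this, marks the scene, then plays it.
  window.__timelines = window.__timelines || {};
  const tl = gsap.timeline({ paused: true });
  tl.from(".title", { opacity: 0, y: 24, duration: 0.8 });
  window.__timelines["intro"] = tl;
  // Optional readiness gate.
  window.__compositionReady = document.fonts.ready;
</script>
```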

Audio sidecar

  • Auto-detects `<audio>` children in compositions, writes `.argo/<demo>/.composition-audio.jsonl`, the pipeline reads it, and ffmpeg amixes each track at scene start with the TTS narration. Required because CDP screencast captures video frames only, not browser audio output.
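A minimal reader for the sidecar's JSONL shape can be sketched as below. The entry fields follow the shape documented in this PR; the function itself is illustrative (the real reader is `readCompositionAudioSidecar()` in `src/pipeline.ts`):

```typescript
import { readFileSync } from 'node:fs';

// Entry shape per the sidecar description; one JSON object per line.
interface SidecarEntry {
  scene: string;
  src: string;       // absolute path to the audio file
  startMs: number;   // scene-mark wallclock offset
  durationMs: number;
}

// Illustrative reader, not the real src/pipeline.ts implementation.
function readSidecar(path: string): SidecarEntry[] {
  return readFileSync(path, 'utf8')
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as SidecarEntry);
}
```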

Skill guidance for AI agents

  • New `references/compositions.md` (279 lines): when to reach for a composition vs stay with recording, hand-roll vs import-from-catalog decision rule, contract details, mixed-demo patterns, audio sidecar, common pitfalls.
  • SKILL.md inline section "Mixing in Compositions" + `renderComposition` in the Core APIs table.
  • Decision rule: recording is the default, compositions frame it. Generic polish (logos, titles, lower-thirds) → import from Hyperframes catalog. Product-specific concepts (wedge thesis, custom comparisons) → hand-roll.

Demos / proof

  • `compositions/intro.html` — minimal hand-rolled composition following the contract.
  • `demos/composition-intro` — single-scene proof: 3.4s with GSAP-animated content correctly captured.
  • `demos/composition-trio` — chains `apple-money-count` (5s GSAP counter) + `blue-sweater-intro-video` (12s creator intro) imported via `npx hyperframes add`. 17.2s output with both blocks rendering correctly + audio mixed in via the sidecar.
  • `demos/argo-launch` (Plan A) — 15.1s crossover teaser: hand-rolled intro composition → recorded showcase hero (with spotlight + focus ring on the live page) → hand-rolled outro composition. Single `argo pipeline` invocation, h264 + aac + chapters.
  • `demos/composition-iphone` — Hyperframes' `vfx-iphone-device` (3D iPhone + MacBook with html-in-canvas screen content + morphing glass lens) renders correctly through Argo's pipeline on Chrome Canary with the CanvasDrawElement flag wired.

What's deliberately out of scope (follow-ups)

  • Plan B: the canonical Argo launch video — replace the existing showcase's hand-rolled hero + CTA with composition scenes, keep the recorded middle. Tracked as the next milestone; it will follow the patterns this branch ships.
  • Word-level STT for narration (project_word_level_stt.md memory) — Hyperframes' website-to-hyperframes pipeline runs whisper on TTS output for word-level alignment. Argo could adopt the same pattern in v0.38 for per-word captions and composition sync to specific words.
  • Wrap Argo recording inside a 3D Hyperframes block (e.g., recording-on-iPhone-screen via `__argoVideoSrc` + a fork of `vfx-iphone-device`). Needs a Hyperframes-side PR or an Argo-shipped adapter block.

Test plan

  • `npm test` — 649 tests pass (4 new tests for `readCompositionDuration`)
  • `npm run build` clean
  • `composition-intro` demo: 3.4s mp4, GSAP content visible at midpoint
  • `composition-trio` demo: 17.2s mp4 with audio mix from both blocks (verified the `sfx-production.wav` SFX plays in the final output)
  • `argo-launch` demo: 15.1s mp4 with intro composition + recorded showcase hero + outro composition + audio
  • `composition-iphone` demo: 16s mp4 with 3D iPhone + html-in-canvas content rendered correctly on Chrome Canary
  • Hyperframes-installed assets are gitignored; install commands documented in each demo's header
  • CI matrix on chromium/firefox/webkit (compositions-only paths run on chromium; hyperframes-CLI install only happens on Canary path)

Compatibility

No breaking changes. `renderComposition` is purely additive; existing recorded demos work identically. The new config options (`experimentalCanvasDrawElement`, `browserChannel`, `outputWidth`/`outputHeight`/`watermark.scale` from earlier merges) all default to current behavior when unset.

Branch state — 9 commits stacked

```
docs(skill): add hand-roll vs import-from-catalog decision rule
docs(skill): teach when to reach for a composition over a recording
demo(composition): argo-launch — Argo × Hyperframes crossover (Plan A)
feat(composition): audio sidecar — hyperframes SFX in final mp4
demo(composition): trio of stable-chromium hyperframes blocks
chore: gitignore hyperframes-installed assets
experiment(composition): HTTP server + timeline polling + WebGL — iPhone block renders
experiment(composition): wire CanvasDrawElement flags + import hyperframes block
feat(composition): renderComposition primitive (Argo × Hyperframes bridge)
```

feat(composition): renderComposition primitive (Argo × Hyperframes bridge)

Adds the embedding seam between Argo's recording pipeline and Hyperframes-
style compositional rendering. Argo demos can now mix recorded scenes
with self-contained HTML compositions in the same narration timeline.

Composition contract follows hyperframes' shape exactly so a composition
that runs in hyperframes runs unchanged in Argo:

  * Root: <div data-composition-id="X" data-width data-height data-duration>
  * Optional `window.__compositionReady` Promise gate
  * Optional `window.__timelines[scene]` paused GSAP master timeline that
    renderComposition resumes after marking the scene

Argo additions on top of hyperframes' contract:

  * `window.__argoVideoSrc` — set by renderComposition when `opts.videoSrc`
    is provided. Compositions consume this to embed an Argo recording as a
    `<video>` child of an html-in-canvas, then texture it onto a 3D device
    frame, apply WebGL effects, etc. (the bridge that the May 4 html-in-
    canvas probe identified — recording can't be the canvas-as-layout-
    subtree, but it can be a `<video>` inside one.)

Loads via file:// URL so relative asset paths in the composition (textures,
GLTF, fonts) resolve normally — `setContent` would break every external
reference. `readCompositionDuration(htmlPath)` lets demo scripts read the
declared duration without rendering, useful in `narration.durationFor()`.
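An ahead-of-time duration lookup along these lines is one way to read `data-duration` without rendering. The regex-based parse below is a sketch, not necessarily how `src/composition.ts` does it:

```typescript
import { readFileSync } from 'node:fs';

// Illustrative duration lookup; parses the declared data-duration attribute
// without launching a browser. The real helper lives in src/composition.ts.
function readDurationFromHtml(html: string): number {
  const match = html.match(/data-duration="([\d.]+)"/);
  if (!match) throw new Error('composition declares no data-duration');
  return Number(match[1]);
}

function readCompositionDurationSketch(htmlPath: string): number {
  return readDurationFromHtml(readFileSync(htmlPath, 'utf8'));
}
```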

Ships:

  * src/composition.ts + src/index.ts export
  * compositions/intro.html — minimal contract-conforming sample
    (gradient title, subtitle, GSAP master timeline, ready signal)
  * demos/composition-intro.demo.ts proves the primitive end-to-end
  * 4 tests covering readCompositionDuration

Smoke-verified: 3.4s mp4 output matches composition's data-duration,
midpoint frame shows GSAP-animated content correctly.

Roadmap: next step is a `compositions/iphone-frame.html` that wraps an
Argo recording inside a 3D iPhone via html-in-canvas + GLTF — the move
that turns this into a real argo×hyperframes crossover.
experiment(composition): wire CanvasDrawElement flags + import hyperframes block

Stacks on the renderComposition primitive. Adds the launch-flag plumbing
needed to render hyperframes blocks that depend on the WICG html-in-canvas
API, and probes the integration against a real imported block.

Surfaces:

* `video.experimentalCanvasDrawElement: true` — emits
  `--enable-features=CanvasDrawElement` to chromium launch args
* `video.browserChannel: 'chrome-canary'` — Playwright launches a
  system-installed channel instead of bundled chromium
* `record.allowFileAccessFromFiles` (auto-enabled when the canvas-draw
  flag is on) — emits `--allow-file-access-from-files` so file://
  pages can fetch sibling GLTF / textures

Imported the hyperframes `vfx-iphone-device` block via
`npx hyperframes add` (Node 22+ for the CLI; Argo runtime stays Node 20):
3D iPhone + MacBook GLTF models with html-in-canvas screen content +
camera choreography. Files committed under `compositions/` and `models/`.

Status: simple compositions render fully (tests/composition-intro pipeline
produces correct GSAP-animated output at 3.4s). The 3D iPhone block
renders to black frames — the block's API gate also checks for canvas
`layoutSubtree` attribute support (not just `drawElementImage`), and
likely also needs WebGL launch flags. Findings captured in the
project_html_in_canvas_watch.md memory note.

Next plumbing layer to make 3D blocks render: WebGL launch flags
(`--use-gl=angle --use-angle=swiftshader --enable-webgl
--ignore-gpu-blacklist`, matching Argo's existing shader-render), and
likely a temporary HTTP server for composition rendering instead of
relying on file:// URLs.
experiment(composition): HTTP server + timeline polling + WebGL — iPhone block renders

Layers atop the previous wire-up commit. Final result: hyperframes' real
`vfx-iphone-device` block (3D iPhone + MacBook GLTF + html-in-canvas screen
content + morphing glass lens) renders correctly through Argo's recording
pipeline.

Changes:

* `src/composition-server.ts` (new): per-scene HTTP server rooted at the
  project so compositions in `compositions/` resolve sibling assets in
  `models/`, `assets/`, etc. via path-relative URLs. Injects
  `<base href="/">` into the served HTML so the composition needs no
  modification. Replaces the prior `file://` URL approach (chromium
  blocks file:// fetches of sibling resources for some asset types).
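The `<base href="/">` injection amounts to a small transform over the served HTML. A sketch, assuming the tag is inserted immediately after `<head>` (the actual implementation is in `src/composition-server.ts`):

```typescript
// Illustrative <base href="/"> injection, not the real composition-server code.
// Inserting right after <head> ensures the base precedes any relative asset
// references (textures, GLTF, fonts) in the composition.
function injectBaseHref(html: string): string {
  return html.replace(/<head([^>]*)>/i, '<head$1><base href="/">');
}
```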

* `src/composition.ts`: poll for `window.__timelines[scene]` BEFORE
  marking the scene + playing. Hyperframes blocks register the timeline
  inside an async `onReady()` after GLTF/DRACO loads complete (5-12s).
  Marking before the timeline is registered burned that warmup as black
  scene time. Now the scene's first frame is the composition's first
  animated frame. `renderComposition` also calls `narration.startRecording`
  itself if the demo hasn't already, so the recording clock anchors at
  the readiness boundary instead of capturing browser launch.

* `src/narration.ts`: tiny `isRecording` getter so `renderComposition`
  can decide whether to call `startRecording` itself.

* `src/record.ts`: WebGL launch flags
  (`--use-gl=angle --use-angle=swiftshader --enable-webgl
  --ignore-gpu-blocklist`) auto-enabled together with
  `experimentalCanvasDrawElement`. Without these, headless Three.js
  in Canary doesn't initialize and the scene renders black.

Verified against the imported `vfx-iphone-device` block (installed via
`npx hyperframes add vfx-iphone-device` against Node 22's bundled CLI):
captured frames show the rotating 3D iPhone with HyperFrames UI on the
screen, floating morphing glass lens, dynamic lighting, floor reflection.

Remaining blank head (~5-7s after recording starts) is inside the
hyperframes block's own warmup — Three.js environment map + PMREM
cubemap prerender + initial GSAP frames before the camera reaches a
visible pose. Argo can't shave that further from outside; a clean fix
needs the block to expose `window.__compositionReady` resolving on
first-visible-frame instead of asset-load-complete. Worth a hyperframes
contribution.

Findings + corrected analysis captured in
project_html_in_canvas_watch.md.
chore: gitignore hyperframes-installed assets

Track the install command, not the artifacts. The hyperframes CLI is the
source of truth for these files and version-pins them; carrying our own
copy would fork the upstream.

* .gitignore now excludes models/, compositions/vfx-*, compositions/experimental-*
* demos/composition-iphone.demo.ts has the one-line install command in
  its header, including the Node 22 PATH override needed for the
  hyperframes CLI.
demo(composition): trio of stable-chromium hyperframes blocks

Validates `renderComposition` against three real catalog blocks that
don't need html-in-canvas / WebGL / Canary:

* `compositions/intro.html` (hand-rolled baseline, 3.4s) — already in tree
* `apple-money-count` (5s) — GSAP-only finance counter $0→$10K
* `blue-sweater-intro-video` (12s) — multi-stage AI creator intro card

Full crossover verified on stable chromium: each block renders correctly
through Argo's recording pipeline, durations match `data-duration`
exactly, both blocks chain back-to-back in a single demo (17.2s total).

Hyperframes-installed assets are gitignored:
  - models/ (GLTF + textures)
  - compositions/{apple-*,blue-sweater-*,vfx-*,experimental-*,components/}
  - assets/{joe-sai-avatar.png,sfx-production.wav,sfx/}

Install commands are in each demo's header. The hyperframes CLI versions
these assets with its own releases; pinning copies here would fork upstream.
feat(composition): audio sidecar — hyperframes SFX in final mp4

CDP screencast captures video frames only, not browser audio output, so
hyperframes blocks that play sound effects through `<audio>` children
(apple-money-count's sfx-production.wav, blue-sweater's intro music, etc.)
landed silently in Argo's exports.

Sidecar approach: `renderComposition` queries the loaded composition for
all `<audio>` elements, resolves their `src` attribute to absolute paths
via the composition server's serverRoot, and appends one entry per track
to `.argo/<demo>/.composition-audio.jsonl`:

  { scene, src: <abs path>, startMs: <scene mark wallclock>, durationMs }

The pipeline reads the sidecar after recording, builds an
`extraAudioTracks: Array<{src, startMs, volume?}>` list, and threads it
through to `exportVideo`. Inside `export.ts`, each extra track gets its
own `-i` input, an `adelay=startMs|startMs` per track, a `volume` filter,
and is amixed into the existing audio source (narration + music + extras
all mix together before loudnorm).
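The per-track filter generation can be sketched as follows. The `adelay`/`volume` filters and the 0.7 default volume follow this PR's description; the filter-graph labels and the function itself are illustrative (real wiring is in `src/export.ts`):

```typescript
interface ExtraAudioTrack { src: string; startMs: number; volume?: number; }

// Illustrative per-track ffmpeg filter generation, not the real src/export.ts.
function extraTrackFilters(tracks: ExtraAudioTrack[], firstInputIndex: number): string[] {
  return tracks.map((track, i) => {
    const input = firstInputIndex + i;
    const volume = track.volume ?? 0.7;
    // adelay takes one delay per channel, hence startMs twice for stereo.
    return `[${input}:a]adelay=${track.startMs}|${track.startMs},volume=${volume}[extra${i}]`;
  });
}
```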

* `src/composition.ts` — audio extraction + sidecar write
* `src/export.ts` — `extraAudioTracks` option + ffmpeg filter generation
* `src/pipeline.ts` — `readCompositionAudioSidecar()` + wiring
* `src/record.ts` — clean stale `.composition-audio.jsonl` between runs

Verified: composition-trio output now has h264 video + aac audio + chapters.
The blue-sweater intro music plays at 0ms, apple-money sting at 12063ms,
mixed with 0.7 default volume.

Also: `.gitignore` now covers `.agents/skills/`, `.claude/skills/`,
`skills/` (except argo-guide), `skills-lock.json`, and
`argo-showcase-captured/` since those are all hyperframes-CLI-managed
artifacts that don't belong in the repo.
demo(composition): argo-launch — Argo × Hyperframes crossover (Plan A)

Tight 15s launch teaser proving the renderComposition bridge works
end-to-end with real materials. Three-scene structure:

  1. compositions/argo-launch-intro.html (4s)
     Hand-rolled hyperframes-contract composition: gradient "Argo" title,
     "Now demoing" eyebrow, "Playwright in. Launch assets out." subtitle,
     paused GSAP master timeline at window.__timelines["intro"].
  2. Recorded Argo showcase hero (~8s)
     page.goto('/showcase.html'), spotlight + focusRing on the hero
     command, CDP-direct captures the live page through to fade out.
  3. compositions/argo-launch-outro.html (3s)
     Logo + "Demos as code." + `npx argo init` URL pill, expanding ring,
     fade-out.

Single `argo pipeline argo-launch --config demos/argo-launch.config.mjs`
invocation. Output: 15.1s mp4 with h264 video + aac audio + chapter
markers. Hyperframes contract followed exactly so a composition that
runs in hyperframes runs unchanged here.

Demonstrates the canonical bridge: composition scenes book-end a real
Playwright recording without authoring ever leaving Argo's pipeline. Next
milestone (Plan B) will replace the existing showcase intro/outro with
this same approach.
docs(skill): teach when to reach for a composition over a recording

Argo's strength is recording the real running app. But some scenes
communicate better as authored motion graphics — concepts the product
can't physically show on its own. The skill should make the agent
explicit about that choice instead of always reaching for an overlay
or extending a recording.

* Add `renderComposition` to the Core APIs table with a one-liner that
  points to the new section
* New top-level section "Mixing in Compositions (when recording isn't
  the right tool)" — when to reach for a composition (concepts,
  comparisons, brand moments, transitions) and when to stay with
  recording (real product UI, real interactions). Includes the mixed
  composition + recording pattern that bookends an Argo demo.
* New reference `references/compositions.md` — full contract, layout
  guidance (end-state-first), determinism rules, hyperframes catalog
  import, mixed-demo patterns, audio sidecar, common pitfalls.

Frames the strategic split honestly: the recording IS the demo, but
compositions frame it. Don't over-rotate to compositions; don't ignore
them either. Concrete examples: "the wedge thesis" (concept that
recording can't show), title cards / kinetic outros (recording would
be forced), side-by-side comparisons (recording shows one thing).

The example mixed demo in the skill matches Plan B's structure
(intro composition → recorded hero → optional intermediate composition
→ recorded preview → outro composition), so when a downstream agent
authors a launch demo it has a worked example to follow.
docs(skill): add hand-roll vs import-from-catalog decision rule

Extends the composition guidance with a sharper choice the agent should
make at each composition scene: is this generic polish (logo reveal,
kinetic title card, lower-third, 3D device frame) or specific to my
product (wedge thesis, custom comparison, schematic explainer of my
architecture)?

Generic polish → import from the Hyperframes catalog. Battle-tested
motion design, brand-tunable, version-pinned by their CLI. Hand-rolling
these is almost never worth it — the result will look worse at the same
effort budget.

Product-specific concepts → hand-roll a minimal composition. The catalog
can't have these; they're unique to your story.

The example most worth surfacing: instead of authoring an animated Argo
logo intro from scratch, `npx hyperframes add logo-outro` ships a polished
one in seconds. Demo authors should reach for this first for brand
moments.

Updates both the SKILL.md inline section and references/compositions.md
deep-dive so the rule is visible at both the summary and detailed levels.
shreyaskarnik merged commit 64eed98 into main on May 7, 2026
4 checks passed
