feat(cra): presigned S3 URLs + progress tracking #538

Open
jirkamotejl wants to merge 26 commits into master from feature/cra-encoding-improvements

@jirkamotejl
Contributor

Summary

  • Presigned S3 URLs: Encoder no longer downloads video to local disk or uploads via SFTP. CRA fetches video directly from S3 via presigned URL (7-day expiry). Only the XML manifest is uploaded via SFTP.
  • Two-phase encoding: When encoder_sd_profile_group is defined (e.g. "VoDSD"), CreateMediaJob submits two manifests with the same refId — SD first, then HD (VoDHDauto). SD completion triggers subtitle transcription. Backward compatible: single-phase when encoder_sd_profile_group is nil.
  • Smarter progress tracking: CheckProgressJob parses CRA messages array for per-phase milestones (validation, audio, video, thumbnails, packaging), extracts video duration, and estimates completion time. New states: sd_processing → sd_processed → hd_processing → full_media_processed.

Remaining tasks (separate PRs)

  • MonitorProcessingJob: two-phase state awareness + dynamic timeouts (Task 4)
  • FileProcessing concern: sd_ready? / sd_mp4_url helpers (Task 5)
  • Integration test for full two-phase lifecycle (Task 6)
  • Economia override: encoder_sd_profile_group returning "VoDSD" in production (separate repo)

Test plan

  • Encoder tests: presigned URL manifest + fallback to file attribute (2 tests)
  • CreateMediaJob tests: two-phase submission + single-phase backward compat (3 tests)
  • CheckProgressJob tests: SD→HD transitions, message parsing, ETA, single-phase DONE, encoding generation (8 tests)
  • MonitorProcessingJob existing tests still pass (10 tests)
  • All 23 CRA tests pass
  • Economia override + video_test.rb (separate PR after folio gem update)
  • Staging environment smoke test with real CRA API

Instead of downloading the video to the server and re-uploading via
SFTP, the ingest manifest now includes a presigned S3 URL. CRA
downloads the video directly from S3, eliminating disk/memory pressure
on the application server.

Also adds safety guards for destroying files that were never sent to
CRA (early return in DeleteMediaJob and file_processing concern).

@jirkamotejl force-pushed the feature/cra-encoding-improvements branch from 9931257 to 31dce63 on February 27, 2026 08:27
@jirkamotejl changed the title from "feat(cra): presigned S3 URLs + two-phase encoding (SD→HD)" to "feat(cra): presigned S3 URLs + progress tracking" on Feb 27, 2026

- Parse CRA encoding messages for phase-level progress (validation,
  audio, video, thumbnails, packaging) and estimate completion time
- Extract JobResolver to deduplicate job resolution logic across
  CreateMediaJob, CheckProgressJob, and MonitorProcessingJob
- Fix CheckProgressJob to properly save and broadcast on FAILED status
- Guard process_output_hash against nil HLS/DASH entries
- Set progress_percentage to 100.0 on DONE for consistent final state
- Add dedicated encoding_progress MessageBus event for UI updates
- Add processing_failed locale translations
Listen for the dedicated CraMediaCloud encoding_progress MessageBus
event (not the generic file_update) to avoid reacting to unrelated
file changes like in-place input edits or metadata extraction.

During processing: update state label text inline (no page reload).
On state change (e.g. processing -> ready): full Turbo reload.

@jirkamotejl force-pushed the feature/cra-encoding-improvements branch from 31dce63 to 8ef7294 on February 27, 2026 09:05

- Add encoding_completed_at timestamp to CheckProgressJob on DONE
- New EncodingInfoComponent in file detail meta bar showing progress,
  phases, ETA during encoding and duration after completion
- Expand encoding_progress broadcast with progress/phases/ETA data
- Live-update encoding info via MessageBus without page reload
- Fix stale state badge: listen for file_update events, Turbo reload
  frame only when aasm_state actually changes
- Clean up FileSerializer: remove progress % hack from aasm_state_human
- Move meta-item wrapper inside component template so it doesn't
  render an empty div when component is hidden
- Render component during all CRA processing, not just when
  progress_percentage is already present
- Handle nil progress_percentage in template
Render encoding info as an inline span next to the state label instead of
a separate meta-item. Shows as "Zpracováváno — 45.1% · audio · ~16 min"
(Czech for "Processing", followed by percentage, phase, and ETA).
- Translate encoding phases (validace, audio, náhledy, video, balení)
- ETA shows as a live countdown "zbývá ~9:32" ("remaining ~9:32") ticking every second
- Countdown refreshes on each MessageBus progress update
- Phase translations and remaining label passed via data attributes
- Remove encoding phases display (technical detail, not useful for users)
- Interpolate progress percentage between 15s MessageBus updates
  for smooth continuous movement
- ETA countdown ticks every second: "zbývá ~9:32"
- Progress and ETA share a single 1s ticker interval
Track displayed interpolated value and use Math.max on server update
so progress never visually decreases.
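
The frontend applies this clamp in JavaScript with Math.max. As a minimal sketch of the same idea in Ruby (the `DisplayedProgress` class and `apply_server_update` method are hypothetical names, not code from this PR), the displayed value only moves when the server value is ahead of it:

```ruby
# Hypothetical tracker: keeps the displayed (interpolated) percentage and
# never lets a server update move it backwards.
class DisplayedProgress
  attr_reader :value

  def initialize
    @value = 0.0
  end

  # Take the larger of the currently displayed value and the server value,
  # so a server value that lags behind the interpolation is ignored.
  def apply_server_update(server_value)
    @value = [@value, server_value.to_f].max
  end
end

tracker = DisplayedProgress.new
tracker.apply_server_update(40.0)
tracker.apply_server_update(35.0) # displayed value stays at 40.0
```
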
Move all encoding progress interpolation and ETA countdown logic from
show_component.js into encoding_info_component.js. The ticker JS now
only loads when the encoding info component is actually rendered.
Also remove unused phases_completed from broadcast payload.
Remove ENCODING_RATIO constant (was 0.35, actual ~0.70) and compute
ETA purely from elapsed time and CRA progress fraction.
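
A minimal sketch of the elapsed/progress extrapolation described above, assuming progress is a fraction between 0 and 1 and that encoding advances at a roughly constant rate. The method name `estimate_remaining_seconds` is hypothetical:

```ruby
# Estimate remaining seconds from elapsed time and a 0.0..1.0 progress fraction.
# Returns nil when there is no progress yet, because no estimate is possible.
def estimate_remaining_seconds(elapsed_seconds, progress_fraction)
  return nil if progress_fraction.nil? || progress_fraction <= 0.0
  return 0.0 if progress_fraction >= 1.0

  # If `progress_fraction` of the work took `elapsed_seconds`,
  # the whole job is projected to take elapsed / fraction.
  projected_total = elapsed_seconds / progress_fraction.to_f
  projected_total - elapsed_seconds
end

remaining = estimate_remaining_seconds(120.0, 0.25) # 25% done after 2 minutes
```

With 25% done after 120 seconds, the projected total is 480 seconds, leaving 360 seconds.
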
…l initial state, pause interpolation when ahead
…s, env-unique reference IDs, improve progress UI

- Handle CRA REMOVED status in CheckProgressJob to stop infinite polling
- Add environment prefix to reference IDs to prevent cross-env collisions
- Prevent MonitorProcessingJob from creating duplicate CreateMediaJobs
- Add REMOVED and WAITING to JobResolver STATUS_MAP
- Anchor-based ETA calculation using CRA phase field
- Mock progress 0-25% during phase 0 based on file size
- Stale detection (45s threshold) freezes progress when backend stops updating
- CSS :has() selector pulses processing state dot
- Reduce initial CheckProgressJob delay from 30s to 10s
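
The stale check in the list above can be sketched as follows. The 45-second threshold comes from the commit message; the constant and method names are hypothetical, and the sketch is in Ruby although the check described here may live in the frontend:

```ruby
# Threshold from the commit message: no update for more than 45 seconds
# means progress should be frozen rather than interpolated further.
STALE_THRESHOLD_SECONDS = 45

# Returns true when the last backend update is older than the threshold.
# `now` is injectable so the check can be tested deterministically.
def stale?(last_update_at, now: Time.now)
  (now - last_update_at) > STALE_THRESHOLD_SECONDS
end

stale?(Time.now - 60) # last update a minute ago
```
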
Remap CRA's nonlinear 0→1 progress to user-friendly 0→100 scale
(25→90% for encoding phase), replace anchor-based ETA with simple
elapsed/progress extrapolation, handle VALIDATING status, add
last_progress_check_at for stale detection, simplify frontend by
removing progressSlowCrawl logic.
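
A minimal sketch of the remapping, assuming a plain linear mapping of CRA's 0→1 value onto the 25→90% display band (the actual curve in this PR may differ). `DISPLAY_ENCODING_START` is named later in this thread; `DISPLAY_ENCODING_END` and `display_progress` are hypothetical names:

```ruby
DISPLAY_ENCODING_START = 25.0 # display percentage when encoding begins
DISPLAY_ENCODING_END   = 90.0 # hypothetical name: display percentage when encoding ends

# Map a CRA progress fraction (0.0..1.0) onto the display band.
# Out-of-range input is clamped so the result stays within 25..90.
def display_progress(cra_progress)
  fraction = cra_progress.to_f.clamp(0.0, 1.0)
  band = DISPLAY_ENCODING_END - DISPLAY_ENCODING_START
  (DISPLAY_ENCODING_START + fraction * band).round(1)
end

display_progress(0.5) # halfway through encoding
```
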
Mock progress uses hyperbolic curve (asymptotically approaches 25%
but never reaches it) instead of linear ramp. ETA clamp ensures
estimated_completion_at can only shrink, never push further out.
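
One way to get a curve that approaches the cap without reaching it is the hyperbola cap · t / (t + k). The sketch below assumes that form; the exact formula in the PR is not shown here, and `mock_progress` and `half_time_seconds` are hypothetical names (the real curve may also scale with file size):

```ruby
MOCK_PROGRESS_CAP = 25.0 # asymptote in percent, per the commit message

# Hyperbolic ramp: 0 at t = 0, half the cap at t = half_time_seconds,
# and strictly below the cap for any finite elapsed time.
def mock_progress(elapsed_seconds, half_time_seconds: 60.0)
  t = [elapsed_seconds.to_f, 0.0].max
  MOCK_PROGRESS_CAP * t / (t + half_time_seconds)
end

mock_progress(60.0) # half of the cap after one half-time
```
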
Replace strict "never grow" clamp with a padding multiplier that
starts at ~1.5x when little data is available and shrinks toward
1.0x as progress advances. Produces naturally shrinking estimates.
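
A minimal sketch of such a padding multiplier, assuming it falls linearly from 1.5x at zero progress to 1.0x at completion. The linear shape and the names `MAX_ETA_PADDING`, `eta_padding`, and `padded_eta_seconds` are assumptions for illustration:

```ruby
MAX_ETA_PADDING = 1.5 # multiplier applied when almost no progress data exists

# Linear falloff: 1.5 at progress 0.0, 1.0 at progress 1.0.
def eta_padding(progress_fraction)
  fraction = progress_fraction.to_f.clamp(0.0, 1.0)
  1.0 + (MAX_ETA_PADDING - 1.0) * (1.0 - fraction)
end

# Pad a raw remaining-time estimate so early estimates err on the long side.
def padded_eta_seconds(raw_remaining_seconds, progress_fraction)
  raw_remaining_seconds * eta_padding(progress_fraction)
end

padded_eta_seconds(400.0, 0.5) # 400 s raw estimate at 50% progress
```

Because the multiplier shrinks as progress grows, the padded estimate tends to come down over time without a hard clamp.
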
DISPLAY_ENCODING_START was 25% while frontend MOCK_PROGRESS_CAP was
30%, causing potential backward jumps when server value arrived below
mock. Both now start at 30%.
Don't set progress_percentage until CRA status is PROCESSING — keeps
frontend in mock mode during WAITING/CREATED/VALIDATING instead of
jumping to 30% and stalling. Smooth ETA with exponential moving
average (70/30 blend) to prevent oscillation from CRA jitter.
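
A minimal sketch of the exponential moving average, assuming the 70/30 blend weights the previous estimate at 0.7 and the new one at 0.3 (the commit message does not say which side gets which weight). `smooth_eta` and the constant name are hypothetical:

```ruby
PREVIOUS_ETA_WEIGHT = 0.7 # assumption: 70% previous estimate, 30% new estimate

# Blend the new remaining-seconds estimate into the previous one.
# The first estimate is used as-is because there is nothing to blend with.
def smooth_eta(previous_seconds, new_seconds)
  return new_seconds.to_f if previous_seconds.nil?

  PREVIOUS_ETA_WEIGHT * previous_seconds + (1.0 - PREVIOUS_ETA_WEIGHT) * new_seconds
end

eta = smooth_eta(nil, 600.0)  # first sample
eta = smooth_eta(eta, 500.0)  # a jittery drop is damped to roughly 570 s
```
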
…ress, interpolation

Show only what we know: current phase (waiting/encoding/packaging) and
raw CRA progress during encoding. No JS tickers, no mock curves, no
ETA calculations.
Add `unique :until_and_while_executing` (same pattern as Mux) to prevent
duplicate jobs in queue. Wrap state updates with `with_lock` to prevent
concurrent writes to remote_services_data.