Skip to content

Added remote feature-flag override support to the labs service#28251

Draft
jonatansberg wants to merge 5 commits into
mainfrom
feature/remote-feature-flags
Draft

Added remote feature-flag override support to the labs service#28251
jonatansberg wants to merge 5 commits into
mainfrom
feature/remote-feature-flags

Conversation

@jonatansberg
Copy link
Copy Markdown
Member

Summary

Adds an optional remote override layer to the labs feature-flag service. When configured, Ghost periodically fetches a small JSON manifest of flag overrides from a URL and overlays it on the local flag state, so flag values can be changed for a managed instance without cutting a release.

Off by default and fully inert unless remoteFlags is configured with a URL (and the instance has a hostSettings:siteId), so self-hosted and development installs are completely unaffected.

How it works

  • Resolver (services/remote-flags/resolve.js): a pure function that turns a sparse JSON manifest into a flat {flag: boolean} map for this instance. Each entry is a bare boolean (full override) or {value, percent} (a deterministic, stable percentage ramp). Validated with Zod; only flags already known to labs are honored; invalid or unknown entries are skipped individually.
  • Overlay (shared/labs.js): getAll() applies the resolved overrides at precedence config.labs > remote > GA_FEATURES > DB settings. An explicit config.labs pin always wins, and members is never remote-overridable.
  • Poller (services/remote-flags/): a fail-open background poller (conditional GET with ETag, last-known-good on error, empty on 404, never throws out of the loop). Started fire-and-forget from boot and gated entirely behind the remoteFlags config block.

Safety

  • Off by default; gated on explicit config plus a site id.
  • Fail-open: a missing, unreachable, or invalid manifest never changes flag state in a way that can break a site — it falls back to local flag state.
  • config.labs stays the unbeatable backstop, so a locally pinned flag cannot be overridden remotely.
  • Only flags already defined in labs.js can be targeted; defining a brand-new flag still requires a release.

Tests

Unit tests cover the resolver (including a pinned golden bucket value to guard ramp stability), the labs overlay precedence, the poller fail-open/caching/timer behaviour, and the config gating — plus an end-to-end test that a remote kill of a GA flag flips it off through the real labs.getAllFlags() set.

jonatansberg and others added 5 commits May 29, 2026 14:49
- a pure, dependency-free function that resolves a sparse JSON manifest of flag
  overrides into a flat {flag: boolean} map for a single instance, given the
  instance's site id and the set of known flags
- supports bare-boolean full overrides and {value, percent} ramps bucketed by a
  stable md5(flag:siteId) % 100 hash, so raising a percent only ever adds sites
- only honors keys in the supplied known-flags list and skips malformed/unknown
  entries individually: one bad entry can't discard the manifest, and the
  manifest can never introduce a flag core does not define
- never throws on hostile input (null options, non-array knownFlags, non-scalar
  siteId); a golden bucket value is pinned in tests to guard ramp stability
- standalone and inert until wired into the labs service in a later commit

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- labs.getAll() now overlays a remote override map between GA_FEATURES and
  config.labs, giving precedence config.labs > remote > GA_FEATURES > DB settings
- a remote entry can kill a GA-forced flag or beat a DB value, but an explicit
  config.labs pin still wins, and members is recomputed after the overlay so
  remote can never change it
- the store is module-global, shallow-copied on set so callers can't mutate live
  flag state, and fully inert when empty: no behaviour change until a later commit
  pushes resolved overrides in via setRemoteOverrides
- GA kill-switches and members cannot persist into the DB because the settings
  input serializer filters writes to WRITABLE_KEYS_ALLOWLIST (disjoint from GA)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- a fail-open background poller that fetches a JSON manifest over HTTP, resolves
  it for this instance, and pushes the result into labs via an injected sink
- conditional GET (ETag/If-None-Match); keeps last-known-good on network error,
  bad status, or unparseable body; applies empty on 404; never throws out of the
  poll loop, with a defensive backstop around resolve/apply
- the ETag is committed only after a manifest is successfully applied, so an
  apply failure cannot poison the cache into a 304 that skips the retry
- emits a structured remote_flags.applied event only when the resolved set
  changes, keyed on a sorted-key serialization so a mere key reorder does not
  emit a burst of spurious logs
- self-scheduling jittered poll with an unref'd timer, idempotent start, a single
  outstanding timer, and a re-entrancy guard so refreshes never race the ETag
- HTTP client, known-flags source, timing and randomness are injected so the
  fetch/cache/fail-open logic is fully unit-tested without real I/O or timers
- not wired into boot yet; inert until constructed and started

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- adds the remoteFlags config block (default enabled:false, url:null), a gated
  service index, and a hook in initBackgroundServices
- opt-in: init() constructs and starts the poller only when remoteFlags.enabled
  is true with a valid url and a hostSettings:siteId is present; otherwise it
  returns null and the feature is a complete no-op, so self-hosted and dev
  installs are untouched
- the manifest url is validated once at start so a misconfigured value fails
  loudly instead of warning on every poll
- start() is fire-and-forget so boot is never blocked on (or failed by) the first
  manifest fetch; the service is fail-open
- adds an end-to-end test that a remote kill of a GA flag flips it off through
  the actual labs.getAllFlags() set, guarding the kill-switch contract

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…knobs

- the resolver now validates each manifest entry with a Zod schema
  (bool | {value, percent?}) instead of hand-rolled type checks, aligning with
  the rest of the codebase (zod@4 is already used across apps). The known-flags
  list is still supplied by the labs service (no second list), and invalid or
  unknown entries are still skipped per-entry
- behaviour is unchanged for valid JSON; the only difference is that a non-finite
  or non-numeric percent (Infinity/NaN/null/string) is now rejected and skipped
  rather than coerced, which is more correct and cannot occur in real JSON
- removed the pollInterval/jitter config reads in the service index: they were
  never set anywhere, so they were dead knobs that looked tunable but were not.
  The service keeps its constructor params (used by tests) and built-in defaults
- fixed a garbled doc comment on stop()

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d2a0e065-df44-4a4d-aa27-ad8adae7b4d3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/remote-feature-flags

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant