Context
Issue #3 defined a full implementation framework for priority-based concurrency throttling. PR #4 delivered the load detection and global throttle foundation as "Phase 1." This issue covers the remaining work needed to complete that vision.
What Phase 1 delivered (PR #4)
- Load probing (Threads_running + AS queue depth) with normal/elevated/critical classification
- Hysteresis with separate enter/exit thresholds and minimum dwell time
- Global AS queue-runner throttling (batch size, time limit, concurrent batches)
- Four-stage rollout (off → test_observe → observe → enforce)
- Cross-request state persistence (object cache → APCu → option fallback)
- Structured logging for all throttle events and load transitions
What Phase 1 cannot do
Distinguish between action types. Under load, all AS work slows equally — a NoFraud payment webhook gets the same treatment as a Facebook catalog sync. The May 1 incident needed the opposite: keep payments flowing, pause deferrable syncs.
Progressive implementation plan
Each wave is a shippable increment that provides value on its own. Waves A and C have no prerequisites; Wave B requires Wave A, and each Wave D item follows the component it manages.
Wave A — Refactor & registry (no behavior change, safe to deploy anytime)
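Structural prep work. No runtime behavior changes, so this can ship without a rollout plan.
- Load monitor extraction (class-hcqg-load-monitor.php): refactor only, per #12. Moves existing probe and level-evaluation logic into its own class. Zero behavior change. #20
- Priority registry (class-hcqg-priority-registry.php): map AS hook names to priority tiers (critical/high/normal/deferrable) with wildcard/prefix matching and a filterable default registry. Ships inert until Wave B wires it into throttle decisions. #21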
Wave B — Per-action throttling (core feature, needs rollout plan)
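The primary deliverable. Requires Wave A. Should follow the same observe → enforce rollout pattern as Phase 1.
- Per-action throttle: hook action_scheduler_before_execute to skip/defer individual actions based on the priority × load level decision matrix. Deferred actions are rescheduled, never deleted. Includes cooldown ramp after load normalizes. #22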
Wave C — Thundering herd prevention (independent, can ship in any order)
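Completely standalone, with no dependency on Waves A or B. Addresses a different failure mode (N concurrent requests triggering the same expensive operation).
- Mutex guard (class-hcqg-mutex-guard.php): lock mechanism so only one request runs an expensive operation at a time. acquire_lock() / release_lock() / force_release() API. Uses wp_options with autoload = no. #27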
Wave D — Operator tooling (additive, ship as components land)
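Each item here can ship as soon as the component it manages exists. No need to wait for everything.
- WP-CLI throttle commands: wp queryguard throttle status/pause/resume/history. Can ship after Wave B.
- WP-CLI mutex commands: wp queryguard mutex list/release. Can ship after Wave C.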
Component details
Priority Registry (Wave A)
Map Action Scheduler hook names to priority tiers. Stored as a WordPress option, editable in admin (Wave D), with a filter for code-based overrides.
Default registry:
| Tier | Hook patterns | Behavior under load |
| --- | --- | --- |
| Critical | nofraud_*, woocommerce_payment_*, wc_payment_* | Always runs |
| High | woocommerce_scheduled_subscription_*, wcs_*, woocommerce_deliver_webhook_* | Delayed 5-10 min during critical load |
| Normal | woocommerce_run_*, action_scheduler_* | Deferred +15 min during critical, +5 min during elevated |
| Deferrable | facebook_for_woocommerce_*, wc_facebook_*, shipstation_*, klaviyo_*, woocommerce_flush_* | Paused until load normalizes |
Public API: get_priority( $hook_name ) returns tier string. Supports prefix/wildcard matching. Unregistered hooks default to normal.
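The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the class name HCQG_Priority_Registry_Sketch, the array shape, and the first-match-wins ordering are assumptions, and the registry is hard-coded here instead of being read from a WordPress option or passed through a filter. It uses PHP's built-in fnmatch() for the wildcard patterns.

```php
<?php
// Hypothetical sketch of the priority registry lookup.
// The class name and array layout are illustrative assumptions.
final class HCQG_Priority_Registry_Sketch
{
    // Default registry mirroring the table above, ordered from most
    // to least important so that the first matching tier wins.
    private const DEFAULT_REGISTRY = [
        'critical'   => ['nofraud_*', 'woocommerce_payment_*', 'wc_payment_*'],
        'high'       => ['woocommerce_scheduled_subscription_*', 'wcs_*', 'woocommerce_deliver_webhook_*'],
        'normal'     => ['woocommerce_run_*', 'action_scheduler_*'],
        'deferrable' => ['facebook_for_woocommerce_*', 'wc_facebook_*', 'shipstation_*', 'klaviyo_*', 'woocommerce_flush_*'],
    ];

    /** @var array<string, string[]> tier => list of wildcard patterns */
    private array $registry;

    public function __construct(?array $registry = null)
    {
        $this->registry = $registry ?? self::DEFAULT_REGISTRY;
    }

    // Returns the tier for a hook name; unregistered hooks default to 'normal'.
    public function get_priority(string $hook_name): string
    {
        foreach ($this->registry as $tier => $patterns) {
            foreach ($patterns as $pattern) {
                // fnmatch() treats '*' as "any run of characters".
                if (fnmatch($pattern, $hook_name)) {
                    return $tier;
                }
            }
        }
        return 'normal';
    }
}

// Usage: the hook names below are made-up examples.
$registry = new HCQG_Priority_Registry_Sketch();
$tier_for_payment = $registry->get_priority('nofraud_check_order');   // 'critical'
$tier_for_unknown = $registry->get_priority('some_unregistered_hook'); // 'normal'
```

Checking tiers from critical down to deferrable means a hook that happens to match patterns in two tiers resolves to the more important one, which seems the safer default for payment-related work.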
Throttle Engine — per-action decisions (Wave B)
Hook action_scheduler_before_execute to apply the priority × load decision matrix:
| Load level | Critical tier | High tier | Normal tier | Deferrable tier |
| --- | --- | --- | --- | --- |
| Normal | Run | Run | Run | Run |
| Elevated | Run | Run | Defer +5 min | Defer +15 min |
| Critical | Run | Defer +5 min | Defer +15 min | Defer +60 min |
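The matrix above can be expressed as a plain lookup table. The sketch below is illustrative only: the constant and function names are hypothetical, delays are stored in seconds, and the choice to run an action when the load level or tier is unrecognized (fail open) is an assumption, not something the spec states.

```php
<?php
// Hypothetical encoding of the priority x load decision matrix.
// Outer key: load level. Inner key: priority tier. Value: deferral in seconds (0 = run now).
const HCQG_SKETCH_DEFER_SECONDS = [
    'normal'   => ['critical' => 0, 'high' => 0,   'normal' => 0,   'deferrable' => 0],
    'elevated' => ['critical' => 0, 'high' => 0,   'normal' => 300, 'deferrable' => 900],
    'critical' => ['critical' => 0, 'high' => 300, 'normal' => 900, 'deferrable' => 3600],
];

// Returns ['action' => 'run'|'defer', 'delay' => seconds].
function hcqg_sketch_decide(string $load_level, string $tier): array
{
    // Assumption: unknown load levels or tiers fail open and run immediately.
    $delay = HCQG_SKETCH_DEFER_SECONDS[$load_level][$tier] ?? 0;

    return $delay === 0
        ? ['action' => 'run', 'delay' => 0]
        : ['action' => 'defer', 'delay' => $delay];
}

// Usage: a deferrable sync under critical load is pushed back by an hour.
$decision = hcqg_sketch_decide('critical', 'deferrable');
```

In the plugin, a callback on action_scheduler_before_execute would presumably combine this decision with the hook's tier from the registry and reschedule the action when the result is a deferral.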
Deferred actions are rescheduled, never deleted. After load returns to normal, a cooldown ramp gradually releases deferred actions over 2-3 cycles to prevent a burst.
This builds on Phase 1's existing load detection and batch-size reduction — Phase 1 slows the whole queue, Phase 2 adds selective filtering within it.
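The spec does not say how the cooldown ramp divides the backlog, so the following is only one possible reading: split the deferred action IDs into roughly equal batches and release one batch per queue-runner cycle. The function name and the equal-split policy are assumptions.

```php
<?php
// Hypothetical cooldown ramp: split deferred action IDs into at most
// $cycles batches of roughly equal size, one batch per cycle.
function hcqg_sketch_ramp_batches(array $deferred_ids, int $cycles = 3): array
{
    if ($deferred_ids === []) {
        return [];
    }

    // Guard against a zero or negative cycle count.
    $cycles = max(1, $cycles);

    // Round up so every ID lands in a batch; the last batch may be smaller.
    $batch_size = (int) ceil(count($deferred_ids) / $cycles);

    return array_chunk($deferred_ids, $batch_size);
}

// Usage: seven deferred actions released over three cycles.
$batches = hcqg_sketch_ramp_batches([1, 2, 3, 4, 5, 6, 7], 3);
// $batches is [[1, 2, 3], [4, 5, 6], [7]]
```

A real ramp might also order the backlog by tier so that high-priority actions are released first; that refinement is left out here.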
Mutex Guard (Wave C)
Prevents thundering herd patterns (e.g., 150 concurrent NoFraud admin_init scans).
API:
- acquire_lock( $operation_key, $ttl = 60 ): returns true if acquired, false if held
- release_lock( $operation_key ): early release
- is_locked( $operation_key ): check without acquiring
- force_release( $operation_key ): admin/CLI recovery
Storage: wp_options with autoload = no (not transients, because transients may use an object cache that isn't shared across PHP processes on some hosts). A lock counts as expired when current_time > stored_timestamp + TTL.
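The lock semantics above, including the expiry rule, could look roughly like this. The sketch is deliberately simplified: the class name is hypothetical, an in-memory array stands in for the wp_options rows, and the injectable clock exists only to make expiry easy to demonstrate. It is not safe across PHP processes as written; a real implementation would need an atomic insert against the database.

```php
<?php
// Hypothetical mutex guard. The in-memory $store is a stand-in for
// wp_options rows (autoload = no) and is NOT shared across processes.
final class HCQG_Mutex_Guard_Sketch
{
    /** @var array<string, array{ts: int, ttl: int}> */
    private array $store = [];

    /** @var callable returns the current Unix timestamp */
    private $clock;

    public function __construct(?callable $clock = null)
    {
        $this->clock = $clock ?? 'time';
    }

    // A lock is held until current_time > stored_timestamp + TTL.
    public function is_locked(string $operation_key): bool
    {
        if (!isset($this->store[$operation_key])) {
            return false;
        }
        $lock = $this->store[$operation_key];
        return ($this->clock)() <= $lock['ts'] + $lock['ttl'];
    }

    // Returns true if acquired, false if another holder's lock is still live.
    public function acquire_lock(string $operation_key, int $ttl = 60): bool
    {
        if ($this->is_locked($operation_key)) {
            return false;
        }
        $this->store[$operation_key] = ['ts' => ($this->clock)(), 'ttl' => $ttl];
        return true;
    }

    // Early release by the holder.
    public function release_lock(string $operation_key): void
    {
        unset($this->store[$operation_key]);
    }

    // Admin/CLI recovery; identical to release_lock() in this sketch.
    public function force_release(string $operation_key): void
    {
        $this->release_lock($operation_key);
    }
}

// Usage with a fake clock; 'nofraud_scan' is a made-up operation key.
$now   = 1000;
$guard = new HCQG_Mutex_Guard_Sketch(function () use (&$now) { return $now; });
$first  = $guard->acquire_lock('nofraud_scan', 60); // true: lock acquired
$second = $guard->acquire_lock('nofraud_scan', 60); // false: already held
```

The check-then-write in acquire_lock() is the part that must become atomic in production, since two requests could both pass is_locked() before either writes.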
Admin Settings (Wave D)
A settings page (or tab within an existing Hypercart admin page) with:
- Enable/disable toggles for throttle system and mutex guard
- Load threshold inputs (threads_running and queue depth for elevated/critical)
- Priority registry editor (hook patterns per tier)
- Reschedule delay configuration per load level
- Current load status indicator (normal/elevated/critical)
- Last 20 throttled actions with timestamps
WP-CLI Commands (Wave D)
wp queryguard throttle status # Current load level, batch size, active deferrals
wp queryguard throttle pause # Force critical mode (maintenance window)
wp queryguard throttle resume # Return to auto-detection
wp queryguard throttle history # Last 50 throttle events
wp queryguard mutex list # Show active locks
wp queryguard mutex release <key> # Force-release a stuck lock
Deployment summary
| Wave | What ships | Value on its own | Depends on |
| --- | --- | --- | --- |
| A | Load monitor extraction + priority registry | Clean architecture, registry available for filters; no behavior change | Nothing (safe anytime) |
| B | Per-action throttle + per-action logging | Selective throttling: payments keep flowing, deferrable syncs pause under load | Wave A |
| C | Mutex guard | Prevents thundering herd (N concurrent expensive operations) | Nothing (independent) |
| D | Admin UI + WP-CLI | Operator visibility and control without code deploys | Respective waves |
Success criteria
From the original spec in #3:
- During May 1-like conditions: deferrable syncs auto-pause, critical payment processing continues, subscription renewals delay but complete
- Zero false positives during normal operation
- Admin visibility into throttle decisions in real-time
- No action is ever lost — only rescheduled
Dependencies
- #12 — Extract throttle engine into separate classes (covered by Wave A)
- Phase 1 runtime validation via test_observe / observe on staging should complete first to confirm load detection baselines