[10.2.x] ATS Configuration Reload with observability/tracing - Token model (#12892) #13354

Open
masaori335 wants to merge 1 commit into apache:10.2.x from masaori335:asf-10.2.x-traffic-ctl

@masaori335

Backport #12892


ATS Configuration Reload with observability/tracing - Token model

Replace the fire-and-forget configuration reload mechanism with a new token-based, observable reload framework. Every reload operation is now assigned a unique token, tracked through a task tree, and queryable via CLI or JSONRPC at any point after submission.

Core components introduced:

  • ConfigRegistry: centralized singleton for config file registration, filename records, trigger records, and reload handlers. Replaces the scattered registration across AddConfigFilesHere.cc and individual modules.
  • ReloadCoordinator: manages reload session lifecycle including token generation, concurrency control (--force to override), timeout detection, and rolling history.
  • ConfigReloadTask: tracks a single reload as a tree of sub-tasks with per-handler status, timings, and logs.
  • ConfigContext: lightweight context passed to handlers providing in_progress(), complete(), fail(), log(), supplied_yaml(), and add_dependent_ctx(). Safe no-op at startup when no reload is active.
  • ConfigReloadProgress: periodic checker that detects stuck tasks and marks them as TIMEOUT.
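
To illustrate the task-tree idea behind ConfigReloadTask, here is a minimal standalone sketch. It is not the ATS implementation: the `Task` type, the `State` values, and the roll-up rule in `aggregate()` are all hypothetical and only show how per-handler sub-task states could be combined into one status for a reload token.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical states; the real ConfigReloadTask may define a different set.
enum class State { Created, InProgress, Success, Fail, Timeout };

// One node in the reload tree: the root stands for the whole reload
// (identified by its token), children stand for per-handler sub-tasks.
struct Task {
  std::string description;
  State state = State::Created;
  std::vector<std::unique_ptr<Task>> children;

  // Adds a sub-task and returns a pointer that stays valid as the tree grows.
  Task *add_child(std::string desc) {
    assert(!desc.empty());
    children.push_back(std::make_unique<Task>());
    children.back()->description = std::move(desc);
    return children.back().get();
  }
};

// Assumed roll-up rule: a leaf reports its own state; a parent is Fail if any
// child failed, Timeout if any child timed out, InProgress while any child is
// still pending, and Success only when every child succeeded.
State aggregate(const Task &t) {
  if (t.children.empty()) {
    return t.state;
  }
  bool pending = false;
  bool timed_out = false;
  for (const auto &child : t.children) {
    switch (aggregate(*child)) {
    case State::Fail:
      return State::Fail;
    case State::Timeout:
      timed_out = true;
      break;
    case State::Created:
    case State::InProgress:
      pending = true;
      break;
    case State::Success:
      break;
    }
  }
  if (timed_out) {
    return State::Timeout;
  }
  return pending ? State::InProgress : State::Success;
}
```

A status query by token would then walk the tree once and report both the per-handler states and the rolled-up result.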

New traffic_ctl commands:

  • config reload [-m] [-t <token>] [-d @file] [--force]
  • config status [-t <token>] [-c all]

All commands support --format json for automation and CI pipelines.

New JSONRPC APIs:

  • admin_config_reload: unified file-based or inline reload with token, force, and configs parameters.
  • get_reload_config_status: query reload status by token or get the last N reloads.
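
As a rough illustration of calling admin_config_reload, the sketch below builds a JSON-RPC 2.0 request string. The method name and the token and force parameter names come from the list above; their JSON types (string and boolean), the helper names `json_escape` and `make_reload_request`, and the omission of configs are assumptions made for this example, so check the JSONRPC API documentation for the actual schema.

```cpp
#include <cassert>
#include <string>

// Escapes only quotes and backslashes, which is enough for simple tokens.
// A production client should use a real JSON library instead.
std::string json_escape(const std::string &in) {
  std::string out;
  for (char c : in) {
    if (c == '"' || c == '\\') {
      out += '\\';
    }
    out += c;
  }
  return out;
}

// Builds a JSON-RPC 2.0 request for admin_config_reload.
// Parameter types are assumed: "token" as a string, "force" as a boolean.
std::string make_reload_request(const std::string &token, bool force, const std::string &id) {
  assert(!id.empty()); // a request that expects a response needs an id
  std::string req = R"({"jsonrpc":"2.0","method":"admin_config_reload","params":{)";
  req += R"("token":")" + json_escape(token) + R"(",)";
  req += std::string(R"("force":)") + (force ? "true" : "false");
  req += R"(},"id":")" + json_escape(id) + R"("})";
  return req;
}
```

The same token could later be passed to get_reload_config_status to look up the reload.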

Migrated config handlers to ConfigRegistry: ip_allow, cache_control, cache_hosting, parent_proxy, split_dns, remap, logging, ssl_client_coordinator (with sni.yaml and ssl_multicert.config as dependencies), ssl_ticket_key, records, and pre-warm. Static configs (storage, volume, plugin, socks, jsonrpc) registered as inventory-only.

Removed legacy ConfigUpdateHandler/ConfigUpdateContinuation from ConfigProcessor.h. Removed AddConfigFilesHere.cc in favor of per-module self-registration.

Fixed duplicate handler execution for configs with multiple trigger records (e.g. ssl_client_coordinator) by deduplicating against the ConfigReloadTask subtask tree.
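
The deduplication idea can be sketched in isolation as follows. This is a simplified stand-in, not the code in this PR: a `std::set` of config keys plays the role of the ConfigReloadTask subtask tree, and `ReloadSession` and `on_record_changed` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// One reload session. When several trigger records map to the same config
// key, only the first callback in the session actually runs the handler.
class ReloadSession {
public:
  // Returns true if the handler ran, false if this config key was already
  // handled in this session (i.e. a subtask for it already exists).
  bool on_record_changed(const std::string &config_key, const std::function<void()> &handler) {
    assert(handler && "handler must be callable");
    if (!seen_.insert(config_key).second) {
      return false; // duplicate trigger for the same config: skip
    }
    handler();
    return true;
  }

private:
  std::set<std::string> seen_; // stand-in for the subtask tree lookup
};
```

With this shape, two trigger records that both point at ssl_client_coordinator result in a single handler run per reload.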

Added RecFlushConfigUpdateCbs() to synchronously fire pending record callbacks after rereadConfig(), ensuring all subtasks are registered before the first status poll.

New configuration records:

  • proxy.config.admin.reload.timeout (default: 1h)
  • proxy.config.admin.reload.check_interval (default: 2s)
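
How these two records could interact is sketched below: a checker that runs every check interval and marks tasks as TIMEOUT once they exceed the timeout. This is an assumption-laden illustration, not the ConfigReloadProgress code. In particular, measuring from the last update rather than from task creation, and the names `Subtask` and `mark_stuck`, are choices made for this example.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical subset of task states.
enum class State { InProgress, Success, Fail, Timeout };

struct Subtask {
  State state;
  Clock::time_point last_updated;
};

// One pass of the periodic check: any task still in progress whose last
// update is at least `timeout` old is marked Timeout. Returns how many
// tasks were marked in this pass.
int mark_stuck(std::vector<Subtask> &tasks, Clock::time_point now, std::chrono::seconds timeout) {
  assert(timeout.count() > 0);
  int marked = 0;
  for (auto &task : tasks) {
    if (task.state == State::InProgress && now - task.last_updated >= timeout) {
      task.state = State::Timeout;
      ++marked;
    }
  }
  return marked;
}
```

With the defaults listed above, such a pass would run every 2 seconds and flag a task after 1 hour without progress.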

Backward compatible: existing traffic_ctl config reload works as before; internally it now uses the new framework with automatic token assignment and tracking.

(cherry picked from commit 5bab268)

Conflicts:
include/tscore/ArgParser.h
src/iocore/cache/P_CacheHosting.h
src/iocore/hostdb/CMakeLists.txt
src/proxy/ReverseProxy.cc
src/records/CMakeLists.txt
src/tscore/ArgParser.cc

masaori335 added this to the 10.2.0 milestone Jul 1, 2026
masaori335 self-assigned this Jul 1, 2026
Copilot AI review requested due to automatic review settings July 1, 2026 22:39
masaori335 added the Backport, traffic_ctl, and Config Reload labels Jul 1, 2026

Copilot AI left a comment


Pull request overview

This PR backports the token-based, observable configuration reload framework (originally #12892), replacing the prior fire-and-forget reload behavior with a tracked reload task tree that can be monitored via traffic_ctl and JSONRPC.

Changes:

  • Introduces ConfigRegistry + ReloadCoordinator + ConfigContext and wires core modules to self-register reload handlers with tokenized task tracking.
  • Adds/updates JSONRPC endpoints (admin_config_reload, get_reload_config_status) and extends traffic_ctl config reload/status to support tokens, monitor/details, inline YAML, and force.
  • Adds extensive AuTest + unit test coverage for reload lifecycle, deduplication, reserve-subtask behavior, and handler completion/timeout behavior.

Reviewed changes

Copilot reviewed 95 out of 96 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/gold_tests/traffic_ctl/traffic_ctl_test_utils.py | Adds test helpers for traffic_ctl config reload/status and configurable expected return codes. |
| tests/gold_tests/traffic_ctl/traffic_ctl_config_reload.test.py | New gold test coverage for tokenized traffic_ctl config reload/status behaviors. |
| tests/gold_tests/tls/tls_client_cert_plugin.test.py | Adjusts readiness expectation count for SNI reload logging. |
| tests/gold_tests/remap/remap_reload.test.py | Enables debug tags relevant to reload tracing for remap reload tests. |
| tests/gold_tests/parent_config/parent_config_reload.test.py | New test validating parent.config reload via file touch + record-trigger. |
| tests/gold_tests/jsonrpc/jsonrpc_api_schema.test.py | Temporarily disables schema assertion for admin_config_reload response. |
| tests/gold_tests/jsonrpc/json/admin_detached_config_reload_req.json | New JSONRPC request fixture for admin_config_reload. |
| tests/gold_tests/jsonrpc/config_reload_tracking.test.py | New JSONRPC test for token generation/history/basic status querying. |
| tests/gold_tests/jsonrpc/config_reload_reserve_subtask.test.py | New test for reserve_subtask() behavior when records completes first. |
| tests/gold_tests/jsonrpc/config_reload_full_smoke.test.py | New smoke test touching all registered configs + record-trigger reloads. |
| tests/gold_tests/jsonrpc/config_reload_dedup.test.py | New deduplication test for multi-trigger ssl client coordinator paths. |
| tests/gold_tests/ip_allow/ip_category.test.py | Extends debug tags to include config.reload for test observability. |
| tests/gold_tests/ip_allow/ip_allow_reload_triggered.test.py | New functional test for ip_allow + ip_categories dependency reload behavior. |
| tests/gold_tests/dns/splitdns_reload.test.py | New test validating splitdns reload handler invocation. |
| tests/gold_tests/cache/cache_config_reload.test.py | New test validating cache.config + hosting.config reload via registry. |
| src/traffic_server/traffic_server.cc | Registers records.yaml handler + static inventory files; adapts plugin callback registration. |
| src/traffic_server/RpcAdminPubHandlers.cc | Registers new reload-related JSONRPC methods. |
| src/traffic_server/CMakeLists.txt | Adjusts link ordering to include configmanager. |
| src/traffic_logstats/CMakeLists.txt | Adds missing linkage to records/configmanager for new dependencies. |
| src/traffic_ctl/TrafficCtlStatus.h | Adds CTRL_EX_TEMPFAIL (75) for in-progress/temporary failures. |
| src/traffic_ctl/traffic_ctl.cc | Expands config reload/status CLI options for token/monitor/details/inline YAML. |
| src/traffic_ctl/jsonrpc/CtrlRPCRequests.h | Adds structured reload/status request/response models (incl. YAML Node configs). |
| src/traffic_ctl/jsonrpc/ctrl_yaml_codecs.h | Adds YAML codecs for reload/status requests and responses. |
| src/traffic_ctl/CtrlPrinters.h | Adds printing helpers for reload task tree and progress line rendering. |
| src/traffic_ctl/CtrlPrinters.cc | Implements reload progress bar + task tree reporting; adjusts JSON output behavior. |
| src/traffic_ctl/CtrlCommands.h | Adds helpers for reload/status tracking, monitoring loop, and inline data loading. |
| src/records/unit_tests/test_ConfigReloadTask.cc | New unit tests for reload task state/timeouts/stale behavior. |
| src/records/unit_tests/test_ConfigRegistry.cc | New unit tests for registry resolve + dependency key routing/dedup. |
| src/records/RecordsConfig.cc | Adds new dynamic records for reload timeout/check interval. |
| src/records/RecCore.cc | Plumbs ConfigContext into unregistered-record warnings for reload logging. |
| src/records/P_RecCore.cc | Adds RecFlushConfigUpdateCbs() to flush pending record callbacks synchronously. |
| src/records/CMakeLists.txt | Links reload infrastructure into records and adds new unit tests. |
| src/proxy/ReverseProxy.cc | Registers remap reload handler via ConfigRegistry; updates reloadUrlRewrite signature. |
| src/proxy/ParentSelection.cc | Migrates parent.config reload to ConfigRegistry and adds ctx completion logging. |
| src/proxy/logging/LogConfig.cc | Migrates logging reload triggers to registry and threads ConfigContext through deferred reload. |
| src/proxy/IPAllow.cc | Migrates ip_allow + ip_categories dependency tracking to registry with ctx status reporting. |
| src/proxy/http2/CMakeLists.txt | Links configmanager where needed due to new config reload components. |
| src/proxy/http/remap/unit-tests/CMakeLists.txt | Links configmanager and adds non-Apple multiple-definition workaround. |
| src/proxy/http/PreWarmConfig.cc | Migrates record-triggered prewarm config reload to registry with ctx completion. |
| src/proxy/hdrs/CMakeLists.txt | Links configmanager for updated dependencies. |
| src/proxy/CMakeLists.txt | Updates proxy link dependencies to include http/configmanager components. |
| src/proxy/CacheControl.cc | Migrates cache_control reload to registry with ctx completion. |
| src/mgmt/rpc/handlers/config/Configuration.cc | Implements unified tokenized reload (file vs inline configs) + status/history JSONRPC. |
| src/mgmt/rpc/CMakeLists.txt | Links configmanager in JSONRPC server unit tests. |
| src/mgmt/config/ReloadCoordinator.cc | New coordinator managing reload lifecycle, history, concurrency, and subtask reservation. |
| src/mgmt/config/FileManager.cc | Routes records reload through registry and modernizes plugin callback storage. |
| src/mgmt/config/ConfigReloadExecutor.cc | New ET_TASK continuation to run reload work and flush record callbacks. |
| src/mgmt/config/ConfigContext.cc | New context implementation for progress/logging + injected YAML propagation to dependents. |
| src/mgmt/config/CMakeLists.txt | Rebuilds configmanager library composition and dependencies around new components. |
| src/mgmt/config/AddConfigFilesHere.cc | Removes legacy centralized config file registration. |
| src/iocore/net/SSLSNIConfig.cc | Threads ConfigContext into SNI reload path for task tracking. |
| src/iocore/net/SSLConfig.cc | Threads ConfigContext into SSL reload paths; migrates ssl_ticket_key to registry record triggers. |
| src/iocore/net/SSLClientCoordinator.cc | Migrates SSL coordinator triggers/dependencies to registry and uses dependent contexts for subcomponents. |
| src/iocore/net/QUICMultiCertConfigLoader.cc | Threads ConfigContext into QUIC cert reload. |
| src/iocore/net/quic/QUICConfig.cc | Threads ConfigContext into QUIC config reload. |
| src/iocore/net/P_SSLConfig.h | Updates SSL API signatures to accept optional ConfigContext. |
| src/iocore/net/P_SSLClientCoordinator.h | Updates coordinator API signature and includes ConfigContext. |
| src/iocore/eventsystem/RecProcess.cc | Moves config update debug logging under a dedicated dbg_ctl. |
| src/iocore/eventsystem/CMakeLists.txt | Links configmanager in event system unit tests. |
| src/iocore/dns/SplitDNS.cc | Migrates splitdns reload to registry with ctx completion/fail reporting. |
| src/iocore/cache/P_CacheHosting.h | Removes legacy hosting.config callback scaffolding (migrated to registry). |
| src/iocore/cache/CacheHosting.cc | Removes legacy hosting.config callback function. |
| src/iocore/cache/Cache.cc | Registers cache_hosting reload handler late (after cache init) via registry. |
| src/iocore/aio/CMakeLists.txt | Links configmanager in aio unit tests. |
| src/cripts/CMakeLists.txt | Links yaml-cpp for new YAML usage in cripts library. |
| include/shared/rpc/yaml_codecs.h | Extends try_extract to support caller-provided default values. |
| include/records/YAMLConfigReloadTaskEncoder.h | New YAML encoder for reload task snapshots used in JSONRPC responses. |
| include/records/RecCore.h | Exposes RecFlushConfigUpdateCbs() and updates unregistered-record warnings to accept ConfigContext. |
| include/proxy/ReverseProxy.h | Updates remap reload API to accept ConfigContext. |
| include/proxy/ParentSelection.h | Updates parent reconfigure API to accept optional ConfigContext. |
| include/proxy/logging/LogConfig.h | Updates logging reconfigure API and stores reload context on LogConfig. |
| include/proxy/IPAllow.h | Updates ip_allow reconfigure API to accept optional ConfigContext. |
| include/proxy/http/PreWarmConfig.h | Updates prewarm reconfigure API to accept optional ConfigContext. |
| include/proxy/CacheControl.h | Updates cache_control reload API to accept ConfigContext. |
| include/mgmt/rpc/handlers/config/Configuration.h | Documents unified reload API and adds status query declaration. |
| include/mgmt/config/ReloadCoordinator.h | New public interface for reload lifecycle management and subtask APIs. |
| include/mgmt/config/FileManager.h | Updates plugin callback registration signature and removes legacy registry init declaration. |
| include/mgmt/config/ConfigReloadExecutor.h | New API for scheduling async reload work on ET_TASK. |
| include/mgmt/config/ConfigReloadErrors.h | New shared error code definitions for reload lifecycle and validation. |
| include/mgmt/config/ConfigContext.h | New public handler context API for reload status/logging + injected YAML + dependent subtasks. |
| include/iocore/net/SSLSNIConfig.h | Updates SNI reconfigure API to accept optional ConfigContext. |
| include/iocore/net/QUICMultiCertConfigLoader.h | Updates QUIC cert reload API to accept optional ConfigContext. |
| include/iocore/net/quic/QUICConfig.h | Updates QUIC config reload API to accept optional ConfigContext. |
| include/iocore/eventsystem/ConfigProcessor.h | Removes legacy ConfigUpdateHandler/Continuation machinery. |
| include/iocore/dns/SplitDNSProcessor.h | Updates splitdns reconfigure API and removes legacy handler pointer. |
| doc/developer-guide/jsonrpc/jsonrpc-api.en.rst | Documents token-based reload + inline configs + status/history query endpoint. |
| doc/developer-guide/index.en.rst | Adds new developer guide page entry for config reload framework docs. |

Comment on lines +130 to +139
out.created_time = helper::try_extract<std::string>(node, "created_time");
for (auto &&msg : node["message"]) {
  out.messages.push_back(msg.as<std::string>());
}
out.config_token = helper::try_extract<std::string>(node, "token");

for (auto &&element : node["tasks"]) {
  ConfigReloadResponse::ReloadInfo task = get_info(get_info, element);
  out.tasks.push_back(std::move(task));
}


This is a false positive for yaml-cpp. Iterating a missing key does not throw; it is a safe no-op and a well-known idiom.

node here is const, so node["message"] uses the const operator[], which returns a Zombie node when the key is absent (node/impl.h):

Comment on lines +49 to +52
meta["created_time_ms"] = info.created_time_ms;
meta["last_updated_time_ms"] = info.last_updated_time_ms;
meta["main_task"] = info.main_task ? "true" : "false";


This doesn't affect the JSONRPC output. The RPC response is serialized with YAML::DoubleQuoted (include/mgmt/rpc/jsonrpc/json/YAMLCodec.h):

YAML::Emitter json;
json << YAML::DoubleQuoted << YAML::Flow;
encode(resp, json);

"errors": [
{
"message": "Reload ongoing with token 'deploy-v2.1'",
"code": 1
"errors": [
{
"message": "Token 'nonexistent' not found",
"code": 4
masaori335 changed the title from "ATS Configuration Reload with observability/tracing - Token model (#12892)" to "[10.2.x] ATS Configuration Reload with observability/tracing - Token model (#12892)" Jul 3, 2026

brbzull0 left a comment


Besides the red CI, looks good to me.

3 participants