From 77294e1e74fd2d4e58da0456ec8264ead4a623d4 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Mon, 16 Feb 2026 11:06:58 +0000
Subject: [PATCH] docs: address issue #24139
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This change was automatically generated by the documentation agent team in response to issue #24139.

🤖 Generated with cagent
---
 content/manuals/engine/daemon/prometheus.md | 80 +++++++++++++++++++++
 1 file changed, 80 insertions(+)

diff --git a/content/manuals/engine/daemon/prometheus.md b/content/manuals/engine/daemon/prometheus.md
index 5e194d627da4..0ff15a9259bf 100644
--- a/content/manuals/engine/daemon/prometheus.md
+++ b/content/manuals/engine/daemon/prometheus.md
@@ -150,6 +150,86 @@ traffic caused by the container you just ran.
 
 ![Prometheus report showing traffic](images/prometheus-graph_load.webp)
 
+## Available metrics
+
+Docker exposes metrics in Prometheus format. This section describes the available metrics and their meaning.
+
+> [!WARNING]
+>
+> The available metrics and the names of those metrics are in active
+> development and may change at any time.
+
+### Metric types
+
+Docker metrics use the following Prometheus metric types:
+
+- **Counter**: A cumulative metric that only increases (or resets to zero on restart). Use counters for values like total number of events or requests.
+- **Gauge**: A metric that can go up or down. Use gauges for values like current memory usage or number of running containers.
+- **Histogram**: A metric that samples observations and counts them in configurable buckets. Histograms expose multiple time series:
+  - `_bucket{le=""}`: Cumulative counters for observation buckets
+  - `_sum`: Total sum of all observed values
+  - `_count`: Count of events that have been observed
+
+For histogram metrics, you can calculate averages, percentiles, and rates. For example, to calculate the average duration: `rate(_sum[5m]) / rate(_count[5m])`.
+
+### Engine metrics
+
+These metrics provide information about the Docker Engine's operation and resource usage.
+
+| Metric | Type | Description |
+| ------------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------- |
+| `engine_daemon_container_actions_seconds` | Histogram | Time taken to process container operations (start, stop, create, etc.). Labels indicate the action type. |
+| `engine_daemon_container_states_containers` | Gauge | Number of containers currently in each state (running, paused, stopped). Labels indicate the state. |
+| `engine_daemon_engine_cpus_cpus` | Gauge | Number of CPUs available on the host system. |
+| `engine_daemon_engine_info` | Gauge | Static information about the Docker Engine. Always set to 1. Labels provide version, architecture, and other engine details. |
+| `engine_daemon_engine_memory_bytes` | Gauge | Total memory available on the host system in bytes. |
+| `engine_daemon_events_subscribers_total` | Gauge | Number of current subscribers to Docker events. |
+| `engine_daemon_events_total` | Counter | Total number of events processed by the daemon. Labels indicate the event action and type. |
+| `engine_daemon_health_checks_failed_total` | Counter | Total number of health checks that have failed. |
+| `engine_daemon_health_checks_total` | Counter | Total number of health checks performed. |
+| `engine_daemon_host_info_functions_seconds` | Histogram | Time taken to gather host information. |
+| `engine_daemon_network_actions_seconds` | Histogram | Time taken to process network operations (create, connect, disconnect, etc.). Labels indicate the action type. |
+
+### Swarm metrics
+
+These metrics are only available when the Docker Engine is running in Swarm mode.
+
+| Metric | Type | Description |
+| ------------------------------------------------ | --------- | ----------------------------------------------------------------------------------------------- |
+| `swarm_dispatcher_scheduling_delay_seconds` | Histogram | Time from task creation to scheduling decision. Measures scheduler performance. |
+| `swarm_manager_configs_total` | Gauge | Total number of configs in the swarm cluster. |
+| `swarm_manager_leader` | Gauge | Indicates if this node is the swarm manager leader (1) or not (0). |
+| `swarm_manager_networks_total` | Gauge | Total number of networks in the swarm cluster. |
+| `swarm_manager_nodes` | Gauge | Number of nodes in the swarm cluster. Labels indicate node state (ready, down, etc.). |
+| `swarm_manager_secrets_total` | Gauge | Total number of secrets in the swarm cluster. |
+| `swarm_manager_services_total` | Gauge | Total number of services in the swarm cluster. |
+| `swarm_manager_tasks_total` | Gauge | Total number of tasks in the swarm cluster. Labels indicate task state (running, failed, etc.). |
+| `swarm_node_manager` | Gauge | Indicates if this node is a swarm manager (1) or worker (0). |
+| `swarm_raft_snapshot_latency_seconds` | Histogram | Time taken to create and restore Raft snapshots. |
+| `swarm_raft_transaction_latency_seconds` | Histogram | Time taken to commit Raft transactions. Measures consensus performance. |
+| `swarm_store_batch_latency_seconds` | Histogram | Time taken for batch operations in the swarm store. |
+| `swarm_store_lookup_latency_seconds` | Histogram | Time taken for lookup operations in the swarm store. |
+| `swarm_store_memory_store_lock_duration_seconds` | Histogram | Duration of lock acquisitions in the memory store. |
+| `swarm_store_read_tx_latency_seconds` | Histogram | Time taken for read transactions in the swarm store. |
+| `swarm_store_write_tx_latency_seconds` | Histogram | Time taken for write transactions in the swarm store. |
+
+### Using histogram metrics
+
+For histogram metrics (those with `_seconds` in the name), Prometheus creates three time series:
+
+- `_bucket`: Cumulative counters for each configured bucket
+- `_sum`: Total sum of all observed values
+- `_count`: Total count of observations
+
+For example, `engine_daemon_container_actions_seconds` produces:
+
+- `engine_daemon_container_actions_seconds_bucket{action="start",le="0.005"}`: Count of start actions taking ≤5ms
+- `engine_daemon_container_actions_seconds_bucket{action="start",le="0.01"}`: Count of start actions taking ≤10ms
+- `engine_daemon_container_actions_seconds_sum{action="start"}`: Total time spent on start actions
+- `engine_daemon_container_actions_seconds_count{action="start"}`: Total number of start actions
+
+Use these to calculate percentiles, averages, and rates in your Prometheus queries.
+
 ## Next steps
 
 The example provided here shows how to run Prometheus as a container on your