Commit 8097005

Merge pull request #1602 from StackVista/stac-22541
STAC-22541: Derived state monitors
2 parents 950cfe3 + a1d7c57 commit 8097005

3 files changed

Lines changed: 49 additions & 13 deletions

SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
 * [Troubleshooting](use/alerting/notifications/troubleshooting.md)
 * [Customize](dynamic/customize-alerting.md)
 * [Add a monitor using the CLI](use/alerting/k8s-add-monitors-cli.md)
+* [Derived State monitor](use/alerting/k8s-derived-state-monitors.md)
 * [Override monitor arguments](use/alerting/k8s-override-monitor-arguments.md)
 * [Write a remediation guide](use/alerting/k8s-write-remediation-guide.md)

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+---
+description: SUSE Observability
+---
+
+# Derived State Monitors
+
+## Overview
+
+In observability scenarios, logical (business) components often lack direct monitors but are still affected by issues in their technical dependencies. In that case you can use the `derived-state-monitor` function to derive a state for the logical component from the technical components connected to it.
+This monitor traverses component dependencies and selects the most critical health state based on direct observations (e.g., from metrics), ignoring any already-derived states. It applies the derived state to all components selected through the `componentTypes` parameter.
+During traversal, only components with observed (non-derived) health states are considered for health derivation. Components with derived states are skipped in evaluation but still traversed to reach deeper dependencies, for example when logical components depend on other logical components.
+
+## Derived Health State Monitor example
+
+A Monitor implemented using the `derived-state-monitor` function looks like:
+
+```
+- _type: "Monitor"
+  name: "Aggregated health state of a Deployment, StatefulSet, ReplicaSet and DaemonSet"
+  tags:
+  - deployments
+  - replicasets
+  - statefulsets
+  - daemonsets
+  - derived
+  - propagated
+  identifier: "urn:custom:monitor:..."
+  status: "DISABLED"
+  description: "Description"
+  function: {{ get "urn:stackpack:common:monitor-function:derived-state-monitor" }}
+  arguments:
+    componentTypes: "deployment, replicaset, statefulset, daemonset"
+  intervalSeconds: 30
+  remediationHint: "Investigate component [{{ causeName }}](/#/components/{{ causeComponentUrnForUrl }}) as it is causing the workload to be unhealthy."
+```
+* The function has a single argument, `componentTypes`, which lists the component types as a single string of comma-separated values.
+* The function offers two values to use in the remediation guide: `causeComponentName`, the name of the component from which the state is propagated, and `causeComponentUrnForUrl`, which can be used to build a link to that component.
+
+The monitor can be implemented using the guide at [Add a threshold monitor to components using the CLI](/use/alerting/k8s-add-monitors-cli.md).
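
The traversal described in the Overview of the new page can be illustrated with a small Python sketch. It is not the actual `derived-state-monitor` implementation: the `Component` class, the `Health` ordering, and the helper names are hypothetical, and the sketch assumes the traversal stops descending once it reaches a component with an observed state.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Health(IntEnum):
    # Assumed ordering: a higher value is more critical, so ">" compares severity.
    CLEAR = 0
    DEVIATING = 1
    CRITICAL = 2


@dataclass
class Component:
    # Hypothetical model of a topology component.
    name: str
    type: str
    observed: Optional[Health] = None  # state from a direct monitor, if any
    depends_on: List["Component"] = field(default_factory=list)


def derive_state(component):
    """Return (state, cause) taken from the nearest observed dependencies."""
    best_state, best_cause = None, None
    seen = {id(component)}
    stack = list(component.depends_on)
    while stack:
        dep = stack.pop()
        if id(dep) in seen:
            continue
        seen.add(id(dep))
        if dep.observed is None:
            # No observed state: skip it in evaluation, but keep traversing.
            stack.extend(dep.depends_on)
        elif best_state is None or dep.observed > best_state:
            best_state, best_cause = dep.observed, dep
    return best_state, best_cause


def apply_derived_states(components, component_types):
    """Derive a state for every component whose type is listed in component_types."""
    wanted = {t.strip() for t in component_types.split(",")}
    return {c.name: derive_state(c) for c in components if c.type in wanted}


pod_a = Component("pod-a", "pod", observed=Health.CLEAR)
pod_b = Component("pod-b", "pod", observed=Health.CRITICAL)
rs = Component("web-rs", "replicaset", depends_on=[pod_a, pod_b])
deploy = Component("web", "deployment", depends_on=[rs])

state, cause = derive_state(deploy)
print(state.name, cause.name)  # CRITICAL pod-b
```

In this sketch the ReplicaSet has no observed state, so it is passed through and the Deployment takes its state from the Pods. The `cause` component plays the role of the component named in the remediation hint.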

use/alerting/kubernetes-monitors.md

Lines changed: 9 additions & 13 deletions
@@ -144,22 +144,18 @@ Cluster doesn't have any health itself. But a cluster is build from few componen
 - all nodes
 and then takes the most critical health state.
 
-### Aggregated health state of a DaemonSet
+### Derived Workloads health state (Deployment, DaemonSet, ReplicaSet, StatefulSet)
 
-The monitor aggregates states of all children Pods and then returns the most critical health state.
+The monitor aggregates the states of all top-most dependencies and then returns the most critical health state based on direct observations (e.g., from metrics).
+This approach ensures that health signals propagate from low-level technical components (like Pods) to higher-level logical components, but only when the component itself lacks an observed health state.
+To use this monitor effectively, make sure that some or all of the following health checks are disabled:
+* Deployment desired replicas
+* DaemonSet desired replicas
+* ReplicaSet desired replicas
+* StatefulSet desired replicas
 
-### Aggregated health state of a Deployment
+If you have a use case where logical components have no direct monitors, then you can use the [Derived State Monitor](/use/alerting/k8s-derived-state-monitors.md) function to infer their health based on the technical components they depend on.
 
-The monitor aggregates states of all children ReplicaSets and then returns the most critical health state. ReplicaSets have
-the similar Monitor, so eventually this one aggregates health states of all children ReplicaSets and Pods.
-
-### Aggregated health state of a ReplicaSet
-
-The monitor aggregates states of all children Pods and then returns the most critical health state.
-
-### Aggregated health state of a StatefulSet
-
-The monitor aggregates states of all children Pods and then returns the most critical health state.
 
 ## See also
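
The precedence rule in the new workloads section (a derived state only applies when the component has no observed state of its own) can be sketched in a few lines of Python. The function name and the string state values are hypothetical, and the sketch reflects one reading of the text above, not the product's code.

```python
from typing import Optional


def effective_state(observed: Optional[str], derived: Optional[str]) -> Optional[str]:
    # Assumption: an observed state always wins; the derived state is only a fallback.
    return observed if observed is not None else derived


# A Deployment with an active "desired replicas" check keeps its observed state.
print(effective_state("CLEAR", "CRITICAL"))  # CLEAR
# With that check disabled, the state derived from the Pods becomes visible.
print(effective_state(None, "CRITICAL"))  # CRITICAL
```

Under this reading, disabling the listed "desired replicas" checks is what lets the derived state show up on the workload.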
