Commit 4a72eee

OSDOCS-16874-5: CQA for SCALE-4 Low-Latency/CNF Debugging and Related
1 parent 1b15140 commit 4a72eee

20 files changed

Lines changed: 213 additions & 151 deletions

modules/accessing-an-example-cluster-node-tuning-operator-specification.adoc

Lines changed: 10 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -8,20 +8,23 @@
88
[id="accessing-an-example-node-tuning-operator-specification_{context}"]
99
= Accessing an example Node Tuning Operator specification
1010

11-
Use this process to access an example Node Tuning Operator specification.
11+
[role="_abstract"]
12+
To understand how to correctly format your tuning parameters, access an example Node Tuning Operator specification. By reviewing this template, you can properly configure node-level settings for your workloads.
13+
14+
For the example Node Tuning Operator specification provided in the procedure, the default CR is meant for delivering standard node-level tuning for the {product-title} platform and it can only be modified to set the Operator Management state. Any other custom changes to the default CR will be overwritten by the Operator. For custom tuning, create your own Tuned CRs. Newly created CRs will be combined with the default CR and custom tuning applied to {product-title} nodes based on node or pod labels and profile priorities.
15+
16+
[WARNING]
17+
====
18+
While in certain situations the support for pod labels can be a convenient way of automatically delivering required tuning, this practice is discouraged and strongly advised against, especially in large-scale clusters. The default Tuned CR ships without pod label matching. If a custom profile is created with pod label matching, then the functionality will be enabled at that time. The pod label functionality will be deprecated in future versions of the Node Tuning Operator.
19+
====
1220

1321
.Procedure
1422

15-
* Run the following command to access an example Node Tuning Operator specification:
23+
* Run the following command to access an example Node Tuning Operator specification:
1624
+
1725
[source,terminal]
1826
----
1927
oc get tuned.tuned.openshift.io/default -o yaml -n openshift-cluster-node-tuning-operator
2028
----
2129
22-
The default CR is meant for delivering standard node-level tuning for the {product-title} platform and it can only be modified to set the Operator Management state. Any other custom changes to the default CR will be overwritten by the Operator. For custom tuning, create your own Tuned CRs. Newly created CRs will be combined with the default CR and custom tuning applied to {product-title} nodes based on node or pod labels and profile priorities.
2330
24-
[WARNING]
25-
====
26-
While in certain situations the support for pod labels can be a convenient way of automatically delivering required tuning, this practice is discouraged and strongly advised against, especially in large-scale clusters. The default Tuned CR ships without pod label matching. If a custom profile is created with pod label matching, then the functionality will be enabled at that time. The pod label functionality will be deprecated in future versions of the Node Tuning Operator.
27-
====

modules/cluster-node-tuning-operator-default-profiles-set.adoc

Lines changed: 6 additions & 3 deletions
@@ -8,7 +8,10 @@
88
[id="custom-tuning-default-profiles-set_{context}"]
99
= Default profiles set on a cluster
1010

11-
The following are the default profiles set on a cluster.
11+
[role="_abstract"]
12+
To understand the baseline configurations automatically applied to your environment, review the default profiles set on a cluster. By analyzing these built-in settings, you can determine if additional node tuning is necessary for your specific workloads.
13+
14+
The following configuration example shows default profiles set on a cluster:
1215

1316
[source,yaml]
1417
----
@@ -32,10 +35,10 @@ spec:
3235
- label: node-role.kubernetes.io/infra
3336
- profile: openshift-node
3437
priority: 40
38+
# ...
3539
----
3640

37-
Starting with {product-title} 4.9, all OpenShift TuneD profiles are shipped with
38-
the TuneD package. You can use the `oc exec` command to view the contents of these profiles:
41+
Starting with {product-title} 4.9, all {product-title} TuneD profiles are shipped with the TuneD package. You can use the following `oc exec` command to view the contents of these profiles:
3942

4043
[source,terminal]
4144
----

modules/cluster-node-tuning-operator-verify-profiles.adoc

Lines changed: 19 additions & 16 deletions
@@ -4,15 +4,20 @@
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="verifying-tuned-profiles-are-applied_{context}"]
7-
= Verifying that the TuneD profiles are applied
7+
= Verifying that the TuneD profiles are applied
88

9-
Verify the TuneD profiles that are applied to your cluster node.
9+
[role="_abstract"]
10+
To confirm that your node-level tuning configurations are active, verify the TuneD profiles applied to your cluster node. Checking these settings ensures that your system is correctly optimized for your specific workloads.
1011

12+
.Procedure
13+
14+
. Verify the TuneD profiles that are applied to your cluster node by entering the following command:
15+
+
1116
[source,terminal]
1217
----
1318
$ oc get profile.tuned.openshift.io -n openshift-cluster-node-tuning-operator
1419
----
15-
20+
+
1621
.Example output
1722
[source,terminal]
1823
----
@@ -23,27 +28,25 @@ master-2 openshift-control-plane True False 6h33m
2328
worker-a openshift-node True False 6h28m
2429
worker-b openshift-node True False 6h28m
2530
----
31+
+
32+
** `NAME`: Specifies the name of the Profile object. There is one Profile object per node and their names match.
33+
** `TUNED`: Specifies the name of the desired TuneD profile to apply.
34+
** `APPLIED`: Set as `True` if the TuneD daemon applied the desired profile. Supported values include `True`, `False`, and `Unknown`.
35+
** `DEGRADED`: Set as `True` if any errors were reported during application of the TuneD profile. Supported values include `True`, `False`, and `Unknown`.
36+
** `AGE`: Specifies the time elapsed since the creation of the Profile object.
2637

27-
* `NAME`: Name of the Profile object. There is one Profile object per node and their names match.
28-
* `TUNED`: Name of the desired TuneD profile to apply.
29-
* `APPLIED`: `True` if the TuneD daemon applied the desired profile. (`True/False/Unknown`).
30-
* `DEGRADED`: `True` if any errors were reported during application of the TuneD profile (`True/False/Unknown`).
31-
* `AGE`: Time elapsed since the creation of Profile object.
32-
33-
The `ClusterOperator/node-tuning` object also contains useful information about the Operator and its node agents' health. For example, Operator misconfiguration is reported by `ClusterOperator/node-tuning` status messages.
34-
35-
To get status information about the `ClusterOperator/node-tuning` object, run the following command:
36-
38+
. To get status information about the `ClusterOperator/node-tuning` object, run the following command. The `ClusterOperator/node-tuning` object also contains useful information about the Operator and the health status of node agents. For example, Operator misconfiguration is reported by `ClusterOperator/node-tuning` status messages.
39+
+
3740
[source,terminal]
3841
----
3942
$ oc get co/node-tuning -n openshift-cluster-node-tuning-operator
4043
----
41-
44+
+
4245
.Example output
4346
[source,terminal,subs="attributes+"]
4447
----
4548
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
4649
node-tuning {product-version}.1 True False True 60m 1/5 Profiles with bootcmdline conflict
4750
----
48-
49-
If either the `ClusterOperator/node-tuning` or a profile object's status is `DEGRADED`, additional information is provided in the Operator or operand logs.
51+
+
52+
If either the `ClusterOperator/node-tuning` or the status of a profile object is `DEGRADED`, additional information is provided in the Operator or operand logs.
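
As a rough illustration of how the `APPLIED` and `DEGRADED` columns described above can be read programmatically, the following Python sketch scans the text output of `oc get profile.tuned.openshift.io` and lists the nodes whose profile is not applied or is degraded. The function name `unhealthy_profiles` and the `worker-c` row are hypothetical and are not part of this commit. The sketch assumes whitespace-separated columns with a `NAME TUNED APPLIED DEGRADED AGE` header.

```python
from typing import List

# Sample text in the column layout of the example output.
# The worker-c row is a hypothetical degraded node added for illustration.
SAMPLE = """\
NAME       TUNED                     APPLIED   DEGRADED   AGE
master-2   openshift-control-plane   True      False      6h33m
worker-a   openshift-node            True      False      6h28m
worker-c   openshift-node            False     True       6h28m
"""


def unhealthy_profiles(output: str) -> List[str]:
    """Return node names where APPLIED is not True or DEGRADED is not False."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    applied = header.index("APPLIED")    # column position of APPLIED
    degraded = header.index("DEGRADED")  # column position of DEGRADED
    return [r[0] for r in body if r[applied] != "True" or r[degraded] != "False"]


if __name__ == "__main__":
    print(unhealthy_profiles(SAMPLE))
```

A value of `Unknown` in either column is treated as unhealthy here, which is a conservative choice rather than documented Operator behavior.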

modules/custom-tuning-specification.adoc

Lines changed: 69 additions & 37 deletions
@@ -12,9 +12,12 @@ endif::[]
1212
[id="custom-tuning-specification_{context}"]
1313
= Custom tuning specification
1414

15+
[role="_abstract"]
16+
To define custom node-level configurations for your workloads, review the custom tuning specification. By understanding the structure of the custom resource (CR) for the Operator, you can correctly format your TuneD profiles and selection logic.
17+
1518
The custom resource (CR) for the Operator has two major sections. The first section, `profile:`, is a list of TuneD profiles and their names. The second, `recommend:`, defines the profile selection logic.
1619

17-
Multiple custom tuning specifications can co-exist as multiple CRs in the Operator's namespace. The existence of new CRs or the deletion of old CRs is detected by the Operator. All existing custom tuning specifications are merged and appropriate objects for the containerized TuneD daemons are updated.
20+
Multiple custom tuning specifications can co-exist as multiple CRs in the namespace of the Operator. The existence of new CRs or the deletion of old CRs is detected by the Operator. All existing custom tuning specifications are merged and appropriate objects for the containerized TuneD daemons are updated.
1821

1922
*Management state*
2023

@@ -102,26 +105,39 @@ The individual items of the list:
102105
ifndef::rosa-hcp-tuning[]
103106
[source,yaml]
104107
----
105-
- machineConfigLabels: <1>
106-
<mcLabels> <2>
107-
match: <3>
108-
<match> <4>
109-
priority: <priority> <5>
110-
profile: <tuned_profile_name> <6>
111-
operand: <7>
112-
debug: <bool> <8>
108+
- machineConfigLabels:
109+
<mcLabels>
110+
match:
111+
<match>
112+
priority: <priority>
113+
profile: <tuned_profile_name>
114+
operand:
115+
debug: <bool>
113116
tunedConfig:
114-
reapply_sysctl: <bool> <9>
117+
reapply_sysctl: <bool>
115118
----
116-
<1> Optional.
117-
<2> A dictionary of key/value `MachineConfig` labels. The keys must be unique.
118-
<3> If omitted, profile match is assumed unless a profile with a higher priority matches first or `machineConfigLabels` is set.
119-
<4> An optional list.
120-
<5> Profile ordering priority. Lower numbers mean higher priority (`0` is the highest priority).
121-
<6> A TuneD profile to apply on a match. For example `tuned_profile_1`.
122-
<7> Optional operand configuration.
123-
<8> Turn debugging on or off for the TuneD daemon. Options are `true` for on or `false` for off. The default is `false`.
124-
<9> Turn `reapply_sysctl` functionality on or off for the TuneD daemon. Options are `true` for on and `false` for off.
119+
+
120+
where:
121+
+
122+
--
123+
`machineConfigLabels`:: Optional parameter.
124+
125+
`<mcLabels>`:: Specifies a dictionary of key/value `MachineConfig` labels. The keys must be unique.
126+
127+
`match`:: If omitted, profile match is assumed unless a profile with a higher priority matches first or `machineConfigLabels` is set.
128+
129+
`<match>`:: An optional list.
130+
131+
`priority`:: Specifies profile ordering priority. Lower numbers mean higher priority (`0` is the highest priority).
132+
133+
`<tuned_profile_name>`:: Specifies a TuneD profile to apply on a match. For example `tuned_profile_1`.
134+
135+
`operand`:: Optional operand configuration.
136+
137+
`debug`:: Turn debugging on or off for the TuneD daemon. Options are `true` for on or `false` for off. The default is `false`.
138+
139+
`tunedConfig.reapply_sysctl`:: Turn `reapply_sysctl` functionality on or off for the TuneD daemon. Options are `true` for on and `false` for off.
140+
--
125141
endif::rosa-hcp-tuning[]
126142
ifdef::rosa-hcp-tuning[]
127143
[source,json]
@@ -134,49 +150,65 @@ ifdef::rosa-hcp-tuning[]
134150
],
135151
"recommend": [
136152
{
137-
"profile": <tuned_profile_name>, <1>
138-
"priority":{ <priority>, <2>
153+
"profile": <tuned_profile_name>,
154+
"priority":{ <priority>,
139155
},
140-
"match": [ <3>
156+
"match": [
141157
{
142-
"label": <label_information> <4>
158+
"label": <label_information>
143159
},
144160
]
145161
},
146162
]
147163
}
148164
----
149-
<1> A TuneD profile to apply on a match. For example `tuned_profile_1`.
150-
<2> Profile ordering priority. Lower numbers mean higher priority (`0` is the highest priority).
151-
<3> If omitted, profile match is assumed unless a profile with a higher priority matches first.
152-
<4> The label for the profile matched items.
165+
+
166+
where:
167+
+
168+
--
169+
`profile`:: Specifies a TuneD profile to apply on a match. For example `tuned_profile_1`.
170+
171+
`priority`:: Specifies profile ordering priority. Lower numbers mean higher priority (`0` is the highest priority).
172+
173+
`match`:: If omitted, profile match is assumed unless a profile with a higher priority matches first.
174+
175+
`label`:: Specifies the label for the profile matched items.
176+
--
153177
endif::[]
154178

155179
`<match>` is an optional list recursively defined as follows:
156180

157181
ifndef::rosa-hcp-tuning[]
158182
[source,yaml]
159183
----
160-
- label: <label_name> <1>
161-
value: <label_value> <2>
162-
type: <label_type> <3>
163-
<match> <4>
184+
- label: <label_name>
185+
value: <label_value>
186+
type: <label_type>
187+
<match>
164188
----
165-
<1> Node or pod label name.
166-
<2> Optional node or pod label value. If omitted, the presence of `<label_name>` is enough to match.
167-
<3> Optional object type (`node` or `pod`). If omitted, `node` is assumed.
168-
<4> An optional `<match>` list.
189+
+
190+
where:
191+
+
192+
--
193+
`label`:: Specifies the node or pod label name.
194+
195+
`value`:: Specifies an optional node or pod label value. If omitted, the presence of `<label_name>` is enough to match.
196+
197+
`type`:: Specifies an optional object type, such as `node` or `pod`. If omitted, `node` is assumed.
198+
199+
`<match>`:: Specifies an optional `<match>` list.
200+
--
169201
endif::rosa-hcp-tuning[]
170202
ifdef::rosa-hcp-tuning[]
171203
[source,yaml]
172204
----
173205
"match": [
174206
{
175-
"label": <1>
207+
"label":
176208
},
177209
]
178210
----
179-
<1> Node or pod label name.
211+
** `label`: Specifies the node or pod label name.
180212
endif::[]
181213

182214
If `<match>` is not omitted, all nested `<match>` sections must also evaluate to `true`. Otherwise, `false` is assumed and the profile with the respective `<match>` section will not be applied or recommended. Therefore, the nesting (child `<match>` sections) works as logical AND operator. Conversely, if any item of the `<match>` list matches, the entire `<match>` list evaluates to `true`. Therefore, the list acts as logical OR operator.
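
The AND/OR semantics and the priority ordering described above can be sketched in Python as follows. This is a minimal illustration, not the Operator's implementation: the function names `match_item`, `match_list`, and `recommend` are hypothetical, only node labels are considered (the `type` field and `machineConfigLabels` are ignored), and the priority of `30` in the sample rules is an assumed value.

```python
from typing import Dict, List, Optional


def match_item(item: dict, labels: Dict[str, str]) -> bool:
    """One <match> item: the label must be present (and equal the value, if
    one is given) AND any nested <match> list must also evaluate to true."""
    name = item["label"]
    if name not in labels:
        return False
    if "value" in item and labels[name] != item["value"]:
        return False
    nested = item.get("match")
    return match_list(nested, labels) if nested else True


def match_list(items: List[dict], labels: Dict[str, str]) -> bool:
    """A <match> list is true if ANY of its items matches (logical OR)."""
    return any(match_item(item, labels) for item in items)


def recommend(rules: List[dict], labels: Dict[str, str]) -> Optional[str]:
    """Return the first matching profile; lower priority numbers are tried first.
    A rule without a match section is assumed to match."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if "match" not in rule or match_list(rule["match"], labels):
            return rule["profile"]
    return None


# Illustrative rules; the priority of 30 is an assumption made for this sketch.
RULES = [
    {"profile": "openshift-control-plane", "priority": 30,
     "match": [{"label": "node-role.kubernetes.io/master"},
               {"label": "node-role.kubernetes.io/infra"}]},
    {"profile": "openshift-node", "priority": 40},
]

if __name__ == "__main__":
    print(recommend(RULES, {"node-role.kubernetes.io/infra": ""}))
    print(recommend(RULES, {"node-role.kubernetes.io/worker": ""}))
```

With these rules, a node that carries either the `master` or the `infra` role label is matched by the first rule through the OR behavior of the list, and any other node falls through to the rule that has no `match` section.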

modules/node-observability-create-custom-resource.adoc

Lines changed: 16 additions & 16 deletions
@@ -6,11 +6,12 @@
66
[id="creating-node-observability-custom-resource_{context}"]
77
= Creating the Node Observability custom resource
88

9-
You must create and run the `NodeObservability` custom resource (CR) before you run the profiling query. When you run the `NodeObservability` CR, it creates the necessary machine config and machine config pool CRs to enable the CRI-O profiling on the worker nodes matching the `nodeSelector`.
9+
[role="_abstract"]
10+
You must create and run the `NodeObservability` custom resource (CR) before you run the profiling query. When you run the `NodeObservability` CR, the CR creates the necessary machine config and machine config pool CRs to enable the CRI-O profiling on the compute nodes matching the `nodeSelector`.
1011

1112
[IMPORTANT]
1213
====
13-
If CRI-O profiling is not enabled on the worker nodes, the `NodeObservabilityMachineConfig` resource gets created. Worker nodes matching the `nodeSelector` specified in `NodeObservability` CR restarts. This might take 10 or more minutes to complete.
14+
If CRI-O profiling is not enabled on the compute nodes, the `NodeObservabilityMachineConfig` resource gets created. Compute nodes matching the `nodeSelector` specified in the `NodeObservability` CR restart. This might take 10 or more minutes to complete.
1415
====
1516

1617
[NOTE]
@@ -21,8 +22,9 @@ Kubelet profiling is enabled by default.
2122
The CRI-O unix socket of the node is mounted on the agent pod, which allows the agent to communicate with CRI-O to run the pprof request. Similarly, the `kubelet-serving-ca` certificate chain is mounted on the agent pod, which allows secure communication between the agent and node's kubelet endpoint.
2223

2324
.Prerequisites
25+
2426
* You have installed the Node Observability Operator.
25-
* You have installed the OpenShift CLI (oc).
27+
* You have installed the {oc-first}.
2628
* You have access to the cluster with `cluster-admin` privileges.
2729
2830
.Procedure
@@ -45,25 +47,25 @@ $ oc project node-observability-operator
4547
+
4648
[source,yaml]
4749
----
48-
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
49-
kind: NodeObservability
50-
metadata:
51-
name: cluster <1>
52-
spec:
53-
nodeSelector:
54-
kubernetes.io/hostname: <node_hostname> <2>
55-
type: crio-kubelet
50+
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
51+
kind: NodeObservability
52+
metadata:
53+
name: cluster
54+
spec:
55+
nodeSelector:
56+
kubernetes.io/hostname: <node_hostname>
57+
type: crio-kubelet
5658
----
57-
<1> You must specify the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
58-
<2> Specify the nodes on which the Node Observability agent must be deployed.
59+
+
60+
** `metadata.name`: Specifies the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
61+
** `spec.nodeSelector.kubernetes.io/hostname`: Specifies the nodes on which the Node Observability agent must be deployed.
5962

6063
. Run the `NodeObservability` CR:
6164
+
6265
[source,terminal]
6366
----
6467
oc apply -f nodeobservability.yaml
6568
----
66-
6769
+
6870
.Example output
6971
[source,terminal]
@@ -77,7 +79,6 @@ nodeobservability.olm.openshift.io/cluster created
7779
----
7880
$ oc get nob/cluster -o yaml | yq '.status.conditions'
7981
----
80-
8182
+
8283
.Example output
8384
[source,terminal]
@@ -91,6 +92,5 @@ conditions:
9192
status: "True"
9293
type: Ready
9394
----
94-
9595
+
9696
The `NodeObservability` CR run is complete when the reason is `Ready` and the status is `True`.
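
The completion check stated above can be expressed as a short Python sketch. It assumes that the `.status.conditions` list has already been loaded into Python dictionaries, for example from `oc get nob/cluster -o json`. The function name `nob_run_completed` is hypothetical, and the sample condition contains only the fields named in the text and example output.

```python
from typing import Dict, List


def nob_run_completed(conditions: List[Dict[str, str]]) -> bool:
    """True when some condition has reason 'Ready' and status 'True'."""
    return any(
        c.get("reason") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


if __name__ == "__main__":
    # Sample condition using only the fields named in the documentation.
    sample = [{"reason": "Ready", "status": "True", "type": "Ready"}]
    print(nob_run_completed(sample))
```

Note that the status is compared as the string `"True"`, because the example output shows it quoted.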

modules/node-observability-high-level-workflow.adoc

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@
66
[id="workflow-node-observability-operator_{context}"]
77
= Workflow of the Node Observability Operator
88

9+
[role="_abstract"]
10+
To systematically query and analyze profiling data, follow the workflow for the Node Observability Operator. By understanding this process, you can collect metrics and troubleshoot performance issues on your compute nodes.
11+
912
The following workflow outlines how to query the profiling data using the Node Observability Operator:
1013

1114
. Install the Node Observability Operator in the {product-title} cluster.
