Commit 3b05d71

OSDOCS-16874-5: CQA for SCALE-4 Low-Latency/CNF Debugging and Related
1 parent 1fb58ef commit 3b05d71

8 files changed

Lines changed: 62 additions & 49 deletions

modules/node-observability-create-custom-resource.adoc

Lines changed: 16 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -6,11 +6,12 @@
66
[id="creating-node-observability-custom-resource_{context}"]
77
= Creating the Node Observability custom resource
88

9-
You must create and run the `NodeObservability` custom resource (CR) before you run the profiling query. When you run the `NodeObservability` CR, it creates the necessary machine config and machine config pool CRs to enable the CRI-O profiling on the worker nodes matching the `nodeSelector`.
9+
[role="_abstract"]
10+
You must create and run the `NodeObservability` custom resource (CR) before you run the profiling query. When you run the `NodeObservability` CR, the CR creates the necessary machine config and machine config pool CRs to enable the CRI-O profiling on the compute nodes matching the `nodeSelector`.
1011

1112
[IMPORTANT]
1213
====
13-
If CRI-O profiling is not enabled on the worker nodes, the `NodeObservabilityMachineConfig` resource gets created. Worker nodes matching the `nodeSelector` specified in `NodeObservability` CR restarts. This might take 10 or more minutes to complete.
14+
If CRI-O profiling is not enabled on the compute nodes, the `NodeObservabilityMachineConfig` resource gets created. Compute nodes that match the `nodeSelector` specified in the `NodeObservability` CR restart. This might take 10 or more minutes to complete.
1415
====
1516

1617
[NOTE]
@@ -21,8 +22,9 @@ Kubelet profiling is enabled by default.
2122
The CRI-O unix socket of the node is mounted on the agent pod, which allows the agent to communicate with CRI-O to run the pprof request. Similarly, the `kubelet-serving-ca` certificate chain is mounted on the agent pod, which allows secure communication between the agent and node's kubelet endpoint.
2223

2324
.Prerequisites
25+
2426
* You have installed the Node Observability Operator.
25-
* You have installed the OpenShift CLI (oc).
27+
* You have installed the {oc-first}.
2628
* You have access to the cluster with `cluster-admin` privileges.
2729
2830
.Procedure
@@ -45,25 +47,25 @@ $ oc project node-observability-operator
4547
+
4648
[source,yaml]
4749
----
48-
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
49-
kind: NodeObservability
50-
metadata:
51-
name: cluster <1>
52-
spec:
53-
nodeSelector:
54-
kubernetes.io/hostname: <node_hostname> <2>
55-
type: crio-kubelet
50+
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
51+
kind: NodeObservability
52+
metadata:
53+
  name: cluster
54+
spec:
55+
  nodeSelector:
56+
    kubernetes.io/hostname: <node_hostname>
57+
  type: crio-kubelet
5658
----
57-
<1> You must specify the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
58-
<2> Specify the nodes on which the Node Observability agent must be deployed.
59+
+
60+
** `metadata.name`: Specifies the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
61+
** `spec.nodeSelector.kubernetes.io/hostname`: Specifies the nodes on which the Node Observability agent must be deployed.
5962

6063
. Run the `NodeObservability` CR:
6164
+
6265
[source,terminal]
6366
----
6467
$ oc apply -f nodeobservability.yaml
6568
----
66-
6769
+
6870
.Example output
6971
[source,terminal]
@@ -77,7 +79,6 @@ nodeobservability.olm.openshift.io/cluster created
7779
----
7880
$ oc get nob/cluster -o yaml | yq '.status.conditions'
7981
----
80-
8182
+
8283
.Example output
8384
[source,terminal]
@@ -91,6 +92,5 @@ conditions:
9192
status: "True"
9293
type: Ready
9394
----
94-
9595
+
9696
The `NodeObservability` CR run is completed when the reason is `Ready` and the status is `True`.
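
Review note: the completion check that closes this module can be sketched in a few lines. This is a minimal sketch, assuming the `.status.conditions` output of `oc get nob/cluster -o yaml` has already been parsed into a list of dictionaries. The function name `condition_is_true` and the sample data are hypothetical. The prose refers to the reason being `Ready`, while the example output shows a `type: Ready` field, so the sketch checks `type` and `status` only.

```python
from typing import Iterable, Mapping


def condition_is_true(conditions: Iterable[Mapping[str, str]], cond_type: str) -> bool:
    """Return True if a condition of the given type reports status "True"."""
    for cond in conditions:
        # The status is the string "True", as in the example output, not a boolean.
        if cond.get("type") == cond_type and cond.get("status") == "True":
            return True
    return False


# Hypothetical sample shaped like the example output in this module.
sample = [{"type": "Ready", "status": "True"}]
print(condition_is_true(sample, "Ready"))
```

Comparing against the string `"True"` matters here because the status value is quoted in the YAML output.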

modules/node-observability-high-level-workflow.adoc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,9 @@
66
[id="workflow-node-observability-operator_{context}"]
77
= Workflow of the Node Observability Operator
88

9+
[role="_abstract"]
10+
To systematically query and analyze profiling data, follow the workflow for the Node Observability Operator. By understanding this process, you can collect metrics and troubleshoot performance issues on your compute nodes.
11+
912
The following workflow outlines how to query the profiling data using the Node Observability Operator:
1013

1114
. Install the Node Observability Operator in the {product-title} cluster.

modules/node-observability-install-cli.adoc

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -6,11 +6,12 @@
66
[id="install-node-observability-using-cli_{context}"]
77
= Installing the Node Observability Operator using the CLI
88

9-
You can install the Node Observability Operator by using the OpenShift CLI (oc).
9+
[role="_abstract"]
10+
You can install the Node Observability Operator by using the {oc-first}.
1011

1112
.Prerequisites
1213

13-
* You have installed the OpenShift CLI (oc).
14+
* You have installed the {oc-first}.
1415
* You have access to the cluster with `cluster-admin` privileges.
1516
1617
.Procedure
@@ -21,7 +22,6 @@ You can install the Node Observability Operator by using the OpenShift CLI (oc).
2122
----
2223
$ oc get packagemanifests -n openshift-marketplace node-observability-operator
2324
----
24-
2525
+
2626
.Example output
2727
[source,terminal]
@@ -78,7 +78,6 @@ EOF
7878
----
7979
$ oc -n node-observability-operator get sub node-observability-operator -o yaml | yq '.status.installplan.name'
8080
----
81-
8281
+
8382
.Example output
8483
[source,terminal]
@@ -94,7 +93,6 @@ $ oc -n node-observability-operator get ip <install_plan_name> -o yaml | yq '.st
9493
----
9594
+
9695
`<install_plan_name>` is the install plan name that you obtained from the output of the previous command.
97-
9896
+
9997
.Example output
10098
[source,terminal]
@@ -108,7 +106,6 @@ COMPLETE
108106
----
109107
$ oc get deploy -n node-observability-operator
110108
----
111-
112109
+
113110
.Example output
114111
[source,terminal]

modules/node-observability-install-web-console.adoc

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
[id="install-node-observability-using-web-console_{context}"]
77
= Installing the Node Observability Operator using the web console
88

9+
[role="_abstract"]
910
You can install the Node Observability Operator from the {product-title} web console.
1011

1112
.Prerequisites
@@ -16,16 +17,27 @@ You can install the Node Observability Operator from the {product-title} web con
1617
.Procedure
1718

1819
. Log in to the {product-title} web console.
20+
1921
. In the Administrator's navigation panel, select *Ecosystem* -> *Software Catalog*.
22+
2023
. In the *All items* field, enter *Node Observability Operator* and select the *Node Observability Operator* tile.
24+
2125
. Click *Install*.
26+
2227
. On the *Install Operator* page, configure the following settings:
28+
+
2329
.. In the *Update channel* area, click *alpha*.
30+
+
2431
.. In the *Installation mode* area, click *A specific namespace on the cluster*.
32+
+
2533
.. From the *Installed Namespace* list, select *node-observability-operator*.
34+
+
2635
.. In the *Update approval* area, select *Automatic*.
36+
+
2737
.. Click *Install*.
2838

2939
.Verification
40+
3041
. In the Administrator's navigation panel, expand *Ecosystem* -> *Installed Operators*.
42+
3143
. Verify that the Node Observability Operator is listed in the Operators list.

modules/node-observability-installation.adoc

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
:_mod-docs-content-type: CONCEPT
66
[id="install-node-observability-operator_{context}"]
7-
= Installing the Node Observability Operator
7+
= Node Observability Operator installation methods
88

9+
[role="_abstract"]
910
The Node Observability Operator is not installed in {product-title} by default. You can install the Node Observability Operator by using the {product-title} CLI or the web console.

modules/node-observability-run-profiling-query.adoc

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,14 +6,18 @@
66
[id="running-profiling-query_{context}"]
77
= Running the profiling query
88

9-
To run the profiling query, you must create a `NodeObservabilityRun` resource. The profiling query is a blocking operation that fetches CRI-O and Kubelet profiling data for a duration of 30 seconds. After the profiling query is complete, you must retrieve the profiling data inside the container file system `/run/node-observability` directory. The lifetime of data is bound to the agent pod through the `emptyDir` volume, so you can access the profiling data while the agent pod is in the `running` status.
9+
[role="_abstract"]
10+
To run the profiling query, you must create a `NodeObservabilityRun` resource. The profiling query is a blocking operation that fetches CRI-O and Kubelet profiling data for a duration of 30 seconds.
11+
12+
After the profiling query is complete, you must retrieve the profiling data from the `/run/node-observability` directory in the container file system. The lifetime of the data is bound to the agent pod through the `emptyDir` volume, so you can access the profiling data while the agent pod is in the `running` status.
1013

1114
[IMPORTANT]
1215
====
1316
You can request only one profiling query at a time.
1417
====
1518

1619
.Prerequisites
20+
1721
* You have installed the Node Observability Operator.
1822
* You have created the `NodeObservability` custom resource (CR).
1923
* You have access to the cluster with `cluster-admin` privileges.
@@ -31,6 +35,7 @@ metadata:
3135
spec:
3236
  nodeObservabilityRef:
3337
    name: cluster
38+
# ...
3439
----
3540

3641
. Trigger the profiling query by running the `NodeObservabilityRun` resource:
@@ -46,7 +51,6 @@ $ oc apply -f nodeobservabilityrun.yaml
4651
----
4752
$ oc get nodeobservabilityrun nodeobservabilityrun -o yaml | yq '.status.conditions'
4853
----
49-
5054
+
5155
.Example output
5256
[source,terminal]
@@ -63,7 +67,6 @@ conditions:
6367
status: "True"
6468
type: Finished
6569
----
66-
6770
+
6871
The profiling query is complete when the status is `True` and the type is `Finished`.
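
Review note: because the profiling query blocks for about 30 seconds, a caller typically polls the run's conditions until `Finished` is reported. The following is a minimal sketch of such a loop under stated assumptions: `fetch` is a hypothetical callable that returns the parsed `.status.conditions` list (for example, by wrapping the `oc get nodeobservabilityrun` command shown in this module), and the timeout and interval defaults are illustrative values, not documented ones.

```python
import time
from typing import Callable, Iterable, Mapping


def wait_for_condition(
    fetch: Callable[[], Iterable[Mapping[str, str]]],
    cond_type: str,
    timeout_s: float = 120.0,   # illustrative default, not from the docs
    interval_s: float = 5.0,    # illustrative default, not from the docs
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch() until a condition of cond_type has status "True" or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        for cond in fetch():
            if cond.get("type") == cond_type and cond.get("status") == "True":
                return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Hypothetical stand-in for the cluster: not finished twice, then finished.
responses = iter([[], [], [{"type": "Finished", "status": "True"}]])
print(wait_for_condition(lambda: next(responses), "Finished", sleep=lambda _: None))
```

Passing `sleep` as a parameter keeps the loop testable without waiting on a real cluster.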
6972

modules/node-observability-scripting-cr.adoc

Lines changed: 15 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -6,9 +6,11 @@
66
[id="node-observability-scripting-cr_{context}"]
77
= Creating the Node Observability custom resource for scripting
88

9-
You must create and run the `NodeObservability` custom resource (CR) before you run the scripting. When you run the `NodeObservability` CR, it enables the agent in scripting mode on the compute nodes matching the `nodeSelector` label.
9+
[role="_abstract"]
10+
You must create and run the `NodeObservability` custom resource (CR) before you run the scripting. When you run the `NodeObservability` CR, the CR enables the agent in scripting mode on the compute nodes matching the `nodeSelector` label.
1011

1112
.Prerequisites
13+
1214
* You have installed the Node Observability Operator.
1315
* You have installed the {oc-first}.
1416
* You have access to the cluster with `cluster-admin` privileges.
@@ -33,27 +35,26 @@ $ oc project node-observability-operator
3335
+
3436
[source,yaml]
3537
----
36-
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
37-
kind: NodeObservability
38-
metadata:
39-
name: cluster <1>
40-
spec:
41-
nodeSelector:
42-
kubernetes.io/hostname: <node_hostname> <2>
43-
type: scripting <3>
38+
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
39+
kind: NodeObservability
40+
metadata:
41+
  name: cluster
42+
spec:
43+
  nodeSelector:
44+
    kubernetes.io/hostname: <node_hostname>
45+
  type: scripting
4446
----
45-
<1> You must specify the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
46-
<2> Specify the nodes on which the Node Observability agent must be deployed.
47-
<3> To deploy the agent in scripting mode, you must set the type to `scripting`.
48-
47+
+
48+
** `metadata.name`: Specifies the name as `cluster` because there should be only one `NodeObservability` CR per cluster.
49+
** `spec.nodeSelector.kubernetes.io/hostname`: Specifies the nodes on which the Node Observability agent must be deployed.
50+
** `spec.type`: Specifies the type as `scripting` to deploy the agent in scripting mode.
4951

5052
. Create the `NodeObservability` CR by running the following command:
5153
+
5254
[source,terminal]
5355
----
5456
$ oc apply -f nodeobservability.yaml
5557
----
56-
5758
+
5859
.Example output
5960
[source,terminal]
@@ -67,7 +68,6 @@ nodeobservability.olm.openshift.io/cluster created
6768
----
6869
$ oc get nob/cluster -o yaml | yq '.status.conditions'
6970
----
70-
7171
+
7272
.Example output
7373
[source,terminal]
@@ -81,6 +81,5 @@ conditions:
8181
status: "True"
8282
type: Ready
8383
----
84-
8584
+
8685
The `NodeObservability` CR run is completed when the `reason` is `Ready` and `status` is `"True"`.
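
Review note: the profiling and scripting modules use the same `NodeObservability` manifest and differ only in `spec.type`. The sketch below generates that manifest from a hostname, assuming the fields shown in the two YAML examples. The helper name `build_node_observability`, the sample hostname, and the `ALLOWED_TYPES` tuple are hypothetical; the tuple lists only the two values that appear in these modules and might not be exhaustive.

```python
import json

# Only the two values that appear in the modules in this commit.
ALLOWED_TYPES = ("crio-kubelet", "scripting")


def build_node_observability(node_hostname: str, agent_type: str = "crio-kubelet") -> dict:
    """Build the NodeObservability CR as a dictionary."""
    if agent_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {agent_type!r}")
    return {
        "apiVersion": "nodeobservability.olm.openshift.io/v1alpha2",
        "kind": "NodeObservability",
        # The name must be `cluster`: only one CR is expected per cluster.
        "metadata": {"name": "cluster"},
        "spec": {
            "nodeSelector": {"kubernetes.io/hostname": node_hostname},
            "type": agent_type,
        },
    }


# Hypothetical hostname; the JSON output can be saved to a file and applied.
print(json.dumps(build_node_observability("worker-0.example.com", "scripting"), indent=2))
```

The sketch emits JSON rather than YAML to stay within the standard library; `oc apply -f` accepts JSON manifests as well as YAML.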

scalability_and_performance/node-observability-operator.adoc

Lines changed: 5 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -6,10 +6,10 @@ include::_attributes/common-attributes.adoc[]
66

77
toc::[]
88

9-
The Node Observability Operator collects and stores CRI-O and Kubelet profiling or metrics from scripts of compute nodes.
10-
11-
With the Node Observability Operator, you can query the profiling data, enabling analysis of performance trends in CRI-O and Kubelet. It supports debugging performance-related issues and executing embedded scripts for network metrics by using the `run` field in the custom resource definition. To enable CRI-O and Kubelet profiling or scripting, you can configure the `type` field in the custom resource definition.
9+
[role="_abstract"]
10+
To analyze performance trends and debug issues on your compute nodes, use the Node Observability Operator to collect and query CRI-O and Kubelet metrics. By reviewing this profiling data, you can optimize system performance and execute embedded scripts for network analysis.
1211

12+
With the Node Observability Operator, you can query the profiling data, enabling analysis of performance trends in CRI-O and Kubelet. The Operator supports debugging performance-related issues and executing embedded scripts for network metrics by using the `run` field in the custom resource definition. To enable CRI-O and Kubelet profiling or scripting, you can configure the `type` field in the custom resource definition.
1313

1414
:FeatureName: The Node Observability Operator
1515
include::snippets/technology-preview.adoc[leveloffset=+0]
@@ -22,7 +22,6 @@ include::modules/node-observability-install-cli.adoc[leveloffset=+2]
2222

2323
include::modules/node-observability-install-web-console.adoc[leveloffset=+2]
2424

25-
2625
[id="requesting-crio-kubelet-profiling-using-noo_{context}"]
2726
== Requesting CRI-O and Kubelet profiling data using the Node Observability Operator
2827

@@ -32,13 +31,12 @@ include::modules/node-observability-create-custom-resource.adoc[leveloffset=+2]
3231

3332
include::modules/node-observability-run-profiling-query.adoc[leveloffset=+2]
3433

35-
3634
[id="node-observability-operator-scripting_{context}"]
3735
== Node Observability Operator scripting
3836

39-
Scripting allows you to run pre-configured bash scripts, using the current Node Observability Operator and Node Observability Agent.
37+
With scripting, you can run preconfigured Bash scripts by using the current Node Observability Operator and Node Observability Agent.
4038

41-
These scripts monitor key metrics like CPU load, memory pressure, and worker node issues. They also collect sar reports and custom performance metrics.
39+
These scripts monitor key metrics like CPU load, memory pressure, and compute node issues. They also collect sar reports and custom performance metrics.
4240

4341
include::modules/node-observability-scripting-cr.adoc[leveloffset=+2]
4442
