[role="_abstract"]
You must create and run the `NodeObservability` custom resource (CR) before you run the profiling query. When you run the `NodeObservability` CR, the CR creates the necessary machine config and machine config pool CRs to enable the CRI-O profiling on the compute nodes matching the `nodeSelector`.
[IMPORTANT]
====
If CRI-O profiling is not enabled on the compute nodes, the `NodeObservabilityMachineConfig` resource gets created. Compute nodes matching the `nodeSelector` specified in the `NodeObservability` CR restart. This might take 10 or more minutes to complete.
====
[NOTE]
====
Kubelet profiling is enabled by default.
====
The CRI-O unix socket of the node is mounted on the agent pod, which allows the agent to communicate with CRI-O to run the pprof request. Similarly, the `kubelet-serving-ca` certificate chain is mounted on the agent pod, which allows secure communication between the agent and node's kubelet endpoint.
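Conceptually, the mounts described above correspond to pod volumes such as the following sketch. The socket path and volume names are assumptions for illustration; the Operator manages the actual agent pod specification.

[source,yaml]
----
# Conceptual sketch only: the Operator creates the real agent pod spec.
volumes:
- name: crio-socket
  hostPath:
    path: /var/run/crio/crio.sock # CRI-O unix socket on the node (assumed path)
- name: kubelet-serving-ca
  configMap:
    name: kubelet-serving-ca # CA chain for secure access to the kubelet endpoint
----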
.Prerequisites
* You have installed the Node Observability Operator.
* You have installed the {oc-first}.
* You have access to the cluster with `cluster-admin` privileges.
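A `NodeObservability` CR that enables CRI-O and Kubelet profiling on compute nodes might look like the following sketch. The `apiVersion`, the CR name, and the `type` value are assumptions for illustration; verify the fields against the CRD installed by the Operator.

[source,yaml]
----
# Illustrative sketch only: verify apiVersion and fields against the installed CRD.
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
kind: NodeObservability
metadata:
  name: cluster # assumed name
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: "" # matches the compute nodes
  type: crio-kubelet # assumed value that enables CRI-O and Kubelet profiling
----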
To systematically query and analyze profiling data, follow the workflow for the Node Observability Operator. By understanding this process, you can collect metrics and troubleshoot performance issues on your compute nodes.
The following workflow outlines how to query the profiling data by using the Node Observability Operator:
. Install the Node Observability Operator in the {product-title} cluster.
The Node Observability Operator is not installed in {product-title} by default. You can install the Node Observability Operator by using the {product-title} CLI or the web console.
[id="running-profiling-query_{context}"]
= Running the profiling query
[role="_abstract"]
To run the profiling query, you must create a `NodeObservabilityRun` resource. The profiling query is a blocking operation that fetches CRI-O and Kubelet profiling data for a duration of 30 seconds.

After the profiling query is complete, you must retrieve the profiling data from the `/run/node-observability` directory in the container file system. The lifetime of the data is bound to the agent pod through the `emptyDir` volume, so you can access the profiling data while the agent pod is in the `running` status.
[IMPORTANT]
====
You can request only one profiling query at a time.
====
.Prerequisites
* You have installed the Node Observability Operator.
* You have created the `NodeObservability` custom resource (CR).
* You have access to the cluster with `cluster-admin` privileges.
spec:
  nodeObservabilityRef:
    name: cluster
# ...
----
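Assembled for reference, a complete `NodeObservabilityRun` manifest could look like the following sketch. The `apiVersion` and the metadata name are assumptions, because only the `spec` fragment appears above; verify them against the installed CRD.

[source,yaml]
----
# Illustrative sketch: apiVersion and metadata name are assumptions.
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
kind: NodeObservabilityRun
metadata:
  name: nodeobservabilityrun-sample # hypothetical name
spec:
  nodeObservabilityRef:
    name: cluster # must match the name of an existing NodeObservability CR
----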
. Trigger the profiling query by running the `NodeObservabilityRun` resource:
[id="node-observability-scripting-cr_{context}"]
= Creating the Node Observability custom resource for scripting
[role="_abstract"]
You must create and run the `NodeObservability` custom resource (CR) before you run the scripting. When you run the `NodeObservability` CR, the CR enables the agent in scripting mode on the compute nodes matching the `nodeSelector` label.
.Prerequisites
* You have installed the Node Observability Operator.
* You have installed the {oc-first}.
* You have access to the cluster with `cluster-admin` privileges.
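A `NodeObservability` CR that enables the agent in scripting mode might look like the following sketch. The `apiVersion` and the `type` value are assumptions for illustration; verify the fields against the CRD installed by the Operator.

[source,yaml]
----
# Illustrative sketch only: verify apiVersion and fields against the installed CRD.
apiVersion: nodeobservability.olm.openshift.io/v1alpha2
kind: NodeObservability
metadata:
  name: cluster # assumed name
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: "" # label that the target compute nodes must match
  type: scripting # assumed value that enables scripting mode
----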
To execute embedded scripts for network analysis, configure Node Observability Operator scripting. By setting up these custom scripts, you can debug performance-related issues on your compute nodes.
.Prerequisites
* You have installed the Node Observability Operator.
[role="_abstract"]
To understand how the Topology Manager allocates hardware resources, review the example Pod specifications. By analyzing these interactions, you can properly configure your workloads for optimal alignment and performance.
The following pod configuration example runs in the `BestEffort` QoS class because no resource requests or limits are specified:
[source,yaml]
----
# ...
spec:
  containers:
  - image: nginx
----
The following pod configuration example runs in the `Burstable` QoS class because requests are less than limits:
[source,yaml]
----
# ...
----
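A minimal illustrative `Burstable` specification, in which the memory request is lower than the limit, could look like the following sketch. The container name and the resource values are examples only, not taken from the original document.

[source,yaml]
----
# Example values only: requests strictly below limits yield the Burstable QoS class.
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"
----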
If the selected policy is anything other than `none`, the Topology Manager processes all pods, but it enforces resource alignment only for pods in the `Guaranteed` QoS class.
When the Topology Manager policy is set to `none`, the relevant containers are pinned to any available CPU without considering NUMA affinity. This is the default behavior and it does not optimize for performance-sensitive workloads.
Other values enable the use of topology-awareness information from device plugins for core resources, such as CPU and memory. When the policy is set to a value other than `none`, the Topology Manager attempts to align the CPU, memory, and device allocations according to the topology of the node. For more information about the available values, see _Topology Manager policies_.
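In {product-title}, the Topology Manager policy is typically set through a `KubeletConfig` CR that targets a machine config pool. The following sketch assumes the `single-numa-node` policy; the CR name and the pool selector label are illustrative.

[source,yaml]
----
# Illustrative sketch: the CR name and the pool selector label are assumptions.
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: topology-manager-enabled
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: "" # assumed pool label
  kubeletConfig:
    cpuManagerPolicy: static # alignment requires a non-default CPU Manager policy
    topologyManagerPolicy: single-numa-node
----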
The following example pod configuration runs in the `Guaranteed` QoS class because requests are equal to limits:
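A minimal illustrative `Guaranteed` specification, in which requests equal limits for every resource, could look like the following sketch. The container name and the resource values are examples only, not taken from the original document.

[source,yaml]
----
# Example values only: requests equal to limits for every resource yield the Guaranteed QoS class.
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "2"
      requests:
        memory: "200Mi"
        cpu: "2"
----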