Conversation

@qiliRedHat (Contributor)

@openshift-ci openshift-ci bot requested review from liqcui and svetsa-rh November 10, 2025 08:08
openshift-ci bot commented Nov 10, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qiliRedHat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2025
@qiliRedHat (Contributor, Author)

Test Result

# ./run_perf_test.sh 
--- Monitoring persistence is already configured ---
NAME          STATUS   AGE
dittybopper   Active   174m
--- Dittybopper is already installed ---
Prepare test...
Applying test pod to find the least usage node...
pod/test-pod created
pod/test-pod condition met
Cleaning up test pod...
pod "test-pod" deleted
Perf test will be run on node: ip-10-0-16-222.us-east-2.compute.internal
Label the perf test node ip-10-0-16-222.us-east-2.compute.internal with node-role.kubernetes.io/perf...
node/ip-10-0-16-222.us-east-2.compute.internal labeled
Generating the stress pod yaml...
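
As context for the node-selection step above: the script finds the least-used node by scheduling a throwaway test pod and reading where the scheduler placed it. An alternative sketch that picks the lowest-CPU worker directly from `kubectl top nodes` output (a hypothetical helper, not the script's actual logic):

```python
def least_cpu_node(top_output: str) -> str:
    """Return the node name with the lowest CPU usage (millicores)
    from `kubectl top nodes` output."""
    best_name, best_cpu = None, float("inf")
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        name, cpu, *_ = line.split()
        millicores = int(cpu.rstrip("m"))
        if millicores < best_cpu:
            best_name, best_cpu = name, millicores
    return best_name

sample = """NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ip-10-0-16-222.us-east-2.compute.internal 73m 2% 1842Mi 12%
ip-10-0-51-255.us-east-2.compute.internal 120m 3% 2035Mi 13%"""
print(least_cpu_node(sample))  # ip-10-0-16-222.us-east-2.compute.internal
```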
--- Starting Test: Feature enabled = false, Stress Type = cpu ---
Collecting idle proxy metrics...
Applying cpu stress workload...
pod/cpu-stress-pod created
pod/cpu-stress-pod condition met
cpu stress workload is ready
Fri Nov  7 12:39:47 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 12:42:47 PM UTC 2025
pod "cpu-stress-pod" deleted
--- Finished Test: baseline_feature_disabled_cpu_stress ---
--- Sleep 180s to let the previous stress to cool down ---
--- Starting Test: Feature enabled = false, Stress Type = memory ---
Collecting idle proxy metrics...
Applying memory stress workload...
pod/memory-stress-pod created
pod/memory-stress-pod condition met
memory stress workload is ready
Fri Nov  7 12:45:51 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 12:48:52 PM UTC 2025
pod "memory-stress-pod" deleted
--- Finished Test: baseline_feature_disabled_memory_stress ---
--- Sleep 180s to let the previous stress to cool down ---
--- Starting Test: Feature enabled = false, Stress Type = io ---
Collecting idle proxy metrics...
Applying io stress workload...
pod/io-stress-pod created
pod/io-stress-pod condition met
io stress workload is ready
Fri Nov  7 12:52:24 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 12:55:25 PM UTC 2025
pod "io-stress-pod" deleted
--- Finished Test: baseline_feature_disabled_io_stress ---
--- Sleep 180s to let the previous stress to cool down ---
==========================================
  OpenShift PSI Enablement Script
==========================================

[INFO] Checking prerequisites...
[SUCCESS] Prerequisites check passed

[INFO] Step 1: Checking current PSI status on worker nodes...
[INFO] Checking PSI status on all worker nodes...
[WARNING] PSI is NOT enabled on ip-10-0-16-222.us-east-2.compute.internal
[WARNING] PSI is NOT enabled on ip-10-0-51-255.us-east-2.compute.internal
[WARNING] PSI is NOT enabled on ip-10-0-84-37.us-east-2.compute.internal
[INFO] PSI is not fully enabled. Proceeding with enablement...

[INFO] Step 2: Creating MachineConfig YAML...
[INFO] Creating MachineConfig YAML: 99-worker-enable-psi.yaml
[SUCCESS] MachineConfig YAML created

[INFO] Step 3: Applying MachineConfig to cluster...
[INFO] Applying MachineConfig to cluster...
machineconfig.machineconfiguration.openshift.io/99-worker-enable-psi created
[SUCCESS] MachineConfig applied
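
The generated `99-worker-enable-psi.yaml` is not shown in the log; a MachineConfig that enables PSI typically just adds the `psi=1` kernel argument for the worker pool (a sketch with assumed field values, not the PR's exact file):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-enable-psi
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  kernelArguments:
    - psi=1
```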

[INFO] Current worker node status:
NAME                                        STATUS   ROLES         AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                KERNEL-VERSION                 CONTAINER-RUNTIME
ip-10-0-16-222.us-east-2.compute.internal   Ready    perf,worker   9h    v1.34.1   10.0.16.222   <none>        Red Hat Enterprise Linux CoreOS 9.6.20251105-0 (Plow)   5.14.0-570.62.1.el9_6.x86_64   cri-o://1.34.1-4.rhaos4.21.git5780ac7.el9
ip-10-0-51-255.us-east-2.compute.internal   Ready    worker        9h    v1.34.1   10.0.51.255   <none>        Red Hat Enterprise Linux CoreOS 9.6.20251105-0 (Plow)   5.14.0-570.62.1.el9_6.x86_64   cri-o://1.34.1-4.rhaos4.21.git5780ac7.el9
ip-10-0-84-37.us-east-2.compute.internal    Ready    worker        9h    v1.34.1   10.0.84.37    <none>        Red Hat Enterprise Linux CoreOS 9.6.20251105-0 (Plow)   5.14.0-570.62.1.el9_6.x86_64   cri-o://1.34.1-4.rhaos4.21.git5780ac7.el9

--- Wait 120 seconds for mcp to start updating ---
[INFO] Step 4: Waiting for all worker nodes to be updated...
[INFO] Waiting for MachineConfigPool 'worker' to update...
[INFO] This may take 30-60 minutes. Workers will reboot one by one.
[INFO] MCP Status: Updated=0/3, Ready=0/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 0s)
[INFO] MCP Status: Updated=1/3, Ready=1/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 30s)
[INFO] MCP Status: Updated=1/3, Ready=1/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 60s)
[INFO] MCP Status: Updated=1/3, Ready=1/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 90s)
[INFO] MCP Status: Updated=1/3, Ready=1/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 120s)
[INFO] MCP Status: Updated=2/3, Ready=2/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 150s)
[INFO] MCP Status: Updated=2/3, Ready=2/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 180s)
[INFO] MCP Status: Updated=2/3, Ready=2/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 210s)
[INFO] MCP Status: Updated=2/3, Ready=2/3, Degraded=0
[INFO] Waiting 30 seconds before next check... (elapsed: 240s)
[INFO] MCP Status: Updated=3/3, Ready=3/3, Degraded=0
[SUCCESS] All worker nodes have been updated!
[SUCCESS] Worker node update completed successfully!

--- Wait 60 seconds for PSI to be ready on nodes---
[INFO] Step 5: Verifying PSI is enabled on all worker nodes...
[INFO] Checking PSI status on all worker nodes...
[SUCCESS] PSI is enabled on ip-10-0-16-222.us-east-2.compute.internal
[SUCCESS] PSI is enabled on ip-10-0-51-255.us-east-2.compute.internal
[SUCCESS] PSI is enabled on ip-10-0-84-37.us-east-2.compute.internal
[SUCCESS] ✅ PSI has been successfully enabled on all worker nodes!

==========================================
[SUCCESS] PSI Enablement Complete!
==========================================
[INFO] You can verify PSI on any worker node with:
  oc debug node/<node-name> -- chroot /host cat /proc/pressure/cpu

[INFO] MachineConfig created: 99-worker-enable-psi.yaml
[INFO] To remove this configuration later, run:
  oc delete machineconfig 99-worker-enable-psi
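
For reference, the `/proc/pressure/*` files read by the verification command above use the kernel's standard PSI line format (`some avg10=... avg60=... avg300=... total=...`). A minimal parser sketch for that format:

```python
def parse_psi(text: str) -> dict:
    """Parse /proc/pressure/{cpu,memory,io} content into nested dicts,
    keyed by 'some'/'full' and then by field name."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()  # kind is "some" or "full"
        result[kind] = {key: float(val)
                        for key, val in (f.split("=") for f in fields)}
    return result

sample = "some avg10=2.04 avg60=0.75 avg300=0.40 total=157656722"
print(parse_psi(sample)["some"]["avg10"])  # 2.04
```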

--- Sleep 10m to let the cluster to become stable after nodes reboot after enabling PSI ---
--- Starting Test: Feature enabled = true, Stress Type = cpu ---
Collecting idle proxy metrics...
Applying cpu stress workload...
pod/cpu-stress-pod created
pod/cpu-stress-pod condition met
cpu stress workload is ready
Fri Nov  7 01:16:50 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 01:19:51 PM UTC 2025
pod "cpu-stress-pod" deleted
--- Finished Test: test_feature_enabled_cpu_stress ---
--- Sleep 180s to let the previous stress to cool down ---
--- Starting Test: Feature enabled = true, Stress Type = memory ---
Collecting idle proxy metrics...
Applying memory stress workload...
pod/memory-stress-pod created
pod/memory-stress-pod condition met
memory stress workload is ready
Fri Nov  7 01:22:54 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 01:25:55 PM UTC 2025
pod "memory-stress-pod" deleted
--- Finished Test: test_feature_enabled_memory_stress ---
--- Sleep 180s to let the previous stress to cool down ---
--- Starting Test: Feature enabled = true, Stress Type = io ---
Collecting idle proxy metrics...
Applying io stress workload...
pod/io-stress-pod created
pod/io-stress-pod condition met
io stress workload is ready
Fri Nov  7 01:29:28 PM UTC 2025
Collecting proxy metrics under load after 180 seconds...
Cleaning up workload...
Fri Nov  7 01:32:28 PM UTC 2025
pod "io-stress-pod" deleted
--- Finished Test: test_feature_enabled_io_stress ---
--- Sleep 180s to let the previous stress to cool down ---
node/ip-10-0-16-222.us-east-2.compute.internal unlabeled
--- All tests completed. Logs are in ./perf_logs_20251107_123941 ---
--- Analyzing results... ---
--- Performance Analysis Summary by kubectl top node and /proxy/stats/summary---

### Stress Type: CPU

| Metric                 | Condition | Result (Baseline -> Test) |
|------------------------|-----------|---------------------------|
| **Node CPU (m)**       | Idle      | 73.00 -> 78.00 (+5.00 / +6.85%) |
| **Node CPU (m)**       | Load      | 1596.00 -> 1575.00 (-21.00 / -1.32%) |
| **Node Memory (MiB)**  | Idle      | 1842.00 -> 1920.00 (+78.00 / +4.23%) |
| **Node Memory (MiB)**  | Load      | 1975.00 -> 1996.00 (+21.00 / +1.06%) |
| **Kubelet CPU (m)**    | Idle      | 34.79 -> 35.66 (+0.88 / +2.52%) |
| **Kubelet CPU (m)**    | Load      | 40.01 -> 35.60 (-4.41 / -11.02%) |
| **Kubelet Memory (MiB)** | Idle      | 155.55 -> 152.24 (-3.31 / -2.13%) |
| **Kubelet Memory (MiB)** | Load      | 155.17 -> 156.57 (+1.41 / +0.91%) |

### Stress Type: IO

| Metric                 | Condition | Result (Baseline -> Test) |
|------------------------|-----------|---------------------------|
| **Node CPU (m)**       | Idle      | 79.00 -> 79.00 (+0.00 / +0.00%) |
| **Node CPU (m)**       | Load      | 172.00 -> 170.00 (-2.00 / -1.16%) |
| **Node Memory (MiB)**  | Idle      | 2035.00 -> 2020.00 (-15.00 / -0.74%) |
| **Node Memory (MiB)**  | Load      | 2029.00 -> 2014.00 (-15.00 / -0.74%) |
| **Kubelet CPU (m)**    | Idle      | 32.53 -> 41.28 (+8.75 / +26.91%) |
| **Kubelet CPU (m)**    | Load      | 39.70 -> 31.56 (-8.15 / -20.52%) |
| **Kubelet Memory (MiB)** | Idle      | 160.11 -> 164.11 (+4.00 / +2.50%) |
| **Kubelet Memory (MiB)** | Load      | 157.31 -> 161.72 (+4.41 / +2.80%) |

### Stress Type: MEMORY

| Metric                 | Condition | Result (Baseline -> Test) |
|------------------------|-----------|---------------------------|
| **Node CPU (m)**       | Idle      | 75.00 -> 77.00 (+2.00 / +2.67%) |
| **Node CPU (m)**       | Load      | 87.00 -> 82.00 (-5.00 / -5.75%) |
| **Node Memory (MiB)**  | Idle      | 1987.00 -> 1995.00 (+8.00 / +0.40%) |
| **Node Memory (MiB)**  | Load      | 10203.00 -> 10208.00 (+5.00 / +0.05%) |
| **Kubelet CPU (m)**    | Idle      | 32.69 -> 38.06 (+5.37 / +16.42%) |
| **Kubelet CPU (m)**    | Load      | 30.00 -> 34.81 (+4.82 / +16.06%) |
| **Kubelet Memory (MiB)** | Idle      | 153.03 -> 161.22 (+8.20 / +5.36%) |
| **Kubelet Memory (MiB)** | Load      | 159.75 -> 164.29 (+4.54 / +2.84%) |
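
The `(+delta / +pct%)` columns in the tables above follow a simple formula; a sketch of the formatting, assumed to mirror the analysis script's arithmetic:

```python
def fmt_delta(baseline: float, test: float) -> str:
    """Format 'baseline -> test (+delta / +pct%)' as in the summary tables."""
    delta = test - baseline
    pct = (delta / baseline * 100.0) if baseline else 0.0
    return f"{baseline:.2f} -> {test:.2f} ({delta:+.2f} / {pct:+.2f}%)"

print(fmt_delta(73.0, 78.0))  # 73.00 -> 78.00 (+5.00 / +6.85%)
```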

openshift-ci bot commented Nov 10, 2025

@qiliRedHat: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@qiliRedHat (Contributor, Author)

@ngopalak-redhat PTAL
