Fix hang in cloudstack-sysvmadm script#12355
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## 4.20 #12355 +/- ##
============================================
- Coverage 16.23% 16.23% -0.01%
Complexity 13378 13378
============================================
Files 5657 5657
Lines 498866 498875 +9
Branches 60545 60546 +1
============================================
- Hits 81011 81008 -3
- Misses 408821 408831 +10
- Partials 9034 9036 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@blueorangutan package |
|
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16231 |
DaanHoogland
left a comment
There was a problem hiding this comment.
clgtm (though I think some de-duplication can be done)
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical hang issue in the cloudstack-sysvmadm script where background process management was broken due to incorrect bash syntax. The fix corrects array variable assignments and simplifies the parallel job execution logic.
- Fixed incorrect bash array syntax that was causing variables to be arrays instead of scalars, leading to infinite loops
- Replaced complex and buggy manual process-checking logic with simple
waitcommands - Improved efficiency by moving thread limit adjustments outside loops
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
LGTM
Results Summary
| Test | Name | Threads | Time | Status |
|---|---|---|---|---|
| 1 | Minimum sequential | -t 1 |
18m 22s | PASS |
| 2 | Non-divisible | -t 7 |
5m 48s | PASS |
| 3 | Default (original bug) | -t 10 |
4m 38s | PASS |
| 4 | Boundary (count - 1) | -t 14 |
5m 23s | PASS |
| 5 | Boundary (exact match) | -t 15 |
4m 02s | PASS |
| 6 | Boundary (exact match repeat) | -t 15 |
4m 12s | PASS |
| 7 | Boundary (count + 1) | -t 16 |
4m 02s | PASS |
| 8 | Much larger | -t 20 |
4m 13s | PASS |
Key Observations
1. Bug is fixed: All tests completed without hanging, including the original bug scenario (Test 3: 15 routers with -t 10).
2. Performance scales as expected:
| Threads | Time | Batches | Analysis |
|---|---|---|---|
| 1 | 18m 22s | 15 | ~73 sec/router sequentially |
| 7 | 5m 48s | 3 | (7+7+1) |
| 10 | 4m 38s | 2 | (10+5) |
| 14 | 5m 23s | 2 | (14+1) - second batch waits for just 1 |
| 15+ | ~4m | 1 | All parallel - fastest |
3. Optimal performance: -t 15 or higher gives best results (~4 min) since all routers restart in parallel.
4. Interesting observation: -t 14 (5m 23s) is slower than -t 10 (4m 38s) because the second batch only has 1 router but still waits for full cycle.
Detailed Test Restuls:
Test 1: Minimum sequential - one VR at a time
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 1
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 18m22.504s
user 0m2.444s
sys 0m3.900s
Test 2: t=7 - Non-divisible
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p <password> -r -t 7
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 5m48.742s
user 0m3.155s
sys 0m5.210s
Test 3: Default/original bug (t=10) - 15 not divisible by 10
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p <password> -r
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 4m38.045s
user 0m3.518s
sys 0m5.757s
Test 4: Boundary (count - 1) / t=14
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 14
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 5m23.235s
user 0m4.968s
sys 0m8.096s
Test 5 Boundary (exact match) t=15 - exactly equals router count
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 15
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 4m2.676s
user 0m5.016s
sys 0m8.023s
Test 6: Boundary (exact match) t=15
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 15
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 4m12.920s
user 0m5.087s
sys 0m8.252s
Test 7: Boundary (count + 1) t=16
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 16
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 4m2.942s
user 0m5.023s
sys 0m8.066s
Test 8: Much larger t=20 - more threads than routers
[root@ref-trl-10580-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# time cloudstack-sysvmadm -d localhost -u root -p P@ssword123 -r -t 20
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
Stopping and starting 15 running routing vm(s)...
Done restarting router(s).
real 4m13.027s
user 0m5.047s
sys 0m8.166s
Description
This PR fixes #12067
(see #12067 (comment) for description)
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?