net, stuntime, std: VM stuntime measurement during live migration #4056
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,170 @@ | ||
| """ | ||
| Linux bridge migration stuntime measurement tests over live migration. | ||
|
|
||
| Tests measure the connectivity gap (stuntime) during VM live migration on Linux bridge | ||
| secondary network, for both IPv4 and IPv6, for regression detection. | ||
| Stuntime is defined as the connectivity gap from last successful reply before loss | ||
| to first successful reply after recovery. | ||
|
|
||
| Stuntime is measured using ICMP ping from client to server at 0.1s intervals, using ping -D so each | ||
| log line includes a timestamp for gap calculation. | ||
| The under-test VMs are configured on a Linux bridge secondary network, with a single interface, | ||
| on which IPv4/IPv6 static addresses will be defined according to the environment the test runs on. | ||
|
|
||
| Client - The connectivity initiator VM that runs continuous ping toward the server VM. | ||
| Server - The connectivity listener VM that receives the ping and responds. | ||
|
|
||
| STP Reference: | ||
| https://github.com/RedHatQE/openshift-virtualization-tests-design-docs/blob/main/stps/sig-network/stuntime_measurement.md | ||
| """ | ||
|
|
||
| import pytest | ||
|
|
||
| __test__ = False | ||
|
|
||
| """ | ||
| Parametrize: | ||
| - ip_family: | ||
| - ipv4 [Markers: ipv4] | ||
| - ipv6 [Markers: ipv6] | ||
|
|
||
| Preconditions: | ||
| - Shared under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Shared under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| initially running on the same node as the server VM. | ||
| """ | ||
|
Comment on lines +25 to +35
Contributor
Do you mean these are only markers, or do you use `ip_family` in the tests? Can't we avoid this param and just test according to the existing cluster network stack?
Contributor
Author
These are not standalone markers; they describe module-wide parametrization by IP family (each scenario will get an `ip_family` argument once we implement). The [Markers: ipv4] / [Markers: ipv6] lines are pytest marks on each parametrized row, as in SOFTWARE_TEST_DESCRIPTION.md (and the Parametrize: + [Markers: ...] wording there). I also ran `--collect-only` with the `-m` expression we use for the IPv6 single-stack lane; it only picked up the IPv6 parametrized cases, so single-stack is not blocked by having both parametrized rows defined. On dual-stack, both rows can be collected when the job expression allows both markers. Why not drop `ip_family` and only branch inside one test from the cluster stack? The families we care about here are fixed (IPv4 vs IPv6); we are not discovering an unknown set of destinations at runtime. Subtests are a better fit when the cases are dynamic (e.g. looping over addresses you only know after you read the VM); this is the opposite. Runtime branching or subtests also make it harder to see per-family pass/fail in reports and what actually ran. `@pytest.mark.parametrize` + `pytest.param(..., marks=...)` keeps separate collected items per family.
Contributor
I do not agree with this reasoning. However, the direction is good IMO because the check should examine one IP family at a time. The recovery of the network traffic depends on several factors, including MAC address learning and address resolution. Therefore, testing two families in the same test may give wrong values for the second checked family.
Contributor
Author
Thanks @EdDev for explaining this rationale. |
||
|
|
||
|
|
||
| @pytest.mark.incremental | ||
| class TestMigrationStuntime: | ||
| @pytest.mark.polarion("CNV-15252") | ||
| def test_client_migrates_off_server_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client | ||
| VM migrates from the node hosting the server VM to a different node. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on the same node as the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to a node different from the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15253") | ||
| def test_client_migrates_between_non_server_nodes(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client VM migrates between nodes | ||
| while the client and server VMs remain on different nodes. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to a node different from the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15254") | ||
| def test_client_migrates_to_server_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client VM migrates | ||
| from a node other than the node hosting the server VM onto the node hosting the server VM. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15255") | ||
| def test_server_migrates_off_client_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server | ||
| VM migrates from the node hosting the client VM to a different node. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on the same node as the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to a node different from the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15256") | ||
| def test_server_migrates_between_non_client_nodes(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server VM migrates between nodes | ||
| while the client and server VMs remain on different nodes. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM (before and after migration). | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to a node different from the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15257") | ||
| def test_server_migrates_to_client_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server VM migrates from a node | ||
| other than the node hosting the client VM onto the node hosting the client VM. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on Linux bridge secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on Linux bridge secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
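
A minimal sketch of how the module-level `Parametrize:` block, and the `pytest.param(..., marks=...)` approach mentioned in the review thread, might look once the stubs are implemented. The list name `IP_FAMILY_PARAMS`, the class name, and the placeholder test are hypothetical; the `ipv4` / `ipv6` marker names are taken from the docstring and are assumed to be registered in the project's pytest configuration.

```python
import pytest

# Hypothetical parameter list: one collected item per IP family, each carrying
# its own marker so a job can select a single family with `-m ipv4` or `-m ipv6`.
IP_FAMILY_PARAMS = [
    pytest.param("ipv4", marks=pytest.mark.ipv4, id="ipv4"),
    pytest.param("ipv6", marks=pytest.mark.ipv6, id="ipv6"),
]


# Parametrizing the class applies `ip_family` to every test method in it.
@pytest.mark.parametrize("ip_family", IP_FAMILY_PARAMS)
class TestMigrationStuntimeSketch:
    def test_ip_family_is_known(self, ip_family):
        # Placeholder body; the real scenarios would measure stuntime here.
        assert ip_family in ("ipv4", "ipv6")
```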
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,170 @@ | ||
| """ | ||
| OVN localnet (OVS bridge) migration stuntime measurement tests over live migration. | ||
|
|
||
| Tests measure the connectivity gap (stuntime) during VM live migration on OVN localnet | ||
| secondary network, for both IPv4 and IPv6, for regression detection. | ||
| Stuntime is defined as the connectivity gap from last successful reply before loss | ||
| to first successful reply after recovery. | ||
|
|
||
| Stuntime is measured using ICMP ping from client to server at 0.1s intervals, using ping -D so each | ||
| log line includes a timestamp for gap calculation. | ||
| The under-test VMs are configured on an OVN localnet secondary network, with a single interface, | ||
| on which IPv4/IPv6 static addresses will be defined according to the environment the test runs on. | ||
|
|
||
| Client - The connectivity initiator VM that runs continuous ping toward the server VM. | ||
| Server - The connectivity listener VM that receives the ping and responds. | ||
|
|
||
| STP Reference: | ||
| https://github.com/RedHatQE/openshift-virtualization-tests-design-docs/blob/main/stps/sig-network/stuntime_measurement.md | ||
| """ | ||
|
|
||
| import pytest | ||
|
|
||
| __test__ = False | ||
|
|
||
| """ | ||
| Parametrize: | ||
| - ip_family: | ||
| - ipv4 [Markers: ipv4] | ||
| - ipv6 [Markers: ipv6] | ||
|
|
||
| Preconditions: | ||
| - Shared under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Shared under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| initially running on the same node as the server VM. | ||
| """ | ||
|
|
||
|
|
||
| @pytest.mark.incremental | ||
| class TestMigrationStuntime: | ||
| @pytest.mark.polarion("CNV-15258") | ||
| def test_client_migrates_off_server_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client | ||
| VM migrates from the node hosting the server VM to a different node. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on the same node as the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to a node different from the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15259") | ||
| def test_client_migrates_between_non_server_nodes(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client VM migrates between nodes | ||
| while the client and server VMs remain on different nodes. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to a node different from the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15260") | ||
| def test_client_migrates_to_server_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the client VM migrates | ||
| from a node other than the node hosting the server VM onto the node hosting the server VM. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the client VM to the node hosting the server VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15261") | ||
| def test_server_migrates_off_client_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server | ||
| VM migrates from the node hosting the client VM to a different node. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on the same node as the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to a node different from the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15262") | ||
| def test_server_migrates_between_non_client_nodes(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server VM migrates between nodes | ||
| while the client and server VMs remain on different nodes. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM (before and after migration). | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to a node different from the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ | ||
|
|
||
| @pytest.mark.polarion("CNV-15263") | ||
| def test_server_migrates_to_client_node(self): | ||
| """ | ||
| Test that measured stuntime does not exceed the global threshold when the server VM migrates from a node | ||
| other than the node hosting the client VM onto the node hosting the client VM. | ||
|
|
||
| Preconditions: | ||
| - Under-test server VM on OVN localnet secondary network, for the IP family from ip_family parametrization. | ||
| - Under-test client VM on OVN localnet secondary network, for that same IP family, | ||
| running on a worker node other than the node hosting the server VM. | ||
| - Ping initiated from the client to the server. | ||
|
|
||
| Steps: | ||
| 1. Initiate live migration of the server VM to the node hosting the client VM | ||
| and wait for migration completion. | ||
| 2. Stop the continuous ping. | ||
| 3. Compute stuntime from the ping results. | ||
|
|
||
| Expected: | ||
| - Measured stuntime does not exceed the global threshold. | ||
| """ |
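
The "Compute stuntime from the ping results" step in the scenarios above could be sketched as follows. This is an illustration only, not the PR's implementation: the function name `compute_stuntime` is hypothetical, and the log format is assumed to be the usual `ping -D` reply line (`[<epoch>] <n> bytes from <addr>: icmp_seq=<n> ...`). Lost packets are assumed to produce no reply line, so the stuntime is taken as the largest gap between consecutive successful replies, matching the definition in the module docstring.

```python
import re

# Assumed shape of a successful reply line printed by `ping -D`, e.g.
# "[1700000000.100000] 64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.3 ms"
REPLY_LINE = re.compile(r"^\[(?P<ts>\d+\.\d+)\]\s+\d+ bytes from .*icmp_seq=\d+")


def compute_stuntime(ping_log: str) -> float:
    """Return the largest gap, in seconds, between consecutive successful replies."""
    timestamps = [
        float(match.group("ts"))
        for line in ping_log.splitlines()
        if (match := REPLY_LINE.match(line.strip()))
    ]
    if len(timestamps) < 2:
        raise ValueError("need at least two successful replies to compute a gap")
    # Last reply before the loss to first reply after recovery is the widest
    # interval between neighbouring reply timestamps.
    return max(later - earlier for earlier, later in zip(timestamps, timestamps[1:]))
```

Because the ping interval is 0.1s, the returned value always includes one normal inter-packet interval, which is worth keeping in mind when choosing the threshold it is compared against.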