OCPBUGS-80952: perf: latenyc: compute memory resources dynamically#1517

Open
shajmakh wants to merge 1 commit into
openshift:mainfrom
shajmakh:mem-for-latency

Conversation

@shajmakh
Contributor

@shajmakh shajmakh commented May 15, 2026

When the number of CPUs is very high, the pod's fixed memory resources
may be too low to run the latency checks. Add an environment variable to
allow more flexibility while preserving the old behavior for backward
compatibility.
The new behavior is as follows:

  1. If the env var is not set, keep the old default behavior (1Gi).
  2. Otherwise, if it is set to a specific memory value that is a valid
    quantity, use that value in the latency pod; if the value is not a
    valid quantity, fail with an error. If the env var is set to dynamic,
    the test computes the memory as (number of computed CPUs * 32Mi).

32Mi was picked based on input from consumers of the application. If
the memory is still not enough, the user can override the total memory
with an explicit value.
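
The selection rules above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the function name computeLatencyTestMemory, the constant memoryPerCPUMi, and the quantityPattern regular expression are assumptions made for the example. In particular, the regular expression is a deliberately narrow stand-in for Kubernetes quantity parsing, so that the sketch needs only the standard library.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

const (
	defaultTestMemory = "1Gi"     // old fixed value, used when the env var is unset
	dynamicMemory     = "dynamic" // keyword that requests per-CPU scaling
	memoryPerCPUMi    = 32        // hypothetical name; 32Mi per CPU as in the description
)

// quantityPattern is an assumption for this sketch: it accepts only whole
// numbers with an optional binary suffix. Real quantity parsing accepts
// more formats than this.
var quantityPattern = regexp.MustCompile(`^[1-9][0-9]*(Ki|Mi|Gi|Ti)?$`)

// computeLatencyTestMemory (hypothetical name) maps the env var value and
// the CPU count to the memory string for the latency pod.
func computeLatencyTestMemory(envValue string, cpus int) (string, error) {
	switch {
	case envValue == "":
		// Rule 1: nothing set, keep the old default.
		return defaultTestMemory, nil
	case envValue == dynamicMemory:
		// Rule 2, dynamic mode: scale with the number of CPUs.
		if cpus <= 0 {
			return "", fmt.Errorf("cannot compute dynamic memory for %d CPUs", cpus)
		}
		return fmt.Sprintf("%dMi", cpus*memoryPerCPUMi), nil
	case quantityPattern.MatchString(envValue):
		// Rule 2, explicit mode: a valid quantity is used as given.
		return envValue, nil
	default:
		return "", fmt.Errorf("invalid LATENCY_TEST_MEMORY value %q", envValue)
	}
}

func main() {
	// 64 CPUs is an arbitrary example value.
	mem, err := computeLatencyTestMemory(os.Getenv("LATENCY_TEST_MEMORY"), 64)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("memory limit:", mem)
}
```

With a factor of 32Mi, the dynamic result equals the 1Gi default at exactly 32 CPUs (32 * 32Mi = 1024Mi) and is smaller below that count. The sketch does not model any fallback to the default for small CPU counts.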

Summary by CodeRabbit

  • New Features

    • Added configurable memory settings for latency tests via the LATENCY_TEST_MEMORY environment variable, enabling fine-tuned resource allocation.
    • Supports dynamic memory calculation mode that scales memory based on CPU configuration.
  • Documentation

    • Updated configuration documentation to include the new memory environment variable.

@coderabbitai

coderabbitai Bot commented May 15, 2026

Walkthrough

This PR adds configurable memory allocation for latency test pods via the LATENCY_TEST_MEMORY environment variable. It replaces the hardcoded 1Gi memory limit with logic that supports dynamic computation (based on CPU count) or custom values, while maintaining a sensible default.

Changes

Latency Test Memory Configuration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Memory configuration constants and initialization | test/e2e/performanceprofile/functests/4_latency/latency.go | Defines defaultTestMemory (1Gi) and dynamicMemory mode constants; initializes latencyTestMemory variable and updates environment-variable documentation to include LATENCY_TEST_MEMORY. |
| Memory computation and validation logic | test/e2e/performanceprofile/functests/4_latency/latency.go | Implements getLatencyTestMemory(cpus int) to parse LATENCY_TEST_MEMORY, compute dynamic memory as 16Mi * cpus (with default fallback when CPU count matches default), and validate explicit Kubernetes resource quantities via resource.ParseQuantity. |
| Pod creation integration | test/e2e/performanceprofile/functests/4_latency/latency.go | getLatencyTestPod calls getLatencyTestMemory with the selected CPU count and applies the computed value to the latency container's ResourceMemory limit instead of the previous hardcoded "1Gi". |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Line 355 assertion lacks a meaningful failure message. The custom check requires assertions to include diagnostic messages, but Expect(err).ToNot(HaveOccurred()) provides no context. | Add a descriptive message to line 355: Expect(err).ToNot(HaveOccurred(), "failed to compute latency test memory") |
| Ipv6 And Disconnected Network Test Compatibility | ⚠️ Warning | Three new Ginkgo e2e tests require pulling images from quay.io/openshift-kni by default, causing external connectivity failures in disconnected IPv6 environments. | Set IMAGE_REGISTRY to internal registry in disconnected environments, add [Skipped:Disconnected] tag, or verify image mirroring is configured. |
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test names in the modified file use stable, static strings. No dynamic content (generated IDs, timestamps, node/pod names, IP addresses) found in Describe, Context, It, or By declarations. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. The changes only modify helper functions and memory resource computation for existing tests that were already present. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests (It, Describe, Context, When blocks) were added. This PR only modifies helper functions getLatencyTestMemory() and getLatencyTestPod(). The check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Test pod memory configuration changes; no topology-breaking scheduling constraints added. Node selection handles HyperShift and SNO properly. |
| Ote Binary Stdout Contract | ✅ Passed | No violations of OTE Binary Stdout Contract detected. All changes are helper functions and constants called/used only within test blocks. No process-level code writes to stdout. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding dynamic memory resource computation for latency tests, which is the primary objective of the PR. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@openshift-ci openshift-ci Bot requested review from MarSik and yanirq May 15, 2026 14:16
@openshift-ci
Contributor

openshift-ci Bot commented May 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: shajmakh
Once this PR has been reviewed and has the lgtm label, please assign marsik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/performanceprofile/functests/4_latency/latency.go`:
- Around line 287-289: The check in getLatencyTestMemory that returns
defaultTestMemory when cpus == defaultTestCpus is unreachable given the current
call-site logic (latencyTestCpus is normalized before calling), so either remove
that branch to simplify getLatencyTestMemory (delete the if cpus ==
defaultTestCpus { return defaultTestMemory, nil } case) or keep it but add a
short comment on the cpus parameter explaining this is defensive for future
callers (mentioning defaultTestCpus and why it might still be passed) so readers
know the branch is intentional; locate getLatencyTestMemory and update
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c80f1a1c-34b6-48e4-95b1-f963938d2729

📥 Commits

Reviewing files that changed from the base of the PR and between 54a9ef7 and c9670ea.

📒 Files selected for processing (1)
  • test/e2e/performanceprofile/functests/4_latency/latency.go

Comment thread test/e2e/performanceprofile/functests/4_latency/latency.go
@shajmakh shajmakh changed the title perf: latenyc: compute memory resources dynamically OCPBUGS-80952: perf: latenyc: compute memory resources dynamically May 15, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 15, 2026
@openshift-ci-robot
Contributor

@shajmakh: This pull request references Jira Issue OCPBUGS-80952, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

When the number of CPUs is very high, the pod's fixed memory resources may be too low to run the latency checks. Add an environment variable to allow more flexibility while preserving the old behavior for backward compatibility.
The new behavior is as follows:

  1. If the env var is not set, keep the old default behavior (1Gi).
  2. Otherwise, if it is set to a specific memory value that is a valid quantity, use that value in the latency pod; if the value is not a valid quantity, fail with an error. If the env var is set to dynamic, the test computes the memory as (number of computed CPUs * 16Mi).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Comment thread test/e2e/performanceprofile/functests/4_latency/latency.go
When the number of CPUs is very high, the pod's fixed memory resources
may be too low to run the latency checks. Add an environment variable to
allow more flexibility while preserving the old behavior for backward
compatibility.
The new behavior is as follows:
1. If the env var is not set, keep the old default behavior (1Gi).
2. Otherwise, if it is set to a specific memory value that is a valid
   quantity, use that value in the latency pod; if the value is not a
   valid quantity, fail with an error. If the env var is set to
   `dynamic`, the test computes the memory as
   (number of computed CPUs * 32Mi).

`32Mi` was picked based on input from consumers of the application. If
the memory is still not enough, the user can override the total memory
with an explicit value.

Signed-off-by: Shereen Haj <shajmakh@redhat.com>
@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 15, 2026
@openshift-ci-robot
Contributor

@shajmakh: This pull request references Jira Issue OCPBUGS-80952, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

@yanirq
Contributor

yanirq commented May 16, 2026

/retest

@yanirq
Contributor

yanirq commented May 17, 2026

/retest-required

@openshift-ci
Contributor

openshift-ci Bot commented May 17, 2026

@shajmakh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn | 3c4fe62 | link | true | /test e2e-aws-ovn |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@yanirq
Contributor

yanirq commented May 18, 2026

/retest-required

Labels

jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

4 participants