Skip to content

Add validation for max_ongoing_requests in autoscaling_config#61529

Open
Ziy1-Tan wants to merge 1 commit intoray-project:masterfrom
Ziy1-Tan:fix-max-ongoing-requests-validation
Open

Add validation for max_ongoing_requests in autoscaling_config#61529
Ziy1-Tan wants to merge 1 commit intoray-project:masterfrom
Ziy1-Tan:fix-max-ongoing-requests-validation

Conversation

@Ziy1-Tan
Copy link
Contributor

@Ziy1-Tan Ziy1-Tan commented Mar 5, 2026

Summary

Adds validation to raise a clear error when max_ongoing_requests is incorrectly placed inside autoscaling_config instead of at the deployment level.

Problem

Ray Serve accepts YAML configs where users can mistakenly put max_ongoing_requests inside autoscaling_config. This gets silently ignored, using the default value (5) instead.

Example of faulty config:

deployments:
  - name: RAG Chatbot
    autoscaling_config:
      max_ongoing_requests: 10   # WRONG - should be at deployment level
      target_ongoing_requests: 5

Solution

Added a root_validator in DeploymentSchema to detect invalid fields inside autoscaling_config and raise a clear error message.

Test plan

  • Add test for invalid max_ongoing_requests in autoscaling_config
  • Verify test fails without implementation
  • Implement validation
  • Verify test passes with implementation
  • Run existing deployment schema tests to ensure no regression

Fixes #61439

@Ziy1-Tan Ziy1-Tan requested a review from a team as a code owner March 5, 2026 15:32
@Ziy1-Tan Ziy1-Tan force-pushed the fix-max-ongoing-requests-validation branch from cae6d97 to 0a02087 Compare March 5, 2026 15:34
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a validation in DeploymentSchema to prevent max_ongoing_requests from being incorrectly placed within autoscaling_config. This is a valuable change that improves user experience by providing a clear error for a common misconfiguration that was previously silently ignored. The implementation is correct and is accompanied by a thorough unit test. I have one suggestion to make the validation logic more robust and user-friendly for potential future extensions.

I am having trouble creating individual review comments. Click here to see my feedback.

python/ray/serve/schema.py (525-534)

medium

This implementation works correctly for a single invalid field. However, if more fields are added to invalid_fields in the future, this loop will only report the first one it finds. For a better user experience, it would be ideal to detect all invalid fields at once and report them in a single, comprehensive error message.

Using a set for invalid_fields and performing a set intersection is a more scalable approach.

        # Fields that should NOT be in `autoscaling_config`
        invalid_fields = {"max_ongoing_requests"}

        found_fields = invalid_fields.intersection(autoscaling_config)
        if found_fields:
            fields_str = ", ".join(f"'{f}'" for f in sorted(found_fields))
            is_plural = len(found_fields) > 1
            raise ValueError(
                f"{fields_str} must be at the deployment level, not inside "
                f"'autoscaling_config'. Move {'them' if is_plural else 'it'} to the "
                f"same level as 'autoscaling_config' in your config."
            )

@Ziy1-Tan Ziy1-Tan force-pushed the fix-max-ongoing-requests-validation branch 3 times, most recently from 152b612 to 27dbe90 Compare March 5, 2026 16:02
@ray-gardener ray-gardener bot added serve Ray Serve Related Issue community-contribution Contributed by the community labels Mar 5, 2026
Copy link
Contributor

@abrarsheikh abrarsheikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, we are validating max_ongoing_requests specifically, but the fix needs to be more general-purpose. I should have mentioned this in the GH issue description more clearly, my bad. We really want the entire serve config validated such that if the user-provided config deviates from the config schema, we want to fail fast.

@Ziy1-Tan
Copy link
Contributor Author

Ziy1-Tan commented Mar 6, 2026

In the current implementation, we are validating max_ongoing_requests specifically, but the fix needs to be more general-purpose. I should have mentioned this in the GH issue description more clearly, my bad. We really want the entire serve config validated such that if the user-provided config deviates from the config schema, we want to fail fast.

Good point. We can further clarify the description and then either follow up on it continuously or break it down into sub-tasks.

Should we expect all configuration items to be the same as max_ongoing_requests?

@Ziy1-Tan Ziy1-Tan force-pushed the fix-max-ongoing-requests-validation branch from eca820e to 3964931 Compare March 6, 2026 14:16
# Create a valid AutoscalingConfig and manually set the private attribute
# to test that internal fields don't leak into JSON serialization
autoscaling_config = AutoscalingConfig()
autoscaling_config._serialized_policy_def = serialized_policy_def
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test sets private attribute on wrong model object

Medium Severity

_serialized_policy_def is a PrivateAttr on AutoscalingPolicy, not on AutoscalingConfig. Setting autoscaling_config._serialized_policy_def places a stray Python attribute on the wrong object — it needs to be autoscaling_config.policy._serialized_policy_def to actually attach the bytes to the policy. This makes the later assertion assert "_serialized_policy_def" not in autoscaling_config vacuously true, so the test no longer verifies that internal fields don't leak into JSON serialization.

Additional Locations (1)

Fix in Cursor Fix in Web

Change autoscaling_config field type from Optional[Dict] to
Optional[AutoscalingConfig] in DeploymentSchema to enable Pydantic
validation at parse time. Also add 'extra = forbid' config to
AutoscalingConfig to reject unknown fields.

This ensures that any misconfigured autoscaling_config fields
(e.g., max_ongoing_requests placed inside autoscaling_config
instead of at deployment level) will cause an immediate
ValidationError instead of being silently ignored.

The redundant root_validator is removed as Pydantic v2's
extra="forbid" already handles this validation.

Fixes ray-project#61439

Signed-off-by: Simple <simple@example.com>
@Ziy1-Tan Ziy1-Tan force-pushed the fix-max-ongoing-requests-validation branch from 3964931 to b4762e4 Compare March 6, 2026 14:31
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

),
)
autoscaling_config: Optional[Dict] = Field(
autoscaling_config: Optional[Union[Dict, AutoscalingConfig]] = Field(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Union type ordering prevents autoscaling_config validation from triggering

High Severity

The type Optional[Union[Dict, AutoscalingConfig]] defeats the extra = "forbid" validation on AutoscalingConfig. In pydantic v1 (used via ray._common.pydantic_compat), Union types are tried in order — since Dict comes first, any dict input matches immediately without ever attempting AutoscalingConfig validation. Even reversing the order wouldn't help, because pydantic v1 falls through to the next Union member on validation failure. The intended validation of rejecting unknown fields like max_ongoing_requests inside autoscaling_config will never trigger for dict inputs, which is the primary use case (YAML configs).

Additional Locations (1)

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community serve Ray Serve Related Issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Serve] Fail fast when user provides faulty deployment config

2 participants