Add validation for max_ongoing_requests in autoscaling_config#61529
Add validation for max_ongoing_requests in autoscaling_config#61529Ziy1-Tan wants to merge 1 commit intoray-project:masterfrom
Conversation
cae6d97 to
0a02087
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a validation in DeploymentSchema to prevent max_ongoing_requests from being incorrectly placed within autoscaling_config. This is a valuable change that improves user experience by providing a clear error for a common misconfiguration that was previously silently ignored. The implementation is correct and is accompanied by a thorough unit test. I have one suggestion to make the validation logic more robust and user-friendly for potential future extensions.
I am having trouble creating individual review comments. Click here to see my feedback.
python/ray/serve/schema.py (525-534)
This implementation works correctly for a single invalid field. However, if more fields are added to invalid_fields in the future, this loop will only report the first one it finds. For a better user experience, it would be ideal to detect all invalid fields at once and report them in a single, comprehensive error message.
Using a set for invalid_fields and performing a set intersection is a more scalable approach.
# Fields that should NOT be in `autoscaling_config`
invalid_fields = {"max_ongoing_requests"}
found_fields = invalid_fields.intersection(autoscaling_config)
if found_fields:
fields_str = ", ".join(f"'{f}'" for f in sorted(found_fields))
is_plural = len(found_fields) > 1
raise ValueError(
f"{fields_str} must be at the deployment level, not inside "
f"'autoscaling_config'. Move {'them' if is_plural else 'it'} to the "
f"same level as 'autoscaling_config' in your config."
)
152b612 to
27dbe90
Compare
abrarsheikh
left a comment
There was a problem hiding this comment.
In the current implementation, we are validating max_ongoing_requests specifically, but the fix needs to be more general-purpose. I should have mentioned this in the GH issue description more clearly, my bad. We really want the entire serve config validated such that if the user-provided config deviates from the config schema, we want to fail fast.
Good point. We can further clarify the description and then either follow up on it continuously or break it down into sub-tasks. Should we expect all configuration items to be the same as max_ongoing_requests? |
eca820e to
3964931
Compare
| # Create a valid AutoscalingConfig and manually set the private attribute | ||
| # to test that internal fields don't leak into JSON serialization | ||
| autoscaling_config = AutoscalingConfig() | ||
| autoscaling_config._serialized_policy_def = serialized_policy_def |
There was a problem hiding this comment.
Test sets private attribute on wrong model object
Medium Severity
_serialized_policy_def is a PrivateAttr on AutoscalingPolicy, not on AutoscalingConfig. Setting autoscaling_config._serialized_policy_def places a stray Python attribute on the wrong object — it needs to be autoscaling_config.policy._serialized_policy_def to actually attach the bytes to the policy. This makes the later assertion assert "_serialized_policy_def" not in autoscaling_config vacuously true, so the test no longer verifies that internal fields don't leak into JSON serialization.
Additional Locations (1)
Change autoscaling_config field type from Optional[Dict] to Optional[AutoscalingConfig] in DeploymentSchema to enable Pydantic validation at parse time. Also add 'extra = forbid' config to AutoscalingConfig to reject unknown fields. This ensures that any misconfigured autoscaling_config fields (e.g., max_ongoing_requests placed inside autoscaling_config instead of at deployment level) will cause an immediate ValidationError instead of being silently ignored. The redundant root_validator is removed as Pydantic v2's extra="forbid" already handles this validation. Fixes ray-project#61439 Signed-off-by: Simple <simple@example.com>
3964931 to
b4762e4
Compare
| ), | ||
| ) | ||
| autoscaling_config: Optional[Dict] = Field( | ||
| autoscaling_config: Optional[Union[Dict, AutoscalingConfig]] = Field( |
There was a problem hiding this comment.
Union type ordering prevents autoscaling_config validation from triggering
High Severity
The type Optional[Union[Dict, AutoscalingConfig]] defeats the extra = "forbid" validation on AutoscalingConfig. In pydantic v1 (used via ray._common.pydantic_compat), Union types are tried in order — since Dict comes first, any dict input matches immediately without ever attempting AutoscalingConfig validation. Even reversing the order wouldn't help, because pydantic v1 falls through to the next Union member on validation failure. The intended validation of rejecting unknown fields like max_ongoing_requests inside autoscaling_config will never trigger for dict inputs, which is the primary use case (YAML configs).


Summary
Adds validation to raise a clear error when
max_ongoing_requestsis incorrectly placed insideautoscaling_configinstead of at the deployment level.Problem
Ray Serve accepts YAML configs where users can mistakenly put
max_ongoing_requestsinsideautoscaling_config. This gets silently ignored, using the default value (5) instead.Example of faulty config:
Solution
Added a
root_validatorinDeploymentSchemato detect invalid fields insideautoscaling_configand raise a clear error message.Test plan
Fixes #61439