xds: Implementation of Unified Matcher and CEL Integration#12640
xds: Implementation of Unified Matcher and CEL Integration#12640shivaspeaks wants to merge 23 commits into
Conversation
082243d to
f472c63
Compare
l46kok
left a comment
There was a problem hiding this comment.
Sorry about the drive-by -- I happened to notice this while putting together an unrelated PR in CEL-Java. I'll hold off on submitting it on our end.
| * Compiles the CEL expression string into a CelMatcher. | ||
| */ | ||
| @VisibleForTesting | ||
| public static CelMatcher compile(String expression) |
There was a problem hiding this comment.
Move this to the testing code so we don't need a dependency on the Cel compiler. It looks unused for production.
| String id = over.toString(); | ||
| return !id.equals("ADD_STRING") && !id.equals("ADD_LIST"); |
There was a problem hiding this comment.
You should be able to compare directly against the overload enum instead of the stringified form (ex: over != AddOverload.ADD_STRING).
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 871541021
Also removes antlr dependency from `dev.cel:runtime` artifact Fixes: #566 See also grpc/grpc-java#12640 PiperOrigin-RevId: 872010923
| String id = over.toString(); | ||
| return !id.equals("ADD_STRING") && !id.equals("ADD_LIST"); | ||
| return !over.equals( | ||
| (Object) dev.cel.runtime.standard.AddOperator.AddOverload.ADD_STRING) |
There was a problem hiding this comment.
Why are we needing to cast to Object? If you're working around ErrorProne, then something seems broken and casting to Object is just hiding it.
There was a problem hiding this comment.
Yes it was an errorProne workaround.
over is statically known as a StandardOverload interface, but ADD_STRING is an AddOverload enum. ErrorProne threw an [EqualsIncompatibleType] build error so we need to cast the ADD_STRING enum specifically to (Object) before passing it into equals.
There was a problem hiding this comment.
ADD_STRING is a CelStandardOverload, which isn't a StandardOverload. It seems ErrorProne is correct; these will never be equal.
You should comment cases like this. If this were a bug in ErrorProne we'd probably want to file a bug with ErrorProne, but even if not, the comment helps us know when it is no longer necessary. (We don't have to guess why it was done, so we can confirm if that is no longer the case.)
It looks like that has been adjusted on cel-java master, as the function is now passed CelStandardOverload.
@sergiitk, the interface change looks like a breaking API change. That's surprising since I thought we were using stable APIs. Is this PR using an API that it shouldn't be, was this a known change that just hadn't been released yet, or should we be concerned about the actual stability of the API?
There was a problem hiding this comment.
This is a relatively new API that didn't exist when API stabilization doc was made (go/cel-api-stabilization). It's stable now (I'll update the doc when I get a chance). Reason why I suggested this API is that it follows the recommended pattern for how environment subsetting is supposed to be done across all CEL stacks.
As for EP -- I'm surprised it's actually flagging this (internally, it doesn't). AddOperator enum implements CelStandardOverload, so they should equal. My only guess is that CelStandardOverload is a functional interface, so theoretically it's possible to provide a lambda that's not equals comparable. Though we control the standard overload implementations, so in practice this won't happen. Checking for reference identity should likely get around the issue over == AddOverload.ADD_STRING
There was a problem hiding this comment.
Oh, I see, this is the new setup flow. I didn't look where this FUNCTIONS constant was used. That's fine; I'm less surprised that changed. (And yes, it's good to add it to the doc.)
I'm surprised it's actually flagging this (internally, it doesn't)
I don't think it'd get flagged on master either. It only looked to be a problem because we're still using 0.11.1; in 0.11.1 I agree with ErrorProne that the equals() will always return false.
| import javax.annotation.Nullable; | ||
|
|
||
| /** | ||
| * Interface for extracting values from a match context (e.g. HTTP headers). |
There was a problem hiding this comment.
This says explicitly "from a match context", and the only two implementations use MatchContext, so why is T present? Shouldn't this just have an argument that is MatchContext?
| return result.actions; | ||
| } | ||
|
|
||
| public interface MatchContext { |
There was a problem hiding this comment.
Nothing implements this. Is this still TODO? There's no mention this is missing in the PR description.
There was a problem hiding this comment.
Matcher Context: The input to the Matcher evaluation tree. Provided at runtime by the filter, contains the input data.
MatchContext is the representation of the "Matcher Context" described in gRFC. It is currently an empty shell (no implementations in this PR) because it is designed to be implemented by the specific xDS filters (e.g., RLQS or Composite Filter) that will be the primary consumers of this matching logic, as noted in the gRFC's Implementation section
There was a problem hiding this comment.
The implementation section saying "as part of" is saying "it will be part of that effort." We aren't saying CEL will be part of RLQS and the Composite Filter will call the RLQS code if RLQS was implemented first.
I think the filter will construct the Matcher Context and provide its contents, but I don't think each filter will have its own implementation, or at least an implementation from scratch. Headers will be Metadata, Method is always POST, Path will be pulled from MethodDescriptor, etc. I could understand if there was a server-side vs client-side implementation, but even that doesn't seem likely; worst-case looks to be different factory methods as convenience.
| /** | ||
| * Represents a compiled xDS Matcher. | ||
| */ | ||
| public abstract class UnifiedMatcher { |
There was a problem hiding this comment.
How do we have enough information to satisfy:
If set, the return type of the
inputmust match the input type
of thematcher.
There was a problem hiding this comment.
I see now. UnifiedMatcher has input+matcher combined together, so there's no need to check types when using it. But I think I got confused because there is not Matcher API. There should be a matcher API; in no way do we want to hard-code each matcher type.
| final class OnMatch { | ||
| @Nullable private final UnifiedMatcher nestedMatcher; | ||
| @Nullable private final TypedExtensionConfig action; | ||
| final boolean keepMatching; |
There was a problem hiding this comment.
Overall, it seems like keepMatching is incomplete (beyond this one instance). But that's not mentioned in the PR description. What is the current status of keepMatching?
There was a problem hiding this comment.
In MatcherList: keep_matching is used to continue the loop and accumulate actions.
In MatcherTree (Prefix): It is also implemented for traversing the trie/list of prefixes. Check line 133 in MatcherTree (if (!onMatch.keepMatching))
In MatcherTree (Exact): If a key matches, it returns immediately without checking if it should "keep matching" because thats the only logical way it seems. Should it fall back to on_no_match?
There was a problem hiding this comment.
keepMatching here is literally not used. Clearly that is wrong. Note that I'm most bothered that there's no TODO or mention in the PR description. There's nothing that calls out "this didn't make sense" or "this was not possible" for us to work through.
It seems you're missing the fact keep_matching means multiple completely different matchers can contribute to the result. You need to accumulate keep_matching results across matchers.
This is probably more clear after reading my comments at https://github.com/grpc/proposal/pull/520/changes#r2683391713 . Basically, the most important thing to understand about keep_matching is that it is a semantically different result. You shouldn't semantically mix the results between a terminal result with those of keep_matching; they aren't interchangeable (if C++ wants to and have an encoding for them to determine which results are what, they are free to, but I don't want the security vulnerability risk). We should either use a callback or a separate list in the result (MatchResult {TypedExecutionConfig action; List<TypedExecutionConfig> keepMatchingActions; }, where action == null means matched == false, or use Optional<TypedExecutionConfig> to track it)
| try { | ||
| com.github.xds.type.matcher.v3.CelMatcher celProto = customConfig.getTypedConfig() | ||
| .unpack(com.github.xds.type.matcher.v3.CelMatcher.class); | ||
| if (celProto.hasExprMatch()) { |
There was a problem hiding this comment.
Invert this condition so that we can more easily see what exception is matched with with if and also can get rid of the else. Ditto elsewhere in this class.
| * Returns the type of value extracted by this input. | ||
| */ | ||
| default Class<?> outputType() { | ||
| return Object.class; |
There was a problem hiding this comment.
This disables the check by default and it seems extremely doubtful any implementation would legitimately use Object. Delete the default implementation?
| } | ||
|
|
||
| if (!input.outputType().isAssignableFrom(matcher.inputType()) | ||
| && !matcher.inputType().isAssignableFrom(input.outputType())) { |
There was a problem hiding this comment.
The only way for the input to be assignable from the output and the output to be assignable from the input is for it to be the same class. This is just a verbose equals() check.
And just using equals() sounds perfect. I don't think we need any complex object hierarchy stuff going on. We just need to make sure they match. Equals() is what I'd hope we use, as it is unlikely to be broken.
| this.matcher = new Matcher() { | ||
| @Override | ||
| public boolean match(Object value) { | ||
| if (value instanceof String) { |
There was a problem hiding this comment.
If the input isn't compatible, this should throw. That is a programmer error, as the class said which type it can handle and that wasn't provided. That lets us find bugs. Note that silently ignoring a failure like that can be a security vulnerability. We need to be just as precise with match and non-match; we can't favor one way or the other, as we have no knowledge of what this matching implies.
| } | ||
|
|
||
| @Override | ||
| public Object apply(MatchContext context) { |
There was a problem hiding this comment.
You can have this method return String even when the interface returns Object. That would help guarantee we don't accidentally return any other type.
asheshvidyut
left a comment
There was a problem hiding this comment.
Reviewing for learning purposes.
| private static final Pattern ALLOWED_OVERLOAD_ID_PATTERN = Pattern.compile( | ||
| "^(size|matches|contains|startsWith|endsWith|starts_with|ends_with|" | ||
| + "timestamp|duration|in|index|has|int|uint|double|string|bytes|bool|" | ||
| + "equals|not_equals|less|less_equals|greater|greater_equals|" |
There was a problem hiding this comment.
equals, not equals, logical and/or/not never have suffixes. We should therefore do exact matching for these. Have another set
private static final ImmutableSet<String> ALLOWED_EXACT_OVERLOAD_IDS = ImmutableSet.of(
"equals", "not_equals", "logical_and", "logical_or", "logical_not");
and do
if (ALLOWED_EXACT_OVERLOAD_IDS.contains(id)
|| ALLOWED_OVERLOAD_ID_PREFIX_PATTERN.matcher(id).matches()) {
allowed = true;
There was a problem hiding this comment.
Change the logic to reject any non-empty name when overloadIds is present. Because if a reference has a name, it's not a standard operator or a variable (handled by the first if), so it must be something we don't support, such as custom functions.
There was a problem hiding this comment.
Oh, so standard allowed functions never have a name in the AST, so we should remove ALLOWED_FUNCTIONS, which is now an obsolete list here?
| "^(size|matches|contains|startsWith|endsWith|starts_with|ends_with|" | ||
| + "timestamp|duration|in|index|has|int|uint|double|string|bytes|bool|" | ||
| + "equals|not_equals|less|less_equals|greater|greater_equals|" | ||
| + "add|subtract|multiply|divide|modulo|logical_and|logical_or|logical_not)" |
There was a problem hiding this comment.
Add negate to this list and unit test for it
@Test
public void checkAllowedReferences_negation() throws Exception {
assertAllowed("-(1) == -1");
assertAllowed("-(1.0) == -1.0");
}
We can remove CelMatcher and CelStateMatcher from this list (for the separate PR, since ext-proc doesn't need the matching functionality) with the following changes to unit tests:
Move these unit tests with the separate PR. |
|
A106 said only request variable will be supported but for use by ext-proc A93, we need response supported as well. |
| private Object getRequestField(String requestField) { | ||
| switch (requestField) { | ||
| case "headers": return new HeadersWrapper(context); | ||
| case "host": return orEmpty(context.getHost()); | ||
| case "id": return orEmpty(context.getId()); | ||
| case "method": return or(context.getMethod(), "POST"); | ||
| case "path": | ||
| case "url_path": | ||
| return orEmpty(context.getPath()); | ||
| case "query": return ""; | ||
| case "scheme": return ""; | ||
| case "protocol": return ""; | ||
| case "time": return null; | ||
| case "referer": return getHeader("referer"); | ||
| case "useragent": return getHeader("user-agent"); | ||
|
|
||
| default: | ||
| return null; | ||
| } | ||
| } |
There was a problem hiding this comment.
I noticed that GrpcCelEnvironment.java and MatchContext.java currently only provide support for request variables (headers, method, path, etc.). However, gRFC A103 (Composite Filter) explicitly mentions the need for source and connection variables (e.g., source.address, source.port, connection.tls_version) for server-side matching.
Is there a plan to extend MatchContext and the variable resolver in this PR to include these transport attributes, or should this be tracked as a follow-up task?
There was a problem hiding this comment.
I believe these needs to be added as part of Composite filter implementation itself. This PR gives you the framework for it. It's mentioned in gRFC 103:
In addition to the CEL request attributes described in A106, we will also add support for some additional CEL attributes that we expect to be useful for the composite filter.
| public MatchContext(Metadata metadata, @Nullable String path, | ||
| @Nullable String host, @Nullable String method, | ||
| @Nullable String id) { | ||
| this.metadata = Preconditions.checkNotNull(metadata, "metadata"); | ||
| this.path = path; | ||
| this.host = host; | ||
| this.method = method; | ||
| this.id = id; | ||
| } |
There was a problem hiding this comment.
To support future attributes beyond just headers (like the aforementioned transport info or cluster metadata), we might want to consider a more pluggable way to populate MatchContext.
Currently, it has a fixed set of fields. A map-based approach or a registry of attribute providers might make it easier to add new attributes without changing the core MatchContext class every time
| /** Translates envoy proto HeaderMatcher to internal HeaderMatcher.*/ | ||
| /** Translates envoy proto HeaderMatcher to internal HeaderMatcher. */ | ||
| public static Matchers.HeaderMatcher parseHeaderMatcher( | ||
| io.envoyproxy.envoy.config.route.v3.HeaderMatcher proto) { | ||
| io.envoyproxy.envoy.config.route.v3.HeaderMatcher proto) { |
There was a problem hiding this comment.
can we revert the whitespace changes in this file ?
8515729 to
f312c9c
Compare
Implements gRFC A106: xDS Unified Matcher and CEL Integration
grpc/proposal#520