add datastructure for public syntax tree, display and ExprBuilder#2170
add datastructure for public syntax tree, display and ExprBuilder#2170victornicolet merged 6 commits intomainfrom
Conversation
c32bc61 to
d63a006
Compare
d63a006 to
90b438d
Compare
There was a problem hiding this comment.
Pull request overview
Introduces an initial Public Syntax Tree (PST) module in cedar-policy-core to support ergonomic programmatic policy construction, including core PST data structures, basic expression Display, and an ExprBuilder implementation.
Changes:
- Adds new
pstmodule with expression, constraint, and policy data structures. - Implements
Displayfor PST expressions. - Implements
ExprBuilderfor PST expression construction (internal builder).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
cedar-policy-core/src/lib.rs |
Exposes the new pst module publicly from the crate root. |
cedar-policy-core/src/pst/mod.rs |
Defines the PST module structure and re-exports key PST types. |
cedar-policy-core/src/pst/expr.rs |
Defines PST expression-related types, conversions, ExprBuilder impl, and Display + tests. |
cedar-policy-core/src/pst/constraints.rs |
Adds PST scope constraint data structures. |
cedar-policy-core/src/pst/policy.rs |
Adds PST policy representation (effect, clauses, annotations, scope constraints). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This comment was marked as outdated.
This comment was marked as outdated.
90b438d to
b66cd0b
Compare
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
b66cd0b to
a05fb1b
Compare
This comment was marked as outdated.
This comment was marked as outdated.
|
Getting coverage for the changes in this PR won't be very meaningful, coverage will increase once we start adding functionality to convert / manipulate the PST. |
| /// In set of actions (single action is just length 1) | ||
| In(nonempty::NonEmpty<EntityUID>), |
There was a problem hiding this comment.
Do the AST/EST have the nonempty constraint here? Is permit(principal, action in [], resource); a valid Cedar policy?
There was a problem hiding this comment.
They do not, good catch.
| pub enum SlotId { | ||
| /// Principal slot | ||
| Principal, | ||
| /// Resource slot | ||
| Resource, | ||
| } |
There was a problem hiding this comment.
Discussion question: do we want to try to make the PST structures such that the modifications that will be needed for RFC 98 are backwards-compatible? is that an achievable goal?
There was a problem hiding this comment.
I think it depends on what we define as "backwards compatible"? We can also force clients to not depend on exhaustively matching on the enums.
I think we can achieve compile-time backwards compatible changes if we want to extend types, whether it's because we have less restrictive types (have a variant that accepts any name) or because we force non-exhaustive matches (i.e. #non_exhaustive). By compile-time backwards compatible, I mean that a change in the reprsentation will not cause clients that previously compiled to fail.
There was a problem hiding this comment.
I think the design needs to be able to maintain compiler-level backwards compatibility of the PST for backwards compatible changes to the Cedar language. We're not going to want to release new library major version for minor language versions.
At least, that's the attitude we've had so far. I've argued before that we should be comfortable with the library and language version getting out of sync.
There was a problem hiding this comment.
This is kind of unfortunate since I might like to force consumers of the library to update their code if we add a new operator/slot kind.
There was a problem hiding this comment.
I think RFC 98 will probably require library breaking changes in other places (i.e., a cedar-policy 5.0 using language version 4.x). However, it might be nice if consumers of the PST-specifically were not broken, if we have the opportunity to do that. OTOH, if we're breaking consumers of other parts of the public API it's ok to break consumers of the PST too.
There was a problem hiding this comment.
Do you agree the non-exhaustive attribute on the public structs and enums would be sufficient to ensure compiler backwards compatibility, and probably the better option over not having enums?
On the other hand, if there are language characteristics that we know would require breaking changes in other places, it would be nice that a library update forces the client to update their code. I can't predict that with my lack of experience here.
| Not(Arc<Expr>), | ||
| /// Arithmetic negation | ||
| Neg(Arc<Expr>), | ||
| /// Binary operation | ||
| BinaryOp { | ||
| /// The operator | ||
| op: BinaryOp, | ||
| /// Left operand | ||
| left: Arc<Expr>, | ||
| /// Right operand | ||
| right: Arc<Expr>, | ||
| }, |
There was a problem hiding this comment.
Any reason to flatten the unary ops (Not/Neg) into Expr, but not the binary ops?
There was a problem hiding this comment.
I guess the fact that there are a lot more binary operators, and using "not" is very common and it's marginally more convenient to have it as a top level enum, but we could move it inside a UnaryOp with Neg and IsEmpty as well.
| /// Function call (builtin or extension). Syntactically, this can be either a function-style or | ||
| /// method-style call depending on the extension. | ||
| FuncCall { | ||
| /// Function name | ||
| name: SmolStr, | ||
| /// Arguments | ||
| args: Vec<Arc<Expr>>, | ||
| }, |
There was a problem hiding this comment.
The comment says "builtin or extension", but builtin calls like .containsAny() or .getTag() are actually handled above in BinaryOp. Maybe this should be ExtFuncCall and specifically for extension functions?
There was a problem hiding this comment.
Yes, some confusion on my part on what is a builtin and what is an extension from the point of view of a user. Is "extension" mostly an internal concept? Should we actually move the .containsAny and .getTag to be functions instead of binary operators, or vice-versa?
There was a problem hiding this comment.
I think that's worth a small discussion, and from what I can see in the doc everything is an "operation" and it would make sense to treat all those uniformly.
The split mostly followed the AST/EST's shapes.
There was a problem hiding this comment.
name: SmolStr feels like the wrong design here. The actual ought to be part of an enum (BinOp). The extension functions could use SmolStr as well, but it would also be reasonable to represent them as an enum in the PST (committing to the notion that extensions can't be user defined, but we've already done that for the most part)
Signed-off-by: Victor Nicolet <victornl@amazon.com>
…e implementation Signed-off-by: Victor Nicolet <victornl@amazon.com>
…2173) Signed-off-by: Victor Nicolet <victornl@amazon.com>
8ebc2f0 to
ea91805
Compare
Signed-off-by: Victor Nicolet <victornl@amazon.com>
ea91805 to
8561350
Compare
This comment was marked as outdated.
This comment was marked as outdated.
cdisselkoen
left a comment
There was a problem hiding this comment.
This is in good shape; a few more comments
| /// Representation of an unknown for partial evaluation | ||
| Unknown { | ||
| /// Name of the unknown | ||
| name: SmolStr, | ||
| }, |
There was a problem hiding this comment.
discussion: should we remove this, and just fail PST conversion for ASTs that contain Unknowns? once we fully remove partial evaluation in favor of TPE, we won't have this node in the AST, right?
There was a problem hiding this comment.
Is the Unknown something the client should never see and construct?
From a practical standpoint, currently the ExprBuilder interface cannot fail on unknown construction (unless we panic of course).
Is that a change that would require removing Unknowns everywhere (AST, EST, the "unknown" in the extension functions), and will it be breaking?
There was a problem hiding this comment.
Removing it from the AST should be nonbreaking because the AST is not exposed publicly. If the EST currently has an Unknown node, then removing it is breaking. Does the EST have an Unknown node, or does it represent unknowns via an extension function unknown()?
There was a problem hiding this comment.
From a quick look we should be able to remove the unknown in the AST, leaving the extension function representation. IIRC, that's how the EST does it.
It'd be a larger change though.
There was a problem hiding this comment.
The EST does represent unknowns with an extension function unknown(). Although I think removing the "unknown" extension supported in the PST to represent unknowns would still be a breaking change? And not having the Unknown node in the PST would make the language EST and PST can represent different?
There was a problem hiding this comment.
I would lean towards having Unknown in the PST for now since it's also documented in the JSON policy format
This comment was marked as outdated.
This comment was marked as outdated.
d3ac405 to
dad5926
Compare
+ fix handling of Unknown + eliminate clones Signed-off-by: Victor Nicolet <victornl@amazon.com>
dad5926 to
a3bdea4
Compare
Coverage ReportHead Commit: Base Commit: Download the full coverage report. Coverage of Added or Modified Lines of Rust CodeRequired coverage: 80.00% Actual coverage: 89.66% Status: PASSED ✅ Details
Coverage of All Lines of Rust CodeRequired coverage: 80.00% Actual coverage: 86.38% Status: PASSED ✅ Details
|
|
A summary of the important discussion topics in this PR:
@john-h-kastner-aws and @cdisselkoen what do you think? Is there anything that needs resolving before merging? |
|
I'm good to merge this as-is, and if necessary we can continue the discussion going forward. I also think it's easier to merge this now and not make a whole bunch of PRs into this branch -- among other things, it will make it easier to ensure that this work doesn't keep diverging from |
john-h-kastner-aws
left a comment
There was a problem hiding this comment.
some smaller comments
| Resource, | ||
| } | ||
|
|
||
| impl From<ast::SlotId> for SlotId { |
There was a problem hiding this comment.
nit: we'll probably want #[doc(hidden)] on conversions from internal types
| Name { | ||
| id: id.into_smolstr(), | ||
| namespace: Arc::new( | ||
| Arc::try_unwrap(path) |
There was a problem hiding this comment.
Could this be Arc::unwrap_or_clone?
| /// | ||
| /// Represents the type of an entity in Cedar (e.g., `User`, `Photo`, `Namespace::Resource`) | ||
| #[derive(Debug, Clone, PartialEq, Eq, Hash)] | ||
| pub struct EntityType(pub Name); |
There was a problem hiding this comment.
Is there a way this could be the existing EntityTypeName struct?
There was a problem hiding this comment.
I guess it represents the same thing (and EntityTypeName wraps ast::EntityType) although I don't think you can construct the name, you can only parse it? One goal was also to make the PST's representation completely decoupled from the internal representations, but this of course creates duplication. Not sure what's the best solution in this case.
This solution allows creating a name from a list of strings for the namespace and doesn't require parsing.
There was a problem hiding this comment.
I'll keep it as-is in this PR, we can change our mind later.
| got, | ||
| }); | ||
| } | ||
| Ok(match args.len() { |
There was a problem hiding this comment.
Might be able to write this with match args.as_slice() and avoid unwraps
There was a problem hiding this comment.
I don't know if I can without adding a .clone() because .as_slice() gives a reference. Current solution is a little verbose but doesn't clone
| Expr::Literal(lit) => match lit { | ||
| Literal::Bool(b) => write!(f, "{}", b), | ||
| Literal::Long(i) => write!(f, "{}", i), | ||
| Literal::String(s) => write!(f, "\"{}\"", s.escape_default()), |
There was a problem hiding this comment.
We've used escape_debug more often in other places. Not sure how they're different
There was a problem hiding this comment.
Documentation indicates they're handling unicode differently. I'll use escape_debug for consistency then.
Signed-off-by: Victor Nicolet <victornl@amazon.com>
Coverage ReportHead Commit: Base Commit: Download the full coverage report. Coverage of Added or Modified Lines of Rust CodeRequired coverage: 80.00% Actual coverage: 89.64% Status: PASSED ✅ Details
Coverage of All Lines of Rust CodeRequired coverage: 80.00% Actual coverage: 86.38% Status: PASSED ✅ Details
|
Description of changes
This is a first PR for the implementation of a public AST (motivated by #816), the "PST".
This PR defines:
ExprBuilderfor the PST.The goal in the design of the PST here was to decouple the public representation from internal data structures and data structures meant for serialization/deserialization.
From<...>functions,Issue #, if available
Checklist for requesting a review
The change in this PR is (choose one, and delete the other options):
cedar-policy-core,cedar-validator, etc.)I confirm that this PR (choose one, and delete the other options):
I confirm that
cedar-spec(choose one, and delete the other options):I confirm that
docs.cedarpolicy.com(choose one, and delete the other options):