Normative: Refactor for atomic Deviations and simpler CompareOptions#15
Normative: Refactor for atomic Deviations and simpler CompareOptions#15gibson042 wants to merge 8 commits into
Conversation
Represent the path as an array rather than a string and replace the `reason` object with a simple `disposition` property. type/value mismatch can be detected from `actual` and `expected` properties, and any other deviations should have a distinct path.
Drop `reasons`, separate `iterate` (whether to return an iterator) from `mode` (whether to inspect property descriptors), and promote an enum-valued `prototypes` (whether to match/ignore/recurse [[Prototype]]s). Also mention (but do not introduce) other possible configuration options.
|
"disposition" isn't a word that evokes clarity imo; separately, i don't want to have to infer type/value mismatch by comparing the values - the point of the api is to do all the comparing for me. |
|
Woh, okee this will take a minute to review—it contains a very large amount of radical changes 😵💫 |
Name bikeshedding is welcome, of course. We could also have separate
Well, the fact of an iteration result already means that it has done the comparing for you, and found a Deviation at some |
Indeed, which I've hinted at but not written down until now. I just updated the PR description to be a better summary. |
I mean during Thursday's presentation/defence. Fine to do outside in a dedicated issue/discussion.
This wouldn't make me think of anything related to this API 😅 disposition to me is someone's mood ("a blithe disposition", "a foul disposition"). |
| mode?: | ||
| | 'value' // (default) limit to the values of enumerable properties | ||
| | 'descriptor' // descend from an object to all of its property | ||
| // descriptors rather than its enumerable property values | ||
| | 'descriptor-and-value' // like 'descriptor', but also invoke each getter into a value | ||
| // associated with the node's path (i.e., as a sibling to the | ||
| // descriptor path) |
There was a problem hiding this comment.
This seems quite a limiting change vs the reasons dict, and forces them to be mutually exclusive (but they aren't inherently so) or cumbersome (like 'descriptor-and-value').
There was a problem hiding this comment.
I contend that they are inherently exclusive—the graph changes from $object --[$key]--> $propertyValue to $object --[$key]--> $descriptor --[{value,writable,get,set,enumerable,configurable}]--> $propertyAttributeValue. And the reason for "descriptor-and-value" is to augment the latter with the results of getter invocation which would otherwise not happen when traversing by descriptors.
| prototypes?: | ||
| | 'same-value' // (default) compare the [[Prototype]]s of non-primitive values by SameValue | ||
| | 'ignore' // ignore [[Prototype]] | ||
| | 'recurse' // structurally compare the [[Prototype]]s of non-primitive values |
There was a problem hiding this comment.
Sure, those values for prototype seems fine. But I would merely change it within the reasons dict rather than flattening that—which makes their purpose less clear / more ambiguous (the nesting provides clear scope, specially because it maps reasons input to reasons output).
There was a problem hiding this comment.
I don't see how reasons could be extended to accommodate this, especially to expose what the [[Prototype]] on each side actually is.
There was a problem hiding this comment.
For example, consider the deviations from comparing an expected empty array of realm A to an actual empty array of realm B in which Array.prototype has been extended with a new enumerable method "smoosh".
- "same-value"
- { path:
[{ special: "prototype" }], expected: %A.Array.prototype%, actual: %B.Array.prototype%, disposition: "normal" }
- { path:
- "ignore"
- [no deviations]
- "recurse"
- { path:
[{ special: "prototype" }, "smoosh"], expected:undefined, actual: <function>, disposition: "extra" }
- { path:
There was a problem hiding this comment.
I don't see how reasons could be extended to accommodate this
exactly as you did here, just nested in reason
reasons: {
…
prototypes?:
| 'same-value' // (default) compare the [[Prototype]]s of non-primitive values by SameValue
| 'ignore' // ignore [[Prototype]]
| 'recurse' // structurally compare the [[Prototype]]s of non-primitive values
…
}There was a problem hiding this comment.
This PR is dropping reasons from both input and output... what would you expect compare(arrFromA, arrFromB, { reasons: { prototypes: "recurse" } }) to produce?
| type CompareOptions = { | ||
| mode?: | ||
| | 'fast' // (default) return => boolean | ||
| iterate?: |
There was a problem hiding this comment.
Not super opposed—the new name does more concretely describe what it does, but it also narrows its potential. I originally chose "mode" because it was not narrow (meaning extensible without a breaking change).
There was a problem hiding this comment.
mode just seemed like a better name for the option that differentiates between value traversal vs. descriptor traversal. I'm open to bikeshedding here.
There was a problem hiding this comment.
Might there be more modes in the future? e.g. 'text' that produces what inspect does below?
There was a problem hiding this comment.
@hildjj hmm? inspect is a different proposal for a different API. It would work something like
inspect(compare(a, b)) →
[
{
foo: [
…
+ 'chocolate',
- 'strawberry',
],
bar: [
…
+ 'vanilla',
],
},
]There was a problem hiding this comment.
I can certainly imagine a third kind of return shape, which would be controlled by this parameter. Maybe it should be returns?: "has-deviations" | "iterate-deviations"? Or maybe we should remove this option in favor of separate functions, e.g. deviates: <T>(expected: T, actual: T, options: CompareOptions): (true | undefined) and compare: <T>(expected: T, actual: T, options: CompareOptions): (IterableIterator<Deviation> | undefined). Again, bikeshedding welcome.
There was a problem hiding this comment.
bikeshed: "diff" might be better than "deviations".
Re inspect, yes, I get that it's a different proposal, but it would be nice if the code for that in this case was trivial because the formatted bits were easily available when needed, since it would be easier to produce that diff text while doing the diff than it would be to reconstruct it later.
There was a problem hiding this comment.
bikeshed: "diff" might be better than "deviations".
we considered that but specifically avoided diff because it's already loaded. diff is what inspect produces; deviations is information about differences.
Re inspect, yes, I get that it's a different proposal, but it would be nice if the code for that in this case was trivial because the formatted bits were easily available when needed, since it would be easier to produce that diff text while doing the diff than it would be to reconstruct it later.
are you proposing combining that into this? if so, that's a no-go: the committee already insisted they be separated.
| }, | ||
| >; | ||
| type Deviations = IterableIterator<{ | ||
| path: Array<string | symbol | { special: "descriptor" | "prototype" }>, |
There was a problem hiding this comment.
This is interesting. I was speaking with Ruben this morning about a related interop problem: when inspect subsequently consumes Deviations, it doesn't have sufficient info to reconstruct the input:
Expected:
[
{
foo: ['vanilla', 'strawberry'],
bar: ['chocolate', 'strawberry', 'vanilla'],
},
]Actual:
[
{
foo: ['vanilla', 'chocolate'],
bar: ['chocolate', 'strawberry'],
},
]Deviations:
[
['0.foo.1', {
actual: 'chocolate',
expected: 'strawberry',
reason: {
equality: true,
},
}],
['0.bar.2', {
actual: undefined,
expected: 'vanilla',
reason: {
missing: true,
},
}],
]Inspect:
[
{
foo: [
…
+ 'chocolate',
- 'strawberry',
],
bar: [
…
+ 'vanilla',
],
},
]Inspect doesn't concretely know '0' represents an array (it could be an object with a '0', and 'foo' could be on a RegExp instance).
I was thinking of adjusting path to something like what you've done here but to provide that reconstruction info.
There was a problem hiding this comment.
Reconstruction in general is only possible if you have all nodes rather than just those that deviate, e.g. actual { id: 1 } vs. expected { id: 0 } is distinct from actual { __proto__: null, id: 1 } vs. expected { __proto__: null, id: 0 } but their deviations are indistinguishable.
But regardless, it's universally more useful to represent the path structure as an array rather than expecting consumers to parse strings (which they'll invariably screw up when property names contain punctuation that appears elsewhere in the string, such as " / ( / ) / . / [ / ]), and in fact the string form is incapable of distinguishing properties keyed by distinct symbols with the same description (e.g., in { [Symbol("foo")]: 1, [Symbol("foo")]: 2 }).
There was a problem hiding this comment.
Reconstruction in general is only possible if you have all nodes rather than just those that deviate
Why? I think that's not true/necessary.
e.g. actual
{ id: 1 }vs. expected{ id: 0 }is distinct from actual{ __proto__: null, id: 1 }vs. expected{ __proto__: null, id: 0 }but their deviations are indistinguishable.
We only care about that when reason.prototype is enabled, in which case the parent of id would deviate (reason: 'prototype') if one had a null prototype and the other didn't, and then the value of id also deviates (reason: 'equality').
But regardless, it's universally more useful to represent the path structure as an array rather than expecting consumers to parse strings (which they'll invariably screw up when property names contain punctuation that appears elsewhere in the string, such as
"/(/)/./[/]), and in fact the string form is incapable of distinguishing properties keyed by distinct symbols with the same description (e.g., in{ [Symbol("foo")]: 1, [Symbol("foo")]: 2 }).
👍
There was a problem hiding this comment.
Reconstruction in general is only possible if you have all nodes rather than just those that deviate
Why? I think that's not true/necessary.
e.g. actual
{ id: 1 }vs. expected{ id: 0 }is distinct from actual{ __proto__: null, id: 1 }vs. expected{ __proto__: null, id: 0 }but their deviations are indistinguishable.We only care about that when
reason.prototypeis enabled, in which case the parent ofidwould deviate (reason: 'prototype') if one had a null prototype and the other didn't, and then the value ofidalso deviates (reason: 'equality').
Here's another example: actual { id: 1, label: "foo" } vs. expected { id: 1, label: "bar" } is distinct from actual { label: "foo" } vs. expected { label: "bar" } but their deviations are indistinguishable.
There was a problem hiding this comment.
What?? The deviations are not indistinguishable at all: [label] => { actual: 'foo', expected: 'bar' reason: 'equality' }
There was a problem hiding this comment.
Right, that's the deviation for both { id: 1, label: "foo" } vs. expected { id: 1, label: "bar" } and actual { label: "foo" } vs. expected { label: "bar" }. So in what sense are those deviations distinguishable?
|
What I'd expect is the reason for the deviation :-) iow, either a human or machine readable way to explain why it was considered a deviation. |
Would that be satisfied by renaming
|
|
That's slightly better than a single boolean, but since the implementation has all the information about exactly why it's different, why not encode that so I don't have to replicate internal logic? |
Not human-readable, please. I18n considerations, the inevitable desire to parse it, etc. |
|
Fair enough! I'm also happy with lots of machine-readable info that i can use to construct my own english message. |
In what way does |
|
The latest commits rename |
|
"mismatch" doesn't tell me in what way it's a mismatch. is the type different, is the value different? if it's an object, are its property names different, or its property values under a given key? It's possible that what you have already is granular enough to cover it; but it's tough to review this pre-merge, so i may need to wait til after to list concrete examples. |
Different type implies different value, so the latter covers the former. And "type" on its own seems too coarse to specifically promote when any consumer who actually wants that can use
If property names or values differ, then compare(
{ foo: { bar: 'a' } },
{ foo: { bar: 'b', qux: 'c' } },
{ iterate: 'deviations' },
);
Iterator => IterableIterator(2) {
{
path: ['foo', 'bar'],
expected: 'a',
actual: 'b',
kind: 'mismatch',
},
{
path: ['foo', 'qux'],
expected: undefined,
actual: 'c',
kind: 'extra',
},
}
Here's what it looks like now: https://github.com/gibson042/tc39-proposal-comparisons/blob/gh-5-followup-2/README.md |
| * TypedArrays containing the _same values in the same sequence_ are equal, except when `CompareOptions.reasons.constructor` is enabled. | ||
| * A box primitive (eg `new Boolean(true)`) equals its primitive (eg `true`), except when `CompareOptions.reasons.constructor` is enabled. | ||
| * TypedArrays containing the _same values in the same sequence_ are equal, except when not ignoring prototypes. | ||
| * A boxed primitive (eg `new Boolean(true)`) never equals an actual primitive (eg `true`). |
There was a problem hiding this comment.
-1 for the "never"
Not necessarily opposed to collapsing constructor into prototype, but that seems like conflating things that have separate value. What's your rationale for it?
There was a problem hiding this comment.
-1 for the "never"
Reworded to be less absolute.
Not necessarily opposed to collapsing constructor into prototype, but that seems like conflating things that have separate value. What's your rationale for it?
What do you mean by "collapsing constructor into prototype"? This PR has no special treatment for constructor, which in most cases is an own property on the [[Prototype]] and would therefore be exposed directly only by prototypes: "recurse" and indirectly by prototypes: "same-value"... it removes conflation of separate nodes in the reference graph.
There was a problem hiding this comment.
What do you mean by "collapsing constructor into prototype"? This PR has no special treatment for
constructor
reasons.constructor
false|true
Whether to consider constructor. This affects, amongst others, Box Primitives (new Boolean(true)vstrue) and TypedArraysnew Int8Array([1,2])vsnew Uint8Array([1,2])) where differenes are pedantic.
and
Equality
[…]
- TypedArrays containing the same values in the same sequence are equal, except when
CompareOptions.reasons.constructoris enabled.- A box primitive (eg
new Boolean(true)) equals its primitive (egtrue), except whenCompareOptions.reasons.constructoris enabled.
There was a problem hiding this comment.
What do you mean by "collapsing constructor into prototype"? This PR has no special treatment for
constructorit does:
No, this PR has no special treatment for constructor: https://github.com/gibson042/tc39-proposal-comparisons/blob/gh-5-followup-2/README.md
The only remaining mention is "CompareOptions might also be extended… Treat the constructor of a non-primitive value as a child for either leaf comparison or recursion, even when ignoring [[Prototype]]" (which would likely be reintroduced if we consider the above boxed primitive use cases sufficiently important and/or common).
| path: Array<string | symbol | { special: "descriptor" | "prototype" | "value" }>, | ||
| actual: unknown, | ||
| expected: unknown, | ||
| kind: "extra" | "missing" | "mismatch", |
There was a problem hiding this comment.
kind (enum) instead of reasons (dict) would have a perf benefit that I think @o- would like. But do we lose valuable information? I think more than 1 reason would never happen—I think we said the engine gets to pick whichever is most relevant? (following "general to specific")
If we do this, I think reason is a better name than kind (which is very ambiguous), using the existing keys from reasons (constructor, enumerability, equality, prototype, type, etc).
There was a problem hiding this comment.
You might be misunderstanding the essence of this change... fundamentally, it is not even possible to have more than 1 reason at the same path. Differing prototypes is a Deviation at path [{ special: "prototype" }], differing constructors is a Deviation at path [{ special: "prototype" }, "constructor"], differing enumerabilities is a mismatch Deviation at path [propertyKey, { special: "descriptor" }, "enumerable"], etc.
There was a problem hiding this comment.
You might be misunderstanding the essence of this change... fundamentally, it is not even possible to have more than 1 reason at the same path
What does "at the same path" mean? There could be multiple deviations, but maybe only if somewhere in a branch prototype is different (and prototype differentiation is enabled).
I don't understand why you would put deviation info in the path list. That seems conflating (and also confusing AF).
There was a problem hiding this comment.
I'm not putting deviation info in the path list, I'm making Deviation atomic and scoping each instance to a path.
| | 'fast' // (default) return => boolean | ||
| | 'full' // return => Iterator<Deviation> | ||
| | 'value' // (default) limit to the values of enumerable properties | ||
| | 'descriptor' // descend from an object to all of its property | ||
| // descriptors rather than its enumerable property values | ||
| | 'descriptor-and-value' // like 'descriptor', but also invoke each getter into a value |
There was a problem hiding this comment.
I think this is much too limiting and also creates multiple sources of truth for what it's doing, which may end up conflicting.
There was a problem hiding this comment.
The problem is that without this, relevant differences can be undetectable. Consider the following:
let lastValue = 0n;
const counter = () => ++lastValue;
const a = Object.defineProperty({}, "counter", { enumerable: true, get: counter });
const b = Object.defineProperty({}, "counter", { enumerable: true, get: counter });
compare(a, b, { iterate: "deviations" });
// => Deviation { path: ["counter"], expected: 1n, actual: 2n, kind: "mismatch" }
compare(a, b, { iterate: "deviations", mode: "descriptor" });
// => no Deviations
compare(a, b, { iterate: "deviations", mode: "descriptor-and-value" });
// => Deviation { path: ["counter", { special: "value" }], expected: 1n, actual: 2n, kind: "mismatch" }Switching to use descriptors hides the fact that the referentially-equal property getter returns different values, which is fine in many cases but in others should be considered as constituting a deviation.
There was a problem hiding this comment.
Ahh, okay. I understand what you're getting at. This is already possible in the current though. Why is this better (or why is the current worse)?
There was a problem hiding this comment.
Ahh, okay. I understand what you're getting at. This is already possible in the current though.
I don't see anything in the current README that seems to cover configurability of whether to consider a property value in addition to property descriptor attributes. Can you provide a concrete example to demonstrate what I'm missing?
Why is this better (or why is the current worse)?
I don't think I can answer that narrowly without seeing an answer to the above request, but what makes this overall PR better is that atomic Deviations with reason being exactly one of extra vs. missing vs. mismatch are simpler to consume and comprehend than complex/polymorphic Deviations.
There was a problem hiding this comment.
CompareOptions.reasons.descriptors enable it to consider descriptors or disable it to consider values
There was a problem hiding this comment.
Concretely, how can the proposed CompareOptions.reasons.descriptors be used to switch between realizing a and b as deviating vs. not in the above example, and with what output?
let lastValue = 0n; const counter = () => ++lastValue; const a = Object.defineProperty({}, "counter", { enumerable: true, get: counter }); const b = Object.defineProperty({}, "counter", { enumerable: true, get: counter });
I see three cases (only [[Get]] result [value] vs. only [[GetOwnProperty]] result [property descriptor] vs. both), and am interested to understand how a single boolean option can accommodate all of them.
key is the identifier, a value is happenstance, so the key is what matters for determining what to pair for the comparison. |
|
We discussed this synchronously, and I have some homework...
|
Belated (but as promised) followup to #5.
iterate(whether to return an iterator) frommode(whether to inspect property descriptors).prototypes(whether to match/ignore/recurse [[Prototype]]s).reasonobject with a simplekindproperty describing what differs atpath(exhaustively, "extra" for only-in-actual, "missing" for only-in-expected, or "mismatch" for present-in-both but not matching). Deviations are always simple at any given path, with complexity represented instead via distinct paths.Excludes the SameValueZero → SameValue change, which is in #14 and will likely require conflict resolution one way or the other.