Spec: Add named budgets#179
Conversation
7b0e33e to
a1a07dc
Compare
a1a07dc to
83005e7
Compare
|
Hey @alexmturner, here's a rough draft of named budgets. PTAL! I had to spec out the concept of budgeting scopes, but I think there's room for improvement. I added struct fields like the duration of time and the calling API to demonstrate that the max budget (the field we use) is scoped to those things. I'm curious what you think. FYI, I've also done a few novel things (well, novel to this spec). I used the "advisement" class to make a warning and then used MathML tags to make the equation look nice. |
Aha, I guess I should mention issue #141. |
| Issue: Currently, this spec only references [=budgeting scope/max budget=]. We | ||
| need to decide whether deleting the unused fields will improve clarity. | ||
|
|
||
| <h3 dfn-type=dfn>Named budget ledger</h3> |
There was a problem hiding this comment.
I suppose "ledger" is inaccurate because it doesn't record individual "transactions". WDYT about calling it a Named budget balance or maybe just a Named balance?
There was a problem hiding this comment.
Wondering if we should instead track a "named budget reservation"? The current flow in the spec doesn't take into account persisting to disk / comparing the usage to what's already on disk (as you note below). So, to track how much budget each named budget has 'left', we'd need to first query what's in the budget storage. Instead, it might make more sense to track each reservation and then sort out whether contributions need to be dropped in the usual budgeting step.
There was a problem hiding this comment.
I think I might want to generalize even further. Sometimes I want these maps from strings to int to track budget remaining, but sometimes it's useful to track budget consumed. I think it might reduce the cognitive load of reading some of the new algorithm additions if we're able to say things like [=ledger/Subtract=] |budgetConsumed| from |budgetRemaining|. Anyway, that's still TBD.
There was a problem hiding this comment.
Ack -- I think my one concern is just that the description "A named budget ledger tracks the remaining budget per named budget" doesn't quite address that the ledger entry starts unadjusted for previous usage, and that we also need to limit based on overall usage (not just per-name). But if we clarify that, then I think counting 'down' is fine. (Another naming option: named budget allocation?)
| A [=user agent=] holds a <dfn>named budget ledgers map</dfn>, which is a [=map=] | ||
| from [=batching scopes=] to [=lists=] of [=named budget ledgers=]. | ||
|
|
There was a problem hiding this comment.
I realized I omitted a key part, which is that budget consumption against named budgets is persisted on disk and that implementations will set practical limits, like "no more than 25 distinct named budgets over 24 hours".
It's not clear to me what the "lifetime" of the user agent is. Can we model storing to the disk just by updating these structures held by the user agent? If so, maybe I should add a new "persistent storage" object held by the user agent and explicitly store records of budget consumption per minute.
There was a problem hiding this comment.
In general, I think we've used the "Storage" section of the explainer for data that is persisted. The spec could do a better job of specifying the budgeting query process, but I think for this PR, we could just do a more scoped change to the "query the budget" algorithm by plumbing the reservations to that point and perhaps adding some additional explanatory notes (or else detailing the process a bit more)
There was a problem hiding this comment.
Gotcha. Still mulling on this, but in the meantime I left an issue about moving the data away from the user agent and onto the {{PrivateAggregation}} object. But yeah, we might be able to avoid the field entirely?
There was a problem hiding this comment.
One other issue that comes to mind here is that for Protected Audience "batching scopes" are auctions, but the explainer says reservations are scoped to a single worklet/script runner context. Should we be using "debug scopes" here?
One other complication with moving reservations onto the {{PrivateAggregation}} object is handling contributions triggered by fenced frames via window.fence.reportEvent() (as those contributions outlive the auction).
| required bigint bucket; | ||
| required long value; | ||
| bigint filteringId = 0; | ||
| DOMString budgetName; |
There was a problem hiding this comment.
Oops, I think we landed on namedBudget but I forgot to update this. TBD.
|
Nice! Did a first round of review :) |
alexmturner
left a comment
There was a problem hiding this comment.
Helps if I actually flush my comments:
| are: | ||
|
|
||
| 1. Let |scopingDetails| be [=this=]'s [=PrivateAggregation/scoping details=]. | ||
| 1. Let |namedBudgetExists| be the result of [=checking that a contribution's |
There was a problem hiding this comment.
Should we move this into "validate a histogram contribution"? (nit: maybe also a slight preference for validating this after the bucket/value/filtering ID, just to match our declaration order)
There was a problem hiding this comment.
Ah, the reason I didn't do that is that contributeToHistogramOnEvent() optionally defers to "should perform default contributeToHistogramOnEvent() processing", and in that case it would skip our named-budget validation logic.
Although maybe we really want to completely defer to that algorithm? If so, let's expose a "validate a contributions' named budget" algorithm that they can optionally use.
There was a problem hiding this comment.
I see that argument, but we should be consistent between this and other parameters (like filtering IDs). So, yeah I'd lean towards exposing a named budget validation algorithm that anyone can use.
(However, if we change our IDL definitions so something like this, that might affect how much we need to delegate.)
There was a problem hiding this comment.
OK, so it looks like that ExtendedHistogramContribution change was merged into the Protected Audience spec. I didn't really understand how it would affect how much we delegate?
There was a problem hiding this comment.
I was thinking we could potentially do the validation before calling the delegation function, but I don't think that would align with our likely implementation. So, yeah probably doesn't affect it.
| 1. Let |remainingBudget| be |maxBudget|. | ||
| 1. Let |ledger| be a new [=named budget ledger=]. | ||
| 1. [=map/For each=] |name| → |fraction| of |namedBudgets|: | ||
| 1. If |fraction| is NaN, ±Inf, or -0, [=exception/throw=] a |
There was a problem hiding this comment.
nit: a double already can't be a NaN or infinite.
There was a problem hiding this comment.
Amazing! I didn't even check the WebIDL definition because usually "double" usually refers to IEEE-754 binary64.
It's kind of funky that the WebIDL {{double}} type is not closed under addition (and other operations). Like, if we said that a {{double}} |x| is
Also, being finite doesn't rule out -0, but I guess that's inconsequential since -0 compares equal to +0.
There was a problem hiding this comment.
Yeah, it not being closed under simple mathematical operations is a bit funny, but treating the values as "complete" doubles after the WebIDL validation seems reasonable to me.
| 1. [=map/For each=] |name| → |fraction| of |namedBudgets|: | ||
| 1. If |fraction| is NaN, ±Inf, or -0, [=exception/throw=] a | ||
| "{{DataError}}" {{DOMException}}. | ||
| 1. If |fraction| is negative or greater than 1, [=exception/throw=] a |
There was a problem hiding this comment.
Wdyt about requiring |fraction| to be non-negative?
There was a problem hiding this comment.
Hmm, I think we already require it to be non-negative, since we throw an exception when it's negative.
I can see the appeal of throwing when it's equal to zero as well. Although, here's a potential use case for zero-valued budgets: suppose a site has two worklet scripts, but one is never supposed to make contributions to the "foo" budget. They could enforce this with privateAggregation.reserveNamedBudgets({foo: 0});.
However, the "enforcement" would mean silently dropping contributions that reference the "foo" named, rather than throwing an exception from contributeToHistogram*(). So maybe not all that useful.
If we want to support this use case, then maybe we should reevaluate console warning conditioned on "reallocating" named budgets between isolated contexts.
WDYT? I'm okay with disallowing zero and relaxing the requirement in the future if someone wants it, but I also don't see much danger in allowing zero either.
There was a problem hiding this comment.
Ah oops yep I meant positive, not non-negative! I think I'd slightly lean towards disallowing zero as using the budgeting layer to ignore certain contributions seems a bit janky, but I do see your point that there might be potential uses.
There was a problem hiding this comment.
That's fair. I'll disallow zero.
| A [=user agent=] holds a <dfn>named budget ledgers map</dfn>, which is a [=map=] | ||
| from [=batching scopes=] to [=lists=] of [=named budget ledgers=]. | ||
|
|
There was a problem hiding this comment.
In general, I think we've used the "Storage" section of the explainer for data that is persisted. The spec could do a better job of specifying the budgeting query process, but I think for this PR, we could just do a more scoped change to the "query the budget" algorithm by plumbing the reservations to that point and perhaps adding some additional explanatory notes (or else detailing the process a bit more)
| 1. If |allocation| is greater than |remainingBudget|, | ||
| [=exception/throw=] a "{{DataError}}" {{DOMException}}. | ||
| 1. Set |remainingBudget| to |remainingBudget| - |allocation|. | ||
| 1. [=map/Set=] |ledger|[|name|] to |allocation|. |
There was a problem hiding this comment.
I don't think this works for multiple budgeting scopes. See other comments about just plumbing the reservations around. For this, I suppose we could have a constant that is the "maximum budget allocation" or similar, i.e. 2^20 for us, and set the reservation to floor(|maxBudgetAllocation| * |reservation|) / |maxBudgetAllocation|. (But maybe that's a bit too jankily expressed.)
There was a problem hiding this comment.
I didn't follow that, can you elaborate on the issue with multiple budgeting scopes?
What I'm trying to do here is map the list of budgeting scopes to a list of ledgers. Each ledger is a map of named budgets to remaining integer values of remaining budget. This list<map<string?, int>> structure is a little unwieldy, but I think it's necessary to sidestep any floating point gotchas.
There was a problem hiding this comment.
I think I missed a level of nesting here. But still, I think that (if we're requiring budgets of the form 2^k, k <= 52) we should be able to get away with just one constant here.
I guess it depends a little on how we expect to expand our spec budget handling in the future.
| [=budgeting scope=], be aware that certain values of [=budgeting scope/max | ||
| budget=] are incompatible with the way that | ||
| {{PrivateAggregation/reserveNamedBudgets}} interprets budget fractions. | ||
| Implementers must verify that every slice <math><mi>n</mi></math> of their |
There was a problem hiding this comment.
Wondering if we just declare that only powers of two are ok? E.g. along the lines of "Implementers must use a max budget of the form 2^k with k <= 52. Note: This ensures that i/|maxBudget| can be exactly represented with a Number for all integral i in [0,|maxBudget|]."
Alternatively, I think we could probably support any value by incrementing the fraction up to the next valid double (i.e. by adding Math.floor(Math.log2(fraction)) * Number.EPSILON), but that's probably too hacky to be worth exploring if we can just require powers of two.
(Side note: do you know why pr-preview doesn't seem to be working? Would be nice to see all the rendered equations!)
There was a problem hiding this comment.
Wondering if we just declare that only powers of two are ok?
I struggled to thread the needle here, but I erred on the side of too much detail because were inside the warning box. I'll give it another shot.
Alternatively, I think we could probably support any value by incrementing the fraction up to the next valid double (i.e. by adding Math.floor(Math.log2(fraction)) * Number.EPSILON), but that's probably too hacky to be worth exploring if we can just require powers of two.
Yeah, I strongly suspect that there's a simple function that achieves what we want. I just can't prove it!
(Side note: do you know why pr-preview doesn't seem to be working? Would be nice to see all the rendered equations!)
Gah, no idea. It's extra steps, but you can always go to Checks > Validate Spec > Artifacts and download a .zip containing the rendered HTML.
Edit: It's back!
|
Hey @alexmturner, new "patchset" (fixup commit). PTAL when you get a chance! |
| are: | ||
|
|
||
| 1. Let |scopingDetails| be [=this=]'s [=PrivateAggregation/scoping details=]. | ||
| 1. Let |namedBudgetExists| be the result of [=checking that a contribution's |
There was a problem hiding this comment.
OK, so it looks like that ExtendedHistogramContribution change was merged into the Protected Audience spec. I didn't really understand how it would affect how much we delegate?
| 1. [=map/For each=] |name| → |fraction| of |namedBudgets|: | ||
| 1. If |fraction| is NaN, ±Inf, or -0, [=exception/throw=] a | ||
| "{{DataError}}" {{DOMException}}. | ||
| 1. If |fraction| is negative or greater than 1, [=exception/throw=] a |
There was a problem hiding this comment.
That's fair. I'll disallow zero.
| : <dfn export>embedding API supports named budgets</dfn> (default false) | ||
| :: A [=boolean=]. Embedding APIs that set this to true should define a | ||
| "<code>namedBudget</code>" field in their custom contribution dictionary | ||
| that accepts nullable {{DOMString}} values with a default of null. |
There was a problem hiding this comment.
@alexmturner Thoughts on this new field attached to {{PrivateAggregation}}?
I'm thinking that if embedding APIs use something other than {{PAHistogramContribution}} and also want to support named budgets...
- Their dictionary must contain a field for the budget name. I can't imagine they'd need something other than
DOMString? namedBudget = null;. - If they override default processing of
contributeToHistogramOnEvent(event, contribution)whencontribution != null, it's their responsibility to run our [=validate a contribution's named budget=] algorithm from their hook algorithm.
I've addressed the namedBudget field issue in a kind of strange way. Maybe we should define a WebIDL "trait" dictionary:
dictionary PAWithNamedBudget {
DOMString? namedBudget = null;
};
// --- Embedding API's spec ---
dictionary MyCustomContribution : PAWithNamedBudget {
Foo bucket;
Bar value;
Baz filteringId;
};I tried to address the requirement to call our named budget validation algorithm down in the section titled
"Overriding contributeToHistogramOnEvent() processing".
There was a problem hiding this comment.
I'm not sure I understand the need for this boolean -- aren't we planning on all contexts supporting the feature?
On the general point about overloads, I think the current approach is good (at least for now). It also aligns with how we handle filteringIds. If we wanted to restrict the contribution dictionary definition, I think it's probably easier to use one of the options in WICG/turtledove#1412 (comment) to restrict some of the fields' types without restricting others. (In that case, we would still need to delegate validation.)
| should not return an [=exception=] in those cases. However, the embedding API | ||
| may accept additional `event`s or `contribution`s that would not be accepted by | ||
| the default processing defined in this spec. | ||
| <table class="complex data longlastcol"> |
There was a problem hiding this comment.
The table's not strictly necessary, but I thought it's kind of nice for clarity.
| scope/max budget=] of 103. | ||
| </div> | ||
|
|
||
| <dfn>Maximum number of named budgets</dfn> is a positive integer that defines an |
There was a problem hiding this comment.
Yeah, I guess that dueling implementations with different values would not play well unless we had some rectification logic (like don't reject overly-long budget names, try truncating and see if they're all still unique...). I'll move these to constants.
| 1. Let |allUnmergedContributions| be the result of [=compiling all unmerged | ||
| contributions=], given |reportingOrigin|, |api|, |pendingContributions|, | ||
| |preSpecifiedParams|, |timeout| and |currentWallTime|. | ||
| 1. If [=named budget ledgers map=][|batchingScope|] [=map/exists=]: |
There was a problem hiding this comment.
OK, so I still have this in place, but "query the budget" synthesizes ledgers on demand when reserveNamedBudgets() was never called. I'm open to moving that logic or encapsulating it in a helper algorithm, like "get or create named budget ledgers".
alexmturner
left a comment
There was a problem hiding this comment.
Thanks, Dan! Did a pass :)
| 1. Let |ledgers| be a new [=list=] of [=named budget ledgers=]. | ||
| 1. [=list/iterate|For each=] |budgetingScope| in [=all budgeting scopes=]: | ||
| 1. Let |maxBudget| be |budgetingScope|'s [=budgeting scope/max budget=]. | ||
| 1. Let |maxBudgetDouble| be a {{double}} value equal to |maxBudget|. |
There was a problem hiding this comment.
nit: do we need this? (When we multiply fraction, we'll be using a double anyway)
| |maxBudgetDouble|. | ||
| 1. Let |allocation| be an integer equal to the floor of | ||
| |allocationDouble|. | ||
| 1. If |allocation| is not contained in [=the inclusive range|the range=] |
There was a problem hiding this comment.
This seems to duplicate the restrictions on |fraction|
| 1. If |allocation| is greater than |remainingBudget|, | ||
| [=exception/throw=] a "{{DataError}}" {{DOMException}}. | ||
| 1. Set |remainingBudget| to |remainingBudget| - |allocation|. | ||
| 1. [=map/Set=] |ledger|[|name|] to |allocation|. |
There was a problem hiding this comment.
I think I missed a level of nesting here. But still, I think that (if we're requiring budgets of the form 2^k, k <= 52) we should be able to get away with just one constant here.
I guess it depends a little on how we expect to expand our spec budget handling in the future.
| : <dfn export>embedding API supports named budgets</dfn> (default false) | ||
| :: A [=boolean=]. Embedding APIs that set this to true should define a | ||
| "<code>namedBudget</code>" field in their custom contribution dictionary | ||
| that accepts nullable {{DOMString}} values with a default of null. |
There was a problem hiding this comment.
I'm not sure I understand the need for this boolean -- aren't we planning on all contexts supporting the feature?
On the general point about overloads, I think the current approach is good (at least for now). It also aligns with how we handle filteringIds. If we wanted to restrict the contribution dictionary definition, I think it's probably easier to use one of the options in WICG/turtledove#1412 (comment) to restrict some of the fields' types without restricting others. (In that case, we would still need to delegate validation.)
| Issue: Currently, this spec only references [=budgeting scope/max budget=]. We | ||
| need to decide whether deleting the unused fields will improve clarity. | ||
|
|
||
| <h3 dfn-type=dfn>Named budget ledger</h3> |
There was a problem hiding this comment.
Ack -- I think my one concern is just that the description "A named budget ledger tracks the remaining budget per named budget" doesn't quite address that the ledger entry starts unadjusted for previous usage, and that we also need to limit based on overall usage (not just per-name). But if we clarify that, then I think counting 'down' is fine. (Another naming option: named budget allocation?)
| [=budgeting scope/max budget=]. | ||
| 1. [=list/Append=] |remainingBudgetLedger| to |remainingBudgetLedgers|. | ||
| 1. Let |numScopes| be the [=list/size=] of [=all budgeting scopes=]. | ||
| 1. [=Assert=] that |numScopes| is equal to the [=list/size=] of |
There was a problem hiding this comment.
Wondering if the ledgers should be a map or something if we go the route of actually enumerating the two 'scopes' (I.e. to avoid the parallel lists).
If we wanted to simplify this logic, we could also consider just computing allocations based off the 2^16 10-min limit and scaling it up 16x for the daily limit. We could also consider having the implementation-defined algorithm below just return a "budget left" for each name, taking into account both scopes. But it's up to you how much detail you want to put in this PR.
| 1. Let |duration| be |budgetingScope|'s [=budgeting scope/duration=]. | ||
| 1. Let |remainingBudgetLedger| be |remainingBudgetLedgers|[|i|]. | ||
| 1. [=map/For each=] |name| → |maxBudget| of |remainingBudgetLedger|: | ||
| 1. Let |budgetConsumed| be an integer determined by an |
There was a problem hiding this comment.
We also need to consider the case where |budgetConsumed| is small for a particular named budget, but usage of other names limits this one more than its current reservation. (This could happen if reservations aren't always consistent between calls.)
For this, I think we need an overall previous consumption or an overall budget left.
|
|
||
| 1. Let |resultForEachContribution| be a new [=list=] of [=booleans=]. | ||
| 1. Let |approvedValueSum| be 0. | ||
| 1. Let |ledgerConsumed| be a new [=named budget ledger=]. |
There was a problem hiding this comment.
This conflicts a bit with the named budget ledger definition above as "track[ing] the remaining budget". We might want to adjust the definition (or just maintain a map here).
| are: | ||
|
|
||
| 1. Let |scopingDetails| be [=this=]'s [=PrivateAggregation/scoping details=]. | ||
| 1. Let |namedBudgetExists| be the result of [=checking that a contribution's |
There was a problem hiding this comment.
I was thinking we could potentially do the validation before calling the delegation function, but I don't think that would align with our likely implementation. So, yeah probably doesn't affect it.
Preview | Diff