I want to propose a new privacy mechanism to IPA that:
- Helps solve the optimization use case
- Does so under the existing differential privacy constraints of IPA as I understand them
This mechanism is inspired by the following research papers:
This research operates in the “label DP” setting, where model features are publicly known, but labels are not. This matches the setting of IPA well, because the report collector has full information about the raw impression event (and thus the corresponding feature vector), but the label (e.g. whether the impression led to a conversion) is protected with differential privacy. While these techniques are in the “local” differential privacy regime, they (somewhat surprisingly) perform close to the state of the art in private model training, depending on the task.
Here is an outline of how we could implement one of the algorithms described there, RROnBins, in the IPA setting:
- For every per-source breakdown key, pass along a list of possible output bins. For example, a bucketized range of values like
{[0, 10], [11, 100], [101+]}
- After aggregation*, rather than applying a fixed noise distribution like Gaussian to the sum, perform k-ary Randomized Response on the specified k output bins for that breakdown key. That is, with probability p = k / (k - 1 + e^epsilon), pick a bin uniformly at random; otherwise, report the bin the aggregate actually falls in. This mechanism satisfies epsilon differential privacy (see the sketch after this list).
- Note: the mechanism can also be extended to do randomized response over a restricted set of trigger-side breakdown keys if those become supported by IPA.
*Note: for practical purposes, we would likely need to support a unique breakdown key per source to take advantage of this research, i.e., “aggregate” over only single sources. While IPA currently only mentions “aggregate” queries, as far as I can tell nothing in the existing design prevents aggregating over single events (i.e., a unique breakdown key per source), as long as the output is protected by differential privacy. The mechanism described above provides exactly that protection.
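To make the randomized response step concrete, here is a minimal Python sketch. The `rr_on_bins` name, its arguments, and the lower-edge encoding of the bins are illustrative assumptions, not part of any existing IPA or ARA API:

```python
import math
import random

def rr_on_bins(aggregate_value, bin_lower_edges, epsilon):
    """k-ary randomized response over k value bins (illustrative sketch)."""
    k = len(bin_lower_edges)
    # Map the aggregate to its bin: lower edges [0, 11, 101] encode the
    # example buckets [0, 10], [11, 100], [101+] from above.
    true_bin = max(i for i, lo in enumerate(bin_lower_edges) if aggregate_value >= lo)

    # With probability p = k / (k - 1 + e^epsilon), output a uniformly random
    # bin; otherwise output the true bin. Then
    #   P[output = true bin] / P[output = any other bin] = e^epsilon,
    # so the released bin index satisfies epsilon differential privacy.
    p = k / (k - 1 + math.exp(epsilon))
    if random.random() < p:
        return random.randrange(k)
    return true_bin

# Example with the bins above and epsilon = 1:
noisy_bin = rr_on_bins(aggregate_value=42, bin_lower_edges=[0, 11, 101], epsilon=1.0)
```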
For an overview of how we’re thinking about these capabilities in ARA, see the flexible event-level explainer we recently published.