Fix INDEPENDENT mechanism crash on duplicate cliques #4
Open
gghatano wants to merge 1 commit into
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
`independent.run_mechanism` builds `measurements` from `initial_measurements`
(the one-way marginals that `data_generation_v2.generate` always passes in)
and then appends a freshly measured one-way marginal for every attribute. When
`initial_potentials` is not None (e.g. the empty CliqueVector returned by
`constraints.get_initial_parameters` for the no-constraints case), the code
calls `potentials.expand([m.clique for m in measurements])` with a clique list
that now contains duplicate one-way cliques.
With current `mbi`, `CliqueVector.expand` requires unique cliques and raises
`ValueError: Cliques must be unique.`, so
`dpsynth.generate(..., discrete_config=IndependentConfig())` fails. De-dup
the clique list (order-preserving) before expanding. The measurements themselves
are left unchanged, so estimation and accounting are unaffected.
fc87d86 to 1755e90
ryan112358 requested changes on Jun 11, 2026
    potentials = initial_potentials
    if potentials is not None:
        potentials = potentials.expand([m.clique for m in measurements])
        # `measurements` can contain the same clique more than once (the one-way
Collaborator
Thanks for the bug fix, but can you remove this inline comment or reduce it to one line?
Collaborator
Can you also add test coverage for this case?
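One possible shape for such a regression test is sketched below. It is self-contained and does not call `mbi` or `dpsynth`: `StubPotentials` and `expand_deduplicated` are hypothetical stand-ins that mimic only the uniqueness check described in this PR, so the real test would need to exercise `independent.run_mechanism` instead.

```python
import unittest


class StubPotentials:
    """Hypothetical stand-in: rejects duplicate cliques, as described in this PR."""

    def __init__(self, cliques=()):
        self.cliques = list(cliques)

    def expand(self, cliques):
        cliques = list(cliques)
        if len(set(cliques)) != len(cliques):
            raise ValueError("Cliques must be unique.")
        return StubPotentials(cliques)


def expand_deduplicated(potentials, cliques):
    """Hypothetical helper: drop repeated cliques (first-seen order), then expand."""
    return potentials.expand(list(dict.fromkeys(cliques)))


class DuplicateCliqueTest(unittest.TestCase):
    # Each one-way clique appears twice, mirroring the failure described in the PR.
    CLIQUES = [("a",), ("b",), ("a",), ("b",)]

    def test_raw_duplicates_raise(self):
        with self.assertRaises(ValueError):
            StubPotentials().expand(self.CLIQUES)

    def test_deduplicated_expand_succeeds(self):
        result = expand_deduplicated(StubPotentials(), self.CLIQUES)
        self.assertEqual(result.cliques, [("a",), ("b",)])
```

Saved as a module, this can be run with `python -m unittest`. The first case pins down the failure mode and the second checks that de-duplicating first avoids it.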
Fixes the `ValueError: Cliques must be unique.` raised when running `dpsynth.generate(..., discrete_config=IndependentConfig())` (see issue: "IndependentConfig synthesis raises 'Cliques must be unique.'").
Cause
`independent.run_mechanism` builds `measurements` from `initial_measurements` (the one-way marginals that `data_generation_v2.generate` always provides) and then appends a freshly measured one-way marginal for every attribute, so the list contains each one-way clique twice. When `initial_potentials` is not `None` (e.g. the empty `CliqueVector` from `constraints.get_initial_parameters` in the no-constraints case), `potentials.expand([m.clique for m in measurements])` is called with duplicate cliques, which current `mbi` rejects.
Change
De-duplicate the clique list (order-preserving) before `expand`. `measurements` itself is unchanged, so the marginals fed to `mirror_descent` and the privacy accounting are unaffected; only the clique set used to expand the (zero) potentials is de-duplicated.
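For illustration, the order-preserving de-duplication can be sketched as follows. This is a minimal, self-contained sketch rather than the actual diff: `Measurement` is a hypothetical stand-in for the real measurement type, and `dict.fromkeys` is one common way to drop repeats while keeping first-seen order (it assumes cliques are hashable, e.g. tuples of attribute names).

```python
from collections import namedtuple

# Hypothetical stand-in for a measurement record; only `clique` matters here.
Measurement = namedtuple("Measurement", ["clique", "values"])


def unique_cliques(measurements):
    """Return each measurement's clique once, in first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(m.clique for m in measurements))


# Each one-way clique appears twice: once from the initial measurements
# and once from the freshly measured marginals.
measurements = [
    Measurement(("a",), [3, 1]),
    Measurement(("b",), [2, 2]),
    Measurement(("a",), [4, 0]),
    Measurement(("b",), [1, 3]),
]

cliques = unique_cliques(measurements)
print(cliques)            # [('a',), ('b',)]
print(len(measurements))  # 4: the measurement list itself is untouched
```

Only the list passed to `expand` changes. All four measurements remain available to the estimation step.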
Verification
The repro in the issue (a 2-column categorical frame with `IndependentConfig()`) now returns a synthetic `DataFrame` instead of raising. MST/AIM are unaffected.
Fixes #2