Distance metrics compare 1-D marginals only: cross-world coupling (PN/PS) is invisible

_A complete, tested branch implementing the proposal below is ready at https://github.com/fabio-rovai/causal-perception-implementation/tree/pn-ps-identification-bounds (15 passing tests, additive and opt-in). I can open it as a PR whenever you prefer._

# distances.py compares marginals, so cross-world counterfactual coupling is invisible: add PN/PS + Fréchet bounds (opt-in)?

Hi, and thanks for open-sourcing this. I have been reading through the causal
perception implementation and I think I have spotted a subtle but important
identification gap, and I would like to check whether you would welcome a small
opt-in PR before I send one.

## What I think is happening

`distances.py` (W2, KL, TV) takes two 1D sample arrays, and
`perception.run_perception` feeds it the per-individual outcome-probability
vectors as 1D marginals. The cross-world joint P(Y_0, Y_1) of the binary outcome
is never formed. On top of that, `LinearANM.abduct` explicitly does no noise
abduction for Y (the comment says the counterfactual probability is computed from
the classifier on counterfactual parents). So the cross-world coupling of the
binary outcome is not pinned down by anything in the pipeline, and any two SCMs
that share the two interventional marginals but differ in how they couple the two
worlds will look identical to W2/KL/TV.
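
To make the missing degree of freedom concrete, here is the cross-world joint written out, using the notation from the witness below ($R_0 = P(Y_0 = 1)$, $R_1 = P(Y_1 = 1)$, $p_{11} = P(Y_0 = 1, Y_1 = 1)$). This is only a restatement of the standard 2x2 parametrisation, not something taken from the repository:

```math
\begin{aligned}
P(Y_0 = 1, Y_1 = 1) &= p_{11} \\
P(Y_0 = 1, Y_1 = 0) &= R_0 - p_{11} \\
P(Y_0 = 0, Y_1 = 1) &= R_1 - p_{11} \\
P(Y_0 = 0, Y_1 = 0) &= 1 - R_0 - R_1 + p_{11}
\end{aligned}
```

Fixing $R_0$ and $R_1$ leaves $p_{11}$ free, and that single parameter is exactly what a marginal-only comparison never sees.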

## A small witness

I ran a quick check with two binary potential-outcome models that share their
marginals exactly (R0 = 0.5, R1 = 0.7):

- monotone coupling, p11 = P(Y_0=1, Y_1=1) = 0.50 -> PN = P(Y_0=0 | Y_1=1) = 0.286
- independent outcomes, p11 = 0.35 -> PN = 0.500

Comparing the two models with `compute_all_distances` on the 1D marginals reads
~0 for W2, KL and TV, because their marginals are identical, yet the probability
of necessity separates them by about 0.214 (0.500 vs 0.286). The marginal
distances are blind to exactly that.
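
The witness can be reproduced in a few lines of plain Python. This is a minimal sketch, not the code on the branch: the helper names (`joint_from_p11`, `marginals`, `prob_necessity`, `tv_bernoulli`) are hypothetical, and `tv_bernoulli` is a stand-in for a marginal distance rather than a call into `compute_all_distances`.

```python
# Two couplings of binary potential outcomes that share both marginals.
R0, R1 = 0.5, 0.7  # P(Y_0 = 1) and P(Y_1 = 1), as in the witness


def joint_from_p11(r0, r1, p11):
    """Build the 2x2 joint P(Y_0 = a, Y_1 = b), keyed by (a, b)."""
    return {
        (1, 1): p11,
        (1, 0): r0 - p11,
        (0, 1): r1 - p11,
        (0, 0): 1.0 - r0 - r1 + p11,
    }


def marginals(joint):
    """Recover (P(Y_0 = 1), P(Y_1 = 1)) from the joint."""
    return (joint[(1, 1)] + joint[(1, 0)], joint[(1, 1)] + joint[(0, 1)])


def prob_necessity(joint):
    """PN = P(Y_0 = 0 | Y_1 = 1)."""
    return joint[(0, 1)] / (joint[(0, 1)] + joint[(1, 1)])


def tv_bernoulli(p, q):
    """Total variation distance between Bernoulli(p) and Bernoulli(q)."""
    return abs(p - q)


monotone = joint_from_p11(R0, R1, p11=min(R0, R1))  # p11 = 0.50
independent = joint_from_p11(R0, R1, p11=R0 * R1)   # p11 = 0.35

# Marginal-level comparison of the two models: both distances are ~0.
tv_y0 = tv_bernoulli(marginals(monotone)[0], marginals(independent)[0])
tv_y1 = tv_bernoulli(marginals(monotone)[1], marginals(independent)[1])

# Cross-world comparison: PN differs.
pn_monotone = prob_necessity(monotone)
pn_independent = prob_necessity(independent)

print(f"TV on Y_0 marginal: {tv_y0:.3f}, TV on Y_1 marginal: {tv_y1:.3f}")
print(f"PN monotone: {pn_monotone:.3f}, PN independent: {pn_independent:.3f}")
print(f"PN gap: {pn_independent - pn_monotone:.3f}")
```

The two marginal distances come out at zero up to floating-point error, while the PN values match the 0.286 and 0.500 quoted above.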

This is not a bug in the distances, it is an identification fact: with only the
two marginals and no abducted outcome noise, P(Y_0, Y_1) is only Fréchet-bounded.
In the fair-credit framing this matters, because a point counterfactual on a
protected attribute quietly hides an interval.
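
For reference, the interval can be written out explicitly. With $R_0 = P(Y_0 = 1)$, $R_1 = P(Y_1 = 1)$ and $p_{11} = P(Y_0 = 1, Y_1 = 1)$, the Fréchet bounds on the joint and the resulting bounds on PN (assuming $R_1 > 0$ and that the two marginals are the only information available) are:

```math
\begin{aligned}
\max(0,\, R_0 + R_1 - 1) \;\le\; p_{11} \;&\le\; \min(R_0,\, R_1), \\
\mathrm{PN} = P(Y_0 = 0 \mid Y_1 = 1) \;&=\; \frac{R_1 - p_{11}}{R_1}, \\
\frac{\max(0,\, R_1 - R_0)}{R_1} \;\le\; \mathrm{PN} \;&\le\; \frac{\min(R_1,\, 1 - R_0)}{R_1}.
\end{aligned}
```

For the witness ($R_0 = 0.5$, $R_1 = 0.7$) this gives $0.2 \le p_{11} \le 0.5$ and $0.286 \le \mathrm{PN} \le 0.714$, so the monotone coupling sits at the lower end of the PN interval and the independent coupling (PN = 0.500) sits inside it.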

## Proposal

Would you welcome a small, additive, opt-in PR that:

- reports PN and PS alongside their sharp Fréchet identification bounds from the
  two marginals (with the assumption stated explicitly in the docstrings),
- names two reference couplings as point estimates within the bounds (monotone,
  which attains the lower PN bound, and independent, which generally lies in the
  interior), and
- adds a `run_*` script and tests, including the witness above,

with zero change to any existing module, output or default? I would keep it
entirely separate from the current distance pipeline so nothing you rely on
moves.
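
To give a feel for the surface area, here is a rough sketch of what the opt-in helper could look like. It is illustrative only and not the code on the branch: `Interval`, `pn_ps_bounds` and `pn_ps_at` are hypothetical names, and PS is taken to be the usual P(Y_1 = 1 | Y_0 = 0), which is an assumption since the issue text above only spells out PN.

```python
from typing import Dict, NamedTuple


class Interval(NamedTuple):
    lower: float
    upper: float


def pn_ps_bounds(r0: float, r1: float) -> Dict[str, Interval]:
    """Fréchet bounds on PN and PS from the two marginals alone.

    Assumptions (stated, not checked against data):
      r0 = P(Y_0 = 1), r1 = P(Y_1 = 1) are the interventional marginals;
      PN = P(Y_0 = 0 | Y_1 = 1), PS = P(Y_1 = 1 | Y_0 = 0).
    """
    if not (0.0 < r1 <= 1.0):
        raise ValueError("PN needs 0 < r1 <= 1")
    if not (0.0 <= r0 < 1.0):
        raise ValueError("PS needs 0 <= r0 < 1")
    # Bounds on q = P(Y_0 = 0, Y_1 = 1), the shared numerator of PN and PS.
    q_lo = max(0.0, r1 - r0)
    q_hi = min(r1, 1.0 - r0)
    return {
        "PN": Interval(q_lo / r1, q_hi / r1),
        "PS": Interval(q_lo / (1.0 - r0), q_hi / (1.0 - r0)),
    }


def pn_ps_at(r0: float, r1: float, p11: float) -> Dict[str, float]:
    """Point values of PN and PS for one coupling p11 = P(Y_0 = 1, Y_1 = 1)."""
    q = r1 - p11
    return {"PN": q / r1, "PS": q / (1.0 - r0)}


# Witness from this issue: R0 = 0.5, R1 = 0.7.
bounds = pn_ps_bounds(0.5, 0.7)
monotone = pn_ps_at(0.5, 0.7, p11=0.5)
independent = pn_ps_at(0.5, 0.7, p11=0.5 * 0.7)

for name in ("PN", "PS"):
    lo, hi = bounds[name]
    print(f"{name}: [{lo:.3f}, {hi:.3f}]  "
          f"monotone={monotone[name]:.3f}  independent={independent[name]:.3f}")
```

Returning an interval next to the named point values keeps the identification assumption visible to the caller, and because the helper only needs the two marginals it can sit beside the existing distance pipeline without touching it.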

Happy to sign the CLA. If this is useful I will open the PR; if you would rather
shape it differently first, I am glad to discuss here.


Distance metrics compare 1-D marginals only: cross-world coupling (PN/PS) is invisible #7
