
Separate the initialization distribution from the prior (PyBNF conflates objective prior and start-point sampling) #413

@wshlavacek


Problem

PyBNF uses one Prior object for two distinct roles (see the Prior glossary entry): the Bayesian / objective prior, whose density enters the posterior that samplers target, and the initial-sampling distribution that optimizers and samplers draw their start points from (sample_value, the latin-hypercube seeding). These are different concepts:

  • the objective prior regularizes the fit / is the Bayesian prior in the posterior;
  • the initialization distribution seeds the search and the MCMC chains.
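
To make the distinction concrete, here is a minimal sketch of the two roles living on one object. `NormalPrior`, `log_density`, `sample`, and `log_posterior` are hypothetical names chosen for illustration; they are not PyBNF's actual Prior API.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalPrior:
    """Hypothetical stand-in for a prior; not PyBNF's actual Prior class."""
    mean: float
    sd: float

    def log_density(self, x: float) -> float:
        # Role 1 (objective prior): a term in the log-posterior.
        z = (x - self.mean) / self.sd
        return -0.5 * z * z - math.log(self.sd * math.sqrt(2.0 * math.pi))

    def sample(self, rng: random.Random) -> float:
        # Role 2 (initialization): a draw used only as a start point.
        return rng.gauss(self.mean, self.sd)


def log_posterior(x, log_likelihood, prior):
    # Only role 1 appears in the target; where the chains start never changes it.
    return log_likelihood(x) + prior.log_density(x)
```

Because `sample` never appears in `log_posterior`, swapping the start-point distribution leaves the target unchanged, which is why the two roles can be separated without altering the fit itself.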

Standard practice — and PEtab — keeps them separate. PEtab v1 had distinct objectivePrior* and initializationPrior*; PEtab v2 dropped initializationPrior* and uses the single prior for the objective only, initializing from the bounds. Either way, the two roles are not the same object, and PyBNF conflating them has real consequences.

Consequences

  • PEtab import faithfulness (#407, the PEtab v2 problem importer: the 'two-adapter' proof, first step parameters table → FreeParameter/Prior). Importing a normal objective prior makes PyBNF initialize from that normal (start points concentrated near the mean), whereas PEtab/standard practice initializes uniformly within the bounds. The target posterior is identical, but the initialization differs, and arguably for the worse (see the next two items).
  • Convergence diagnostics. R-hat / ESS (ADR-0009, the Vehtari et al. 2021 conventions) assume over-dispersed initial points across chains to detect non-convergence. Initializing from a tight prior under-disperses the starts: chains that begin close together can agree with one another before they have explored the posterior, so R-hat can look acceptable without convergence. A deliberately over-dispersed (or bounds-uniform) initialization is better practice here.
  • Optimizers. Seeding a global optimizer from a concentrated prior can under-explore versus a bounds-spanning start.
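
The under-dispersion point can be illustrated with a small stdlib-only sketch that compares the spread of start points drawn from a tight normal against draws uniform over the bounds. The bounds, prior parameters, and chain count below are made-up values for illustration, and `start_spread` is a hypothetical helper, not part of PyBNF.

```python
import random
import statistics


def start_spread(draw, n_chains, seed):
    """Population standard deviation of n_chains start points."""
    rng = random.Random(seed)
    starts = [draw(rng) for _ in range(n_chains)]
    return statistics.pstdev(starts)


# Illustrative values only: wide bounds, tight normal prior.
lower, upper = 0.0, 10.0
prior_mean, prior_sd = 5.0, 0.1

# Starts drawn from the prior cluster near the mean.
from_prior = start_spread(lambda rng: rng.gauss(prior_mean, prior_sd), 200, seed=0)

# Starts drawn uniformly over the bounds span the search region.
from_bounds = start_spread(lambda rng: rng.uniform(lower, upper), 200, seed=0)

print(f"spread from prior:  {from_prior:.3f}")
print(f"spread from bounds: {from_bounds:.3f}")
```

With these values the bounds-uniform starts are far more dispersed than the prior-drawn starts, which is the property that multi-chain diagnostics and global optimizers rely on.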

Direction (for discussion / ADR)

  • Introduce an explicit Initialization Distribution concept, defaulting to today's behavior (draw from the prior) for backward compatibility, but overridable — e.g. uniform-over-bounds, or an over-dispersed variant of the prior.
  • Decide the config surface (global vs per-parameter) and how it threads through sample_value / the latin-hypercube seeding / the samplers' start-point generation.
  • For the importer, map PEtab's initialization semantics (bounds-uniform in v2) onto it, so an imported problem starts where PEtab intends.
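
One possible shape for this, offered as a discussion sketch rather than a proposed implementation: a per-parameter optional initialization sampler that falls back to the prior when unset. Every name here (`FreeParameterSketch`, `init_sample`, `draw_start`, `uniform_over_bounds`, `overdispersed_normal`) is hypothetical and does not reflect PyBNF's existing FreeParameter or Prior classes.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional

Sampler = Callable[[random.Random], float]


def uniform_over_bounds(lower: float, upper: float) -> Sampler:
    # Bounds-uniform initialization, as the issue describes for PEtab v2.
    return lambda rng: rng.uniform(lower, upper)


def overdispersed_normal(mean: float, sd: float, factor: float = 3.0) -> Sampler:
    # Same center as the prior, with the standard deviation inflated by `factor`.
    return lambda rng: rng.gauss(mean, factor * sd)


@dataclass
class FreeParameterSketch:
    """Hypothetical parameter record; not PyBNF's FreeParameter."""
    name: str
    lower: float
    upper: float
    prior_sample: Sampler
    init_sample: Optional[Sampler] = None  # None keeps today's behavior

    def draw_start(self, rng: random.Random) -> float:
        # Default: draw from the prior (backward compatible).
        draw = self.init_sample if self.init_sample is not None else self.prior_sample
        return draw(rng)


rng = random.Random(1)
k = FreeParameterSketch("k", 0.0, 10.0, prior_sample=lambda r: r.gauss(5.0, 0.1))

legacy_start = k.draw_start(rng)          # falls back to the prior

k.init_sample = uniform_over_bounds(k.lower, k.upper)
petab_style_start = k.draw_start(rng)     # uniform within [lower, upper]
```

This sketch takes the per-parameter option; a global setting could be layered on by assigning the same sampler factory to every parameter at config-load time. Whether prior-drawn starts should also be clipped to the bounds is left open here.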

Scope & priority
