frs_habitat_classify path-1 should accept known-spawning as a rearing trigger

## Problem

`frs_habitat_classify` path-1 (rearing-on-spawning) only triggers when `h.spawning IS TRUE` (modelled, rule-based). bcfishpass's `habitat_linear_<sp>` path-1 is broader:

```sql
WHERE (h.spawning IS TRUE or coalesce(hk.spawning_st, 0) = 1)
```

`hk.spawning_st` comes from `bcfishpass.streams_habitat_known`, populated from `user_habitat_classification.csv`. So bcfp's "rule-based" `habitat_linear_<sp>` already includes operator-known spawning as a path-1 trigger.
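
The difference between the two triggers can be sketched as a pair of predicates. This is a minimal illustration in Python, not code from fresh or bcfishpass; the function names `fresh_path1_trigger` and `bcfp_path1_trigger` are hypothetical, and `None` stands in for SQL `NULL`.

```python
from typing import Optional


def fresh_path1_trigger(modelled_spawning: Optional[bool]) -> bool:
    """Hypothetical mirror of fresh's current test: h.spawning IS TRUE."""
    return modelled_spawning is True


def bcfp_path1_trigger(modelled_spawning: Optional[bool],
                       known_spawning_st: Optional[int]) -> bool:
    """Hypothetical mirror of bcfp's test:
    h.spawning IS TRUE OR coalesce(hk.spawning_st, 0) = 1.
    """
    # `or 0` plays the role of coalesce(..., 0) for a missing hk row.
    return modelled_spawning is True or (known_spawning_st or 0) == 1
```

For a segment with no modelled spawning but `spawning_st = 1`, only the second predicate returns `True`, which is the case the rest of this issue is about.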

Link's `apply_habitat_overlay: no` config is meant to match bcfp's rule-based output. But because fresh's classify doesn't read `user_habitat_classification.csv`, link misses the rearing credits that bcfp produces through the known-spawning trigger.

## Three logical modes (only two exist in fresh)

| Mode | Modelled rule | Known-spawning triggers rearing | Final overlay |
|---|---|---|---|
| bcfp `habitat_linear_<sp>` | yes | **yes** | no |
| bcfp `streams_habitat_linear` | yes | yes | yes |
| fresh `apply_habitat_overlay: no` | yes | **no** | no |
| fresh `apply_habitat_overlay: yes` | yes | no (only post-overlay) | yes |

bcfp's rule-based table (`habitat_linear_<sp>`) corresponds to a third mode, one that fresh doesn't currently produce.

## Concrete case

link MORR ST gap, 2026-04-30. bcfp credits ~60 km of ST rearing across the top 10 streams alone via the hk-trigger path:

| blue_line_key | n rearing segs (with hk.spawning_st = 1) | rearing km |
|---|---|---|
| 360885316 (Morice River) | 179 | 35.24 |
| 360885021 (Gosnell Creek) | 59 | 12.14 |
| 360819468 | 24 | 5.44 |
| 360837468 | 16 | 3.71 |
| ... | | |

bcfishpass.streams_habitat_known on MORR has 353 ST segments with `spawning_st = 1` (from 31 distinct rows in user_habitat_classification.csv expanded across DRM ranges). Link's classify path-1 doesn't see them, so the rearing-on-spawning credit on those streams doesn't fire.

## Proposed solution — reuse `frs_habitat_overlay`, just call it earlier

`frs_habitat_overlay` already does exactly the OR-additive operation we need:

- its source-table shape `(blue_line_key, drm, urm, species_code, spawning, ...)` matches `user_habitat_classification.csv`;
- bridge mode does a 3-way range join into `streams_habitat`;
- updates are `FALSE → TRUE` only, never reversed.

Today's pipeline order:

```
classify → cluster → connected_waterbody → overlay (apply_habitat_overlay=yes only)
```

Today's `apply_habitat_overlay=yes` mode mutates `streams_habitat.spawning` AFTER cluster + connected_waterbody have already run. By the time those phases query `WHERE h.spawning IS TRUE`, they only see modelled spawning — which is why fresh's `apply_habitat_overlay=yes` mode produces a different output than bcfp's `habitat_linear_<sp>` despite reaching the same final overlay state.

**Fix**: shift the overlay call to run BEFORE cluster + connected_waterbody when the user wants bcfp-style hk-trigger semantics. Once `streams_habitat.spawning` carries known-spawning at that point, every downstream phase reads it naturally — `cluster`'s `label_connect IS TRUE` checks, `.frs_connected_waterbody` Phase 2's `WHERE hs.spawning IS TRUE`, classify's path-1 rearing-on-spawning. No code changes inside any of those phases.
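
A toy model of why the ordering matters, assuming only what is described above: cluster filters on `spawning IS TRUE`, and the overlay ORs known spawning into that flag. The function `run_pipeline` and its arguments are hypothetical stand-ins, not the fresh API, and the cluster phase is reduced to "which segments does it see as spawning".

```python
def run_pipeline(modelled, known, overlay_first):
    """Return (segments cluster sees as spawning, final spawning segments).

    modelled: dict of segment id -> modelled spawning flag (classify output)
    known:    set of segment ids with operator-known spawning
    """
    spawning = dict(modelled)

    def overlay(flags):
        # OR-additive: FALSE -> TRUE only, never the reverse.
        return {seg: flag or seg in known for seg, flag in flags.items()}

    if overlay_first:
        spawning = overlay(spawning)
    # Stand-in for cluster / connected_waterbody: WHERE h.spawning IS TRUE
    seen_by_cluster = {seg for seg, flag in spawning.items() if flag}
    if not overlay_first:
        spawning = overlay(spawning)
    final = {seg for seg, flag in spawning.items() if flag}
    return seen_by_cluster, final
```

With `modelled = {"a": True, "b": False}` and `known = {"b"}`, both orderings end with `a` and `b` flagged as spawning, but cluster sees `b` only when the overlay runs first.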

### One real gap in `frs_habitat_overlay`

bcfp's hk-trigger uses **range overlap**, not strict containment:

```sql
-- frs_habitat_overlay's current bridge predicate (CONTAINMENT)
s.downstream_route_measure >= k.downstream_route_measure
AND s.upstream_route_measure   <= k.upstream_route_measure

-- bcfp hk-trigger semantics (OVERLAP)
s.upstream_route_measure   >= k.downstream_route_measure
AND s.downstream_route_measure <= k.upstream_route_measure
```

Add a `range_mode = c("contain", "overlap")` arg to `frs_habitat_overlay`. Default `"contain"` preserves today's `apply_habitat_overlay=yes` behaviour exactly. New `"overlap"` mode matches bcfp.
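
The two predicates can be compared on a small fixture. The sketch below runs the SQL above against an in-memory SQLite database from Python; the table names `segs` and `known`, the `id` column, and the function `matched_segments` are made up for illustration, and the real bridge join (species, 3-way range join) is simplified to a single `blue_line_key` match.

```python
import sqlite3

# The two predicates quoted above, keyed by the proposed range_mode values.
PREDICATES = {
    "contain": """s.downstream_route_measure >= k.downstream_route_measure
              AND s.upstream_route_measure   <= k.upstream_route_measure""",
    "overlap": """s.upstream_route_measure   >= k.downstream_route_measure
              AND s.downstream_route_measure <= k.upstream_route_measure""",
}


def matched_segments(range_mode="contain"):
    """Return ids of fixture segments matched to the known range."""
    if range_mode not in PREDICATES:
        raise ValueError(f"unknown range_mode: {range_mode!r}")
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE segs (id TEXT, blue_line_key INTEGER,
                           downstream_route_measure REAL,
                           upstream_route_measure REAL);
        CREATE TABLE known (blue_line_key INTEGER,
                            downstream_route_measure REAL,
                            upstream_route_measure REAL);
        -- known spawning covers 50..300 on stream 1
        INSERT INTO known VALUES (1, 50, 300);
        INSERT INTO segs VALUES ('inside',    1, 100, 200),
                                ('straddles', 1, 250, 350),
                                ('outside',   1, 400, 500);
    """)
    sql = f"""
        SELECT s.id FROM segs s
        JOIN known k ON s.blue_line_key = k.blue_line_key
         AND {PREDICATES[range_mode]}
        ORDER BY s.id
    """
    rows = [row[0] for row in con.execute(sql)]
    con.close()
    return rows
```

On this fixture `"contain"` matches only the `inside` segment, while `"overlap"` also picks up `straddles`, which extends past the upstream end of the known range. Any segment matched by containment is also matched by overlap, so the overlap set is always a superset.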

### Orchestration — when to apply

Add a `known_habitat_when = c("post-cluster", "post-classify", "both")` option to `frs_habitat` (defaults to `"post-cluster"` = today's behaviour). When `"post-classify"` or `"both"`, `frs_habitat` calls `frs_habitat_overlay(known_habitat, range_mode = "overlap")` between classify and cluster. With `"both"`, it calls overlay twice — once post-classify, once post-cluster — and idempotent OR makes that safe (TRUE → TRUE is a no-op).

`apply_habitat_overlay = no` config keeps `known_habitat_when = "post-cluster"` and skips the overlay entirely (status quo).
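
The scheduling this implies can be sketched as a function that returns the ordered phases for each `known_habitat_when` value. It assumes the overlay is enabled at all; `plan_steps` and the step labels are hypothetical. The issue does not say which `range_mode` the post-cluster call uses, so that step is left unlabelled here.

```python
VALID_WHEN = ("post-cluster", "post-classify", "both")


def plan_steps(known_habitat_when="post-cluster"):
    """Return the ordered phase names for one known_habitat_when value."""
    if known_habitat_when not in VALID_WHEN:
        raise ValueError(f"known_habitat_when must be one of {VALID_WHEN}")
    steps = ["classify"]
    if known_habitat_when in ("post-classify", "both"):
        # New early call, between classify and cluster.
        steps.append("overlay[overlap]")
    steps += ["cluster", "connected_waterbody"]
    if known_habitat_when in ("post-cluster", "both"):
        # Today's call position; range mode not specified in this issue.
        steps.append("overlay")
    return steps
```

The default reproduces today's order, `"post-classify"` moves the single overlay call ahead of cluster, and `"both"` schedules two calls.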

## Implementation surface

| Change | File | Approx lines |
|---|---|---|
| `range_mode` arg + the alternative SQL predicate | `R/frs_habitat_overlay.R` | ~10–15 |
| `known_habitat_when` option threaded through `frs_habitat` orchestrator (with conditional pre-cluster overlay call) | `R/frs_habitat.R` (around line 1153 `.frs_run_connectivity`) | ~30 |
| Tests | `tests/testthat/test-frs_habitat_overlay.R` (new range-mode cases) + `tests/testthat/test-frs_habitat.R` (timing-mode cases) | ~50–80 |

**Zero changes** to `frs_habitat_classify`, `frs_cluster`, `.frs_connected_waterbody`. They keep reading `h.spawning IS TRUE` as before; the OR-in is upstream of them when timing is `"post-classify"` or `"both"`.

## Safety property

`frs_habitat_overlay` already mutates `streams_habitat.spawning` (in `apply_habitat_overlay=yes` mode). The proposed change shifts WHEN that mutation fires, not WHETHER. With default `known_habitat_when = "post-cluster"`, behaviour is bit-identical to today. With `"post-classify"`, the mutation happens earlier in the pipeline, giving cluster + connected_waterbody the augmented spawning set. For a given `range_mode`, the eventual `streams_habitat.spawning` content is the same; it is just visible to those phases.

OR-additive throughout: rows can flip FALSE → TRUE, never reverse. No risk of shrinking output.
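
The two properties relied on here, monotonic (never `TRUE → FALSE`) and idempotent (a second application is a no-op), can be checked on a dictionary model of the spawning column. `or_overlay` is a hypothetical stand-in for the update that `frs_habitat_overlay` performs, not its implementation.

```python
def or_overlay(spawning, known):
    """OR known-spawning segment ids into a segment -> flag mapping."""
    return {seg: flag or seg in known for seg, flag in spawning.items()}


def is_monotonic(before, after):
    """True if no segment flagged in `before` lost its flag in `after`."""
    return all(after[seg] for seg, flag in before.items() if flag)
```

Applying `or_overlay` twice with the same `known` set gives the same mapping as applying it once, which is the reason the `"both"` timing mode is safe.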

## Test plan (fresh-side, self-contained)

| Test | Bar |
|---|---|
| All existing tests pass with default `known_habitat_when = "post-cluster"` | Bit-identical fresh suite output |
| `range_mode = "overlap"` produces the larger expected row set on a synthetic fixture (segment range partially intersects known range) | Unit test in `test-frs_habitat_overlay.R` |
| `range_mode = "contain"` (default) produces existing behaviour byte-identical | Unit test |
| `known_habitat_when = "post-classify"` on a synthetic fixture: cluster sees augmented spawning → segments that would have been stripped get preserved | Integration test in `test-frs_habitat.R` |
| `known_habitat_when = "both"` is idempotent (output identical to `"post-classify"` on the same input) | Integration test |
| Defensive: empty `known_habitat`, missing species column, table doesn't exist | Standard error-path tests |

These prove the safety property + the range-mode SQL + the timing semantics without needing the live bcfp tunnel.

## Parity verification (link-side, follow-up)

The bcfp parity claim itself — "MORR ST credits ~60 km of rule-based rearing matching `bcfishpass.habitat_linear_st` via the hk-trigger path" — requires link's `compare_bcfishpass_wsg.R` + the live tunnel. Tracked separately in [link#132](https://github.com/NewGraphEnvironment/link/issues/132). After fresh ships the timing arg, link's `lnk_pipeline_classify` exposes `known_habitat_when = "post-classify"` to bundles that want bcfp parity (and `"post-cluster"` for bundles that want today's apply_habitat_overlay semantics). MORR ST runs through the compare apparatus to confirm.

## Reproduction

Tunnel DB: `bcfishpass` on `localhost:63333` (db_newgraph, rebuilt Mondays). WSG: MORR. Species: ST. `bcfishpass.streams_habitat_known` provides the trigger source. Link's bcfishpass-bundle ships `inst/extdata/configs/bcfishpass/overrides/user_habitat_classification.csv`, which holds the same data but is not consumed by classify until link's wiring lands in #132.

## Related

- [fresh#186](https://github.com/NewGraphEnvironment/fresh/issues/186) (closed in #188 / v0.25.0) — `frs_cluster` phase-1 + confluence-boost
- [fresh#187](https://github.com/NewGraphEnvironment/fresh/issues/187) (closed in #188 / v0.25.0) — `frs_trace_downstream` averaged-FWA gradient
- [fresh#147](https://github.com/NewGraphEnvironment/fresh/issues/147) (closed) — original `.frs_connected_waterbody` for SK lake-proximity

This is the third mechanism in the same parity-investigation slice. fresh#186/#187 fixed link's over-credits; this fixes the under-credit. Together they should close most of the remaining MORR ST / BABL ST / MORR CO gap.

