From d16cbc3d80c4b8ef5d30a4151c07f9d0ae0bd692 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 07:58:26 -0700
Subject: [PATCH 1/9] Initialize PWF baseline for #206

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 planning/active/findings.md  | 101 +++++++++++++++++++++++++++++++++++
 planning/active/progress.md  |  11 ++++
 planning/active/task_plan.md |  47 ++++++++++++++++
 3 files changed, 159 insertions(+)
 create mode 100644 planning/active/findings.md
 create mode 100644 planning/active/progress.md
 create mode 100644 planning/active/task_plan.md

diff --git a/planning/active/findings.md b/planning/active/findings.md
new file mode 100644
index 0000000..86a3317
--- /dev/null
+++ b/planning/active/findings.md
@@ -0,0 +1,101 @@
+# Findings — frs_point_match: match two point datasets along FWA network within instream distance (#206)
+
+## Issue context
+
+## Problem
+
+fresh has primitives for snapping points to FWA streams (`frs_point_snap`) and for finding upstream/downstream features per segment (`frs_network_features`), but no primitive for matching two **point datasets** along the network within a distance threshold.
+
+Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp's `02_pscis_streams_150m.sql` at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c` (current tunnel state, `bcfishpass.log.model_run_id=121` rebuilt 2026-05-05) -- match PSCIS crossings to modelled crossings within 100m instream distance on the same stream, then keep the nearest PSCIS per modelled crossing. Currently link's `lnk_pipeline_crossings` is missing this layer, leaving modelled crossings duplicating PSCIS positions in the working schema; cascades into >+1000 false-positive anthropogenic barriers in BULK alone (see link's `research/bcfp_table_map.md`).
+
+The operation generalizes beyond PSCIS:
+
+- field-assessed crossings <-> user-added crossings deduplication
+- observations <-> habitat confirmation points
+- any "merge two point datasets on the same FWA network" workflow
+
+## Proposed primitive
+
+`frs_point_match(conn, table_a, table_b, table_to, distance_max, ...)`:
+
+- Both input tables must already be snapped to FWA (carry `blue_line_key`, `downstream_route_measure` -- typically via `frs_point_snap` upstream).
+- Output table has columns from `table_a` plus a `<table_b_id_col>` linking column populated where a match within `distance_max` exists.
+- Matching is on **same `blue_line_key`** + instream distance <= `distance_max` (computed from `downstream_route_measure` deltas).
+- Dedup: each `table_b` row links to at most one `table_a` row -- the closest one (`DISTINCT ON (table_b_id) ORDER BY distance ASC`).
+
+### Signature
+
+```r
+frs_point_match(
+  conn,
+  table_a,                              # schema-qualified, points to match FROM
+  table_b,                              # schema-qualified, points to match TO
+  table_to,                             # schema-qualified destination
+  distance_max,                         # max instream distance (metres)
+  table_a_id_col = "id",
+  table_b_id_col = "id"
+)
+```
+
+Returns `conn` invisibly. Side effect: drops + recreates `table_to` with table_a's columns plus the linking ID column.
+
+Network-position columns (`blue_line_key`, `downstream_route_measure`) are hard-coded to the FWA convention -- every FWA-snapped point table in the bcfp ecosystem uses these names. Per-side overrides (like `frs_network_features` got in fresh#204) can be added later if a real divergence appears.
+
+## Why fresh, not link
+
+This is a generic FWA-primitive operation. fresh already owns:
+
+- `frs_point_snap` -- point <-> stream snap
+- `frs_network_features` -- segment <-> feature dnstr/upstr
+
+`frs_point_match` rounds out the point-handling family -- point <-> point along the network. Adding it in fresh makes the primitive available to link, bcfishpass-comparable tooling, and any future packages working with point-on-network data. Name uses singular `point` to match `frs_point_snap` (the closest existing analog).
+
+## Acceptance
+
+- [ ] `frs_point_match(...)` produces output byte-identical to bcfp's `02_pscis_streams_150m.sql` output (at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c`) for a test WSG (after stream-name filtering, which is bcfp-specific and stays in link's caller, not the primitive)
+- [ ] Mocked tests covering the dedup logic (multiple table_a matching one table_b)
+- [ ] Live test on a small WSG (ADMS or similar) validating the byte-identical claim
+- [ ] Roxygen + lintr clean
+- [ ] Exported from NAMESPACE
+
+## Out of scope
+
+- Stream-name matching (bcfp does this for PSCIS-specific reasons via `gnis_name` <-> `stream_name` regex matching). Caller can layer that on top -- primitive provides just the network-distance match.
+- Multi-stream matching (matching across wscode_ltree subtrees) -- single `blue_line_key` is enough for the parity case.
+- Bidirectional dedup variants (matching pairs both ways) -- `DISTINCT ON (table_b_id)` is enough.
+- Per-side column-name overrides (`table_*_blk_col` etc.) -- add when a real divergence appears.
+
+## Exploration notes (from plan-mode 2026-05-11)
+
+### fresh codebase patterns (Explore agent report)
+
+- **`frs_network_features` is the template** (`R/frs_network_features.R`). v0.29.0+, has per-side wscode/localcode overrides (fresh#204 / commit `f42e86a`), uses `sprintf` SQL composition with `.frs_validate_identifier()` on every user-supplied identifier (security gate).
+- **Three-tier test pattern** (`tests/testthat/test-frs_network_features.R`):
+  - Tier 1 — validation tests, no DB, `expect_error()` per arg path (lines 6–124)
+  - Tier 2 — SQL composition tests via `withr::local_mocked_bindings()` mocking `frs_db_query`, capture SQL, `expect_match` key clauses (lines 127–374)
+  - Tier 3 — live DB integration, `skip_if_not(.frs_db_available())`
+- **No re-exports from other packages** — pure `@export` via roxygen.
+- **Current version** — `DESCRIPTION` line 3: `0.29.0`.
+- **Private helpers** in `R/utils.R`: `.frs_validate_identifier()`, `.frs_db_available()`, `.frs_snap_guards()`.
+
+### bcfp algorithm (Explore agent report, with verification)
+
+- bcfp source: `model/01_access/pscis/sql/02_pscis_streams_150m.sql`
+- Same-stream constraint: `e.blue_line_key = m.blue_line_key` (line 158)
+- Instream distance: `ABS(e.downstream_route_measure - m.downstream_route_measure) < 100` (line 159)
+- Dedup: two phases — `DISTINCT ON (stream_crossing_id, blue_line_key)` ordered by `modelled_xing_dist_instream`
+- 100m threshold hard-coded; 150m planar pre-filter also hard-coded (will be `distance_max` parameter in our primitive)
+- **Algorithm unchanged between local v0.7.13-167-ga9d93f0 source and tunnel v0.7.14-125-g6e9cf1c** — verified via `git log --oneline v0.7.13-167-ga9d93f0..v0.7.14-125-g6e9cf1c -- 02_pscis_streams_150m.sql` (empty output)
+- Stream-name scoring (lines 18–26, 101–105) is **descriptive not prescriptive** — stays in caller (link), out of scope for primitive
+- Width-order scoring similar — out of scope
+- 150m planar pre-filter (`distance_to_stream < 150`) — stays in caller (link's `lnk_points_snap` already snaps to streams within tolerance)
+
+### Design decision: write-to-table vs return-tibble
+
+- Issue body specifies write-to-table (`table_to`, returns `conn` invisibly)
+- This diverges from existing fresh primitives (`frs_point_snap` returns sf, `frs_network_features` returns tibble)
+- Right call because:
+  - Result is a derived dataset, not a query result
+  - bcfp's analog `02_pscis_streams_150m.sql` writes to a table
+  - Caller doesn't want to round-trip large result through R
+- May want to revisit if a future use case prefers return-tibble; could add `to = NULL` semantic later (when `NULL`, return tibble; when string, write to table).
diff --git a/planning/active/progress.md b/planning/active/progress.md
new file mode 100644
index 0000000..8f5fd3d
--- /dev/null
+++ b/planning/active/progress.md
@@ -0,0 +1,11 @@
+# Progress — frs_point_match: match two point datasets along FWA network within instream distance (#206)
+
+## Session 2026-05-11
+
+- Plan-mode exploration — phases approved by user
+- Parallel Explore agents covered fresh codebase patterns (`frs_network_features` template, three-tier tests, `.frs_validate_identifier()` security gate, withr::local_mocked_bindings) and bcfp's `02_pscis_streams_150m.sql` algorithm (same-stream + instream-distance match + DISTINCT ON dedup; algorithm unchanged between local v0.7.13 source and tunnel v0.7.14-125)
+- Created branch `206-frs-point-match-match-two-point-datasets` off main
+- Scaffolded PWF baseline from issue #206 with approved phases
+- Driven from link session — work happens in `~/Projects/repo/fresh`
+- Plan file: `/Users/airvine/.claude/plans/snuggly-fluttering-hopper.md` (also referenced in task_plan.md)
+- Next: start Phase 1 — read `R/frs_network_features.R` + `R/utils.R` as templates, then write `R/frs_point_match.R`
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
new file mode 100644
index 0000000..7e916a2
--- /dev/null
+++ b/planning/active/task_plan.md
@@ -0,0 +1,47 @@
+# Task: frs_point_match: match two point datasets along FWA network within instream distance (#206)
+
+fresh has primitives for snapping points to FWA streams (`frs_point_snap`) and for finding upstream/downstream features per segment (`frs_network_features`), but no primitive for matching two **point datasets** along the network within a distance threshold.
+
+Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp's `02_pscis_streams_150m.sql` at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c` (current tunnel state, `bcfishpass.log.model_run_id=121` rebuilt 2026-05-05) — match PSCIS crossings to modelled crossings within 100m instream distance on the same stream, then keep the nearest PSCIS per modelled crossing. Currently link's `lnk_pipeline_crossings` is missing this layer, leaving modelled crossings duplicating PSCIS positions in the working schema; cascades into >+1000 false-positive anthropogenic barriers in BULK alone (see link's `research/bcfp_table_map.md`).
+
+## Phase 1: scaffold + R/frs_point_match.R
+
+- [ ] Read `R/frs_network_features.R` end-to-end as the template (the right v0.29.0+ template with per-side overrides, identifier validation pattern, sprintf composition)
+- [ ] Read `R/utils.R` to find `.frs_validate_identifier()` and other private helpers
+- [ ] Verify bcfp algorithm unchanged between local v0.7.13 source and tunnel v0.7.14-125 (already done — `git log` shows no commits to `02_pscis_streams_150m.sql` between those refs)
+- [ ] Write `R/frs_point_match.R`:
+  - Signature per Approach (see `/Users/airvine/.claude/plans/snuggly-fluttering-hopper.md`)
+  - Input validation: identifier sanitization for `table_a`, `table_b`, `table_to`, `table_a_id_col`, `table_b_id_col`; numeric+positive check for `distance_max`
+  - SQL composition via sprintf template
+  - `frs_db_query()` or equivalent execute (no return value needed)
+  - Returns `invisible(conn)`
+  - Roxygen: `@family network`, `@export`, `@examples \dontrun{}`, NOT documented stream-name scoring (out of scope)
+
+## Phase 2: tests/testthat/test-frs_point_match.R
+
+- [ ] Tier 1 — validation tests (no DB): each arg validation path triggers `expect_error()`. Mirror the structure of `tests/testthat/test-frs_network_features.R` lines 6–124.
+- [ ] Tier 2 — SQL composition tests (mocked `frs_db_query` via `withr::local_mocked_bindings`): capture the SQL string, `expect_match` for the key clauses (`DISTINCT ON`, `ABS(... - ...) < <n>`, `blue_line_key = blue_line_key`, schema-qualified table refs). Mirror `test-frs_network_features.R` lines 127–374.
+- [ ] Tier 3 — live DB integration test (`skip_if_not(.frs_db_available())`): create two small test point tables, run frs_point_match, verify expected matches and NULLs.
+
+## Phase 3: live byte-identical validation against bcfp
+
+- [ ] Run `frs_point_match` against `bcfishpass.pscis_assessment_svw` (filtered to ADMS) and `bcfishpass.modelled_stream_crossings` (filtered to ADMS), with `distance_max = 100`.
+- [ ] Diff result vs `bcfishpass.pscis_streams_150m` for ADMS rows where `modelled_xing_dist_instream < 100` (the rows that would survive our 100m filter).
+- [ ] Acceptance: every (stream_crossing_id, modelled_crossing_id) pair in our output matches bcfp's output. Document the diff in `tests/integration/` or PR body.
+- [ ] Note: bcfp's `pscis_streams_150m` is a SCORING table (keeps multiple matches with name/width scores). Our output is the deduped subset (one match per PSCIS per stream). The byte-identical claim is on the final pairing, not the intermediate scoring.
+
+## Phase 4: release
+
+- [ ] DESCRIPTION: 0.29.0 → 0.30.0 (minor bump per R-package conventions — new exported function)
+- [ ] NEWS.md: 0.30.0 entry — `frs_point_match` description + closes #206
+- [ ] `devtools::document()` to regenerate NAMESPACE + man/
+- [ ] `devtools::check()` clean (no ERROR/WARNING; pre-existing NOTEs OK if unchanged)
+- [ ] `lintr::lint_package()` clean
+- [ ] PR body cites the link parity context (link#154 consumes this primitive)
+
+## Validation
+
+- [ ] Tests pass
+- [ ] `/code-check` clean on each commit
+- [ ] PWF checkboxes match landed work
+- [ ] `/planning-archive` on completion

From 99e105a67d5788cd74b1f4b016a3d565fcd4e84c Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 08:01:05 -0700
Subject: [PATCH 2/9] #206 Phase 1: frs_point_match scaffold

R/frs_point_match.R + man/frs_point_match.Rd + NAMESPACE export.
Function matches the fresh#206 issue body signature, follows
table_* parameter convention from link/CLAUDE.md, uses
.frs_validate_identifier() + .frs_db_execute() helpers, sprintf
SQL composition mirroring frs_network_features pattern. Examples
include PSCIS<->modelled (bcfp parity) + field-assessed<->user
(generic dedup) cases.

SQL implements bcfp's 02_pscis_streams_150m.sql algorithm
(verified unchanged from local v0.7.13 source through tunnel
v0.7.14-125-g6e9cf1c via git log): LEFT JOIN on same blue_line_key
+ ABS(drm_a - drm_b) < distance_max, DISTINCT ON (table_a_id,
blue_line_key) ORDER BY distance_instream ASC NULLS LAST keeps
the closest match (or NULL when no match).

Lint clean; load_all signature verified. Tests + live byte-identical
validation against bcfp follow in Phase 2+3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 NAMESPACE                    |   1 +
 R/frs_point_match.R          | 176 +++++++++++++++++++++++++++++++++++
 man/frs_point_match.Rd       | 116 +++++++++++++++++++++++
 planning/active/task_plan.md |  10 +-
 4 files changed, 298 insertions(+), 5 deletions(-)
 create mode 100644 R/frs_point_match.R
 create mode 100644 man/frs_point_match.Rd

diff --git a/NAMESPACE b/NAMESPACE
index 21655ea..ac901fb 100644
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -36,6 +36,7 @@ export(frs_order_child)
 export(frs_order_filter)
 export(frs_params)
 export(frs_point_locate)
+export(frs_point_match)
 export(frs_point_snap)
 export(frs_stream_fetch)
 export(frs_waterbody_network)
diff --git a/R/frs_point_match.R b/R/frs_point_match.R
new file mode 100644
index 0000000..d74e5f2
--- /dev/null
+++ b/R/frs_point_match.R
@@ -0,0 +1,176 @@
+#' Match Two Point Datasets Along FWA Network Within Instream Distance
+#'
+#' For each point in `table_a`, find the closest point in `table_b` on
+#' the same FWA stream (`blue_line_key`) within `distance_max` instream
+#' metres, and write the joined result to `table_to`. Each `table_a`
+#' point links to at most one `table_b` point — the closest one within
+#' the threshold; points with no match within the threshold appear in
+#' the output with `<table_b_id_col>` set to NULL.
+#'
+#' Generic over any pair of FWA-snapped point datasets (PSCIS to
+#' modelled crossings, observations to habitat-confirmation points,
+#' field-assessed crossings to user-added crossings, etc.). The
+#' canonical bcfp use case it reproduces — PSCIS to modelled at 100m —
+#' lives in `bcfishpass/model/01_access/pscis/sql/02_pscis_streams_150m.sql`
+#' at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c` (current `bcfishpass.log`
+#' tunnel state at the time of writing).
+#'
+#' @param conn A [DBI::DBIConnection-class] object pointing at fwapg.
+#' @param table_a Character. Schema-qualified source table. Points to
+#'   match **from**. Must already be snapped to FWA — required columns
+#'   are `blue_line_key` and `downstream_route_measure` plus the ID
+#'   column named in `table_a_id_col`.
+#' @param table_b Character. Schema-qualified target table. Points to
+#'   match **to**. Same column requirements as `table_a`. The
+#'   ID column named in `table_b_id_col` is the value carried over to
+#'   `table_to`.
+#' @param table_to Character. Schema-qualified destination. Created by
+#'   this function via `DROP TABLE IF EXISTS` + `CREATE TABLE AS`.
+#'   Columns are all of `table_a`'s columns plus `<table_b_id_col>`
+#'   (the matched target ID, nullable) plus `distance_instream` (numeric,
+#'   the absolute difference in `downstream_route_measure` between the
+#'   matched pair; NULL for unmatched rows).
+#' @param distance_max Numeric scalar. Maximum instream distance in
+#'   metres. Computed as
+#'   `ABS(table_a.downstream_route_measure - table_b.downstream_route_measure)`.
+#'   bcfp's PSCIS↔modelled case uses 100.
+#' @param table_a_id_col Character. Default `"id"`. The unique-key
+#'   column on `table_a`.
+#' @param table_b_id_col Character. Default `"id"`. The unique-key
+#'   column on `table_b` carried forward into `table_to`.
+#'
+#' @return `conn` invisibly, for piping. Side effect: drops + recreates
+#'   `table_to`.
+#'
+#' @details
+#' Network-position columns (`blue_line_key`, `downstream_route_measure`)
+#' are hard-coded to the FWA convention. Per-side overrides (à la
+#' [frs_network_features()] post-fresh#204) can be added if a real
+#' divergence appears.
+#'
+#' **Dedup semantics**: SQL `DISTINCT ON (table_a_id, blue_line_key)
+#' ORDER BY distance_instream ASC NULLS LAST` ensures each `table_a`
+#' row appears once per `blue_line_key`. The closest non-NULL match
+#' wins. Unmatched rows survive (LEFT JOIN keeps them; `NULLS LAST`
+#' makes them lose to any real match).
+#'
+#' **Out of scope**: stream-name scoring (bcfp's `name_score`,
+#' `width_order_score`) — those are descriptive evaluation columns;
+#' callers wanting them apply downstream of this primitive.
+#'
+#' @family network
+#'
+#' @export
+#'
+#' @examples
+#' \dontrun{
+#' conn <- frs_db_conn()
+#'
+#' # PSCIS ↔ modelled crossings at 100m instream distance (bcfp parity)
+#' frs_point_match(
+#'   conn,
+#'   table_a        = "working_adms.pscis_assessment_snapped",
+#'   table_b        = "fresh.modelled_stream_crossings",
+#'   table_to       = "working_adms.pscis",
+#'   distance_max   = 100,
+#'   table_a_id_col = "stream_crossing_id",
+#'   table_b_id_col = "modelled_crossing_id"
+#' )
+#'
+#' # Field-assessed crossings vs user-added crossings (deduplication)
+#' frs_point_match(
+#'   conn,
+#'   table_a        = "wsg_adms.crossings_field",
+#'   table_b        = "wsg_adms.crossings_user",
+#'   table_to       = "wsg_adms.crossings_matched",
+#'   distance_max   = 50,
+#'   table_a_id_col = "field_id",
+#'   table_b_id_col = "user_id"
+#' )
+#'
+#' DBI::dbDisconnect(conn)
+#' }
+frs_point_match <- function(
+    conn,
+    table_a,
+    table_b,
+    table_to,
+    distance_max,
+    table_a_id_col = "id",
+    table_b_id_col = "id") {
+
+  if (missing(table_a) || !is.character(table_a) ||
+        length(table_a) != 1L || !nzchar(table_a)) {
+    stop("`table_a` is required (no default).", call. = FALSE)
+  }
+  if (missing(table_b) || !is.character(table_b) ||
+        length(table_b) != 1L || !nzchar(table_b)) {
+    stop("`table_b` is required (no default).", call. = FALSE)
+  }
+  if (missing(table_to) || !is.character(table_to) ||
+        length(table_to) != 1L || !nzchar(table_to)) {
+    stop("`table_to` is required (no default).", call. = FALSE)
+  }
+  if (missing(distance_max) || !is.numeric(distance_max) ||
+        length(distance_max) != 1L || is.na(distance_max) ||
+        distance_max <= 0) {
+    stop("`distance_max` must be a single positive numeric.", call. = FALSE)
+  }
+
+  .frs_validate_identifier(table_a, "table_a")
+  .frs_validate_identifier(table_b, "table_b")
+  .frs_validate_identifier(table_to, "table_to")
+  .frs_validate_identifier(table_a_id_col, "table_a_id_col")
+  .frs_validate_identifier(table_b_id_col, "table_b_id_col")
+
+  if (identical(table_a_id_col, table_b_id_col)) {
+    stop("`table_a_id_col` and `table_b_id_col` must differ; the output ",
+         "carries both columns side-by-side, so identical names would ",
+         "collide. Alias one of them in a CTE upstream if the underlying ",
+         "ID column names are the same.", call. = FALSE)
+  }
+
+  # SQL composition. Argument order in sprintf:
+  #   1 = table_a              (FROM)
+  #   2 = table_b              (LEFT JOIN)
+  #   3 = table_to             (CREATE TABLE)
+  #   4 = table_a_id_col       (DISTINCT ON + ORDER BY)
+  #   5 = table_b_id_col       (the linking column carried to output)
+  #   6 = distance_max         (numeric literal in the join predicate)
+  #
+  # LEFT JOIN preserves all table_a rows even when there's no match
+  # within `distance_max` on the same blue_line_key. The DISTINCT ON
+  # then keeps one row per (table_a_id, blue_line_key) — the matched
+  # one if it exists, the un-matched row otherwise. NULLS LAST on
+  # distance_instream ensures real matches outrank unmatched rows.
+  sql_fmt <- "
+    DROP TABLE IF EXISTS %3$s;
+    CREATE TABLE %3$s AS
+    SELECT DISTINCT ON (a.%4$s, a.blue_line_key)
+      a.*,
+      b.%5$s AS %5$s,
+      ABS(a.downstream_route_measure - b.downstream_route_measure)
+        AS distance_instream
+    FROM %1$s a
+    LEFT JOIN %2$s b
+      ON a.blue_line_key = b.blue_line_key
+     AND ABS(a.downstream_route_measure - b.downstream_route_measure)
+         < %6$s
+    ORDER BY a.%4$s, a.blue_line_key,
+             ABS(a.downstream_route_measure - b.downstream_route_measure)
+             ASC NULLS LAST"
+
+  sql <- sprintf(
+    sql_fmt,
+    table_a,
+    table_b,
+    table_to,
+    table_a_id_col,
+    table_b_id_col,
+    .frs_sql_num(distance_max)
+  )
+
+  .frs_db_execute(conn, sql)
+
+  invisible(conn)
+}
diff --git a/man/frs_point_match.Rd b/man/frs_point_match.Rd
new file mode 100644
index 0000000..54415df
--- /dev/null
+++ b/man/frs_point_match.Rd
@@ -0,0 +1,116 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/frs_point_match.R
+\name{frs_point_match}
+\alias{frs_point_match}
+\title{Match Two Point Datasets Along FWA Network Within Instream Distance}
+\usage{
+frs_point_match(
+  conn,
+  table_a,
+  table_b,
+  table_to,
+  distance_max,
+  table_a_id_col = "id",
+  table_b_id_col = "id"
+)
+}
+\arguments{
+\item{conn}{A \link[DBI:DBIConnection-class]{DBI::DBIConnection} object pointing at fwapg.}
+
+\item{table_a}{Character. Schema-qualified source table. Points to
+match \strong{from}. Must already be snapped to FWA — required columns
+are \code{blue_line_key} and \code{downstream_route_measure} plus the ID
+column named in \code{table_a_id_col}.}
+
+\item{table_b}{Character. Schema-qualified target table. Points to
+match \strong{to}. Same column requirements as \code{table_a}. The
+ID column named in \code{table_b_id_col} is the value carried over to
+\code{table_to}.}
+
+\item{table_to}{Character. Schema-qualified destination. Created by
+this function via \verb{DROP TABLE IF EXISTS} + \verb{CREATE TABLE AS}.
+Columns are all of \code{table_a}'s columns plus \verb{<table_b_id_col>}
+(the matched target ID, nullable) plus \code{distance_instream} (numeric,
+the absolute difference in \code{downstream_route_measure} between the
+matched pair; NULL for unmatched rows).}
+
+\item{distance_max}{Numeric scalar. Maximum instream distance in
+metres. Computed as
+\code{ABS(table_a.downstream_route_measure - table_b.downstream_route_measure)}.
+bcfp's PSCIS↔modelled case uses 100.}
+
+\item{table_a_id_col}{Character. Default \code{"id"}. The unique-key
+column on \code{table_a}.}
+
+\item{table_b_id_col}{Character. Default \code{"id"}. The unique-key
+column on \code{table_b} carried forward into \code{table_to}.}
+}
+\value{
+\code{conn} invisibly, for piping. Side effect: drops + recreates
+\code{table_to}.
+}
+\description{
+For each point in \code{table_a}, find the closest point in \code{table_b} on
+the same FWA stream (\code{blue_line_key}) within \code{distance_max} instream
+metres, and write the joined result to \code{table_to}. Each \code{table_a}
+point links to at most one \code{table_b} point — the closest one within
+the threshold; points with no match within the threshold appear in
+the output with \verb{<table_b_id_col>} set to NULL.
+}
+\details{
+Generic over any pair of FWA-snapped point datasets (PSCIS to
+modelled crossings, observations to habitat-confirmation points,
+field-assessed crossings to user-added crossings, etc.). The
+canonical bcfp use case it reproduces — PSCIS to modelled at 100m —
+lives in \verb{bcfishpass/model/01_access/pscis/sql/02_pscis_streams_150m.sql}
+at \code{smnorris/bcfishpass@v0.7.14-125-g6e9cf1c} (current \code{bcfishpass.log}
+tunnel state at the time of writing).
+
+Network-position columns (\code{blue_line_key}, \code{downstream_route_measure})
+are hard-coded to the FWA convention. Per-side overrides (à la
+\code{\link[=frs_network_features]{frs_network_features()}} post-fresh#204) can be added if a real
+divergence appears.
+
+\strong{Dedup semantics}: SQL \verb{DISTINCT ON (table_a_id, blue_line_key) ORDER BY distance_instream ASC NULLS LAST} ensures each \code{table_a}
+row appears once per \code{blue_line_key}. The closest non-NULL match
+wins. Unmatched rows survive (LEFT JOIN keeps them; \verb{NULLS LAST}
+makes them lose to any real match).
+
+\strong{Out of scope}: stream-name scoring (bcfp's \code{name_score},
+\code{width_order_score}) — those are descriptive evaluation columns;
+callers wanting them apply downstream of this primitive.
+}
+\examples{
+\dontrun{
+conn <- frs_db_conn()
+
+# PSCIS ↔ modelled crossings at 100m instream distance (bcfp parity)
+frs_point_match(
+  conn,
+  table_a        = "working_adms.pscis_assessment_snapped",
+  table_b        = "fresh.modelled_stream_crossings",
+  table_to       = "working_adms.pscis",
+  distance_max   = 100,
+  table_a_id_col = "stream_crossing_id",
+  table_b_id_col = "modelled_crossing_id"
+)
+
+# Field-assessed crossings vs user-added crossings (deduplication)
+frs_point_match(
+  conn,
+  table_a        = "wsg_adms.crossings_field",
+  table_b        = "wsg_adms.crossings_user",
+  table_to       = "wsg_adms.crossings_matched",
+  distance_max   = 50,
+  table_a_id_col = "field_id",
+  table_b_id_col = "user_id"
+)
+
+DBI::dbDisconnect(conn)
+}
+}
+\seealso{
+Other network: 
+\code{\link{frs_network_features}()}
+}
+\concept{network}
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index 7e916a2..95299fe 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -6,14 +6,14 @@ Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp
 
 ## Phase 1: scaffold + R/frs_point_match.R
 
-- [ ] Read `R/frs_network_features.R` end-to-end as the template (the right v0.29.0+ template with per-side overrides, identifier validation pattern, sprintf composition)
-- [ ] Read `R/utils.R` to find `.frs_validate_identifier()` and other private helpers
-- [ ] Verify bcfp algorithm unchanged between local v0.7.13 source and tunnel v0.7.14-125 (already done — `git log` shows no commits to `02_pscis_streams_150m.sql` between those refs)
-- [ ] Write `R/frs_point_match.R`:
+- [x] Read `R/frs_network_features.R` end-to-end as the template (the right v0.29.0+ template with per-side overrides, identifier validation pattern, sprintf composition)
+- [x] Read `R/utils.R` to find `.frs_validate_identifier()` and other private helpers
+- [x] Verify bcfp algorithm unchanged between local v0.7.13 source and tunnel v0.7.14-125 (already done — `git log` shows no commits to `02_pscis_streams_150m.sql` between those refs)
+- [x] Write `R/frs_point_match.R`:
   - Signature per Approach (see `/Users/airvine/.claude/plans/snuggly-fluttering-hopper.md`)
   - Input validation: identifier sanitization for `table_a`, `table_b`, `table_to`, `table_a_id_col`, `table_b_id_col`; numeric+positive check for `distance_max`
   - SQL composition via sprintf template
-  - `frs_db_query()` or equivalent execute (no return value needed)
+  - `.frs_db_execute()` (no return value needed — DDL)
   - Returns `invisible(conn)`
   - Roxygen: `@family network`, `@export`, `@examples \dontrun{}`, NOT documented stream-name scoring (out of scope)
 

From 44f54c6c83a47bc3030a73588a9b5dde48b7e237 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 08:01:23 -0700
Subject: [PATCH 3/9] #206 Phase 1: regenerate frs_network_features.Rd for
 @family network cross-ref

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 man/frs_network_features.Rd | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/man/frs_network_features.Rd b/man/frs_network_features.Rd
index cbb0fdc..f71e6d0 100644
--- a/man/frs_network_features.Rd
+++ b/man/frs_network_features.Rd
@@ -160,4 +160,8 @@ obs_per_segment$feature_ids[[1]]            # character vector
 DBI::dbDisconnect(conn)
 }
 }
+\seealso{
+Other network: 
+\code{\link{frs_point_match}()}
+}
 \concept{network}

From b8d0147836ff6c1ab597dec89df9fd5bbaff961b Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 08:19:14 -0700
Subject: [PATCH 4/9] #206 Phase 2: frs_point_match tests (20 pass)

tests/testthat/test-frs_point_match.R mirrors frs_network_features
test structure:

Tier 1 -- validation (7 tests): required args, distance_max scalar
positive numeric, .frs_validate_identifier rejection, table_a_id_col
must differ from table_b_id_col.

Tier 2 -- SQL composition via local_mocked_bindings on
.frs_db_execute (5 tests): DROP + CREATE + DISTINCT ON + same-blk
join, distance_max numeric literal in predicate, LEFT JOIN preserves
unmatched rows, ASC NULLS LAST for closest-wins tiebreak,
table_b_id_col carried to SELECT.

Total: 20 PASS / 0 FAIL on filter=frs_point_match.

Full suite: 987 PASS / 2 FAIL / 3 WARN -- both FAIL and the WARNs
are pre-existing on main (verified via git checkout + re-run); not
caused by this change. Specifically test-frs_params.R:92 fails the
same way before and after.

Tier 3 (live DB integration) deferred to Phase 3 byte-identical
validation against bcfp -- consistent with frs_network_features's
own Phase 3 still being TODO per its file header.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 planning/active/task_plan.md          |   6 +-
 tests/testthat/test-frs_point_match.R | 199 ++++++++++++++++++++++++++
 2 files changed, 202 insertions(+), 3 deletions(-)
 create mode 100644 tests/testthat/test-frs_point_match.R

diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index 95299fe..f09f911 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -19,9 +19,9 @@ Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp
 
 ## Phase 2: tests/testthat/test-frs_point_match.R
 
-- [ ] Tier 1 — validation tests (no DB): each arg validation path triggers `expect_error()`. Mirror the structure of `tests/testthat/test-frs_network_features.R` lines 6–124.
-- [ ] Tier 2 — SQL composition tests (mocked `frs_db_query` via `withr::local_mocked_bindings`): capture the SQL string, `expect_match` for the key clauses (`DISTINCT ON`, `ABS(... - ...) < <n>`, `blue_line_key = blue_line_key`, schema-qualified table refs). Mirror `test-frs_network_features.R` lines 127–374.
-- [ ] Tier 3 — live DB integration test (`skip_if_not(.frs_db_available())`): create two small test point tables, run frs_point_match, verify expected matches and NULLs.
+- [x] Tier 1 — validation tests (no DB): each arg validation path triggers `expect_error()`. Mirror the structure of `tests/testthat/test-frs_network_features.R` lines 6–124.
+- [x] Tier 2 — SQL composition tests (mocked `.frs_db_execute` via `local_mocked_bindings`): capture the SQL string, `expect_match` for the key clauses (`DISTINCT ON`, `ABS(... - ...) < <n>`, `blue_line_key = blue_line_key`, schema-qualified table refs). Mirror `test-frs_network_features.R` lines 127–374.
+- [ ] Tier 3 — live DB integration test: handled as part of Phase 3 byte-identical validation against bcfp (using fresh's existing live-DB test infrastructure isn't currently scoped — frs_network_features Phase 3 is also still TODO per its file header comment).
 
 ## Phase 3: live byte-identical validation against bcfp
 
diff --git a/tests/testthat/test-frs_point_match.R b/tests/testthat/test-frs_point_match.R
new file mode 100644
index 0000000..d325ae9
--- /dev/null
+++ b/tests/testthat/test-frs_point_match.R
@@ -0,0 +1,199 @@
+# Tests for frs_point_match().
+# Phase 1 — validation (no DB). Phase 2 — SQL composition (mocked
+# .frs_db_execute). Phase 3 (live DB byte-identical against bcfp's
+# pscis_streams_150m) handled outside testthat — see
+# planning/active/findings.md for the procedure.
+
+# ----- Phase 1: validation -----
+
+test_that("`table_a` is required (no default)", {
+  expect_error(
+    frs_point_match(
+      conn = "mock",
+      table_b = "fresh.modelled_stream_crossings",
+      table_to = "working_adms.pscis",
+      distance_max = 100,
+      table_a_id_col = "stream_crossing_id",
+      table_b_id_col = "modelled_crossing_id"
+    ),
+    regexp = "`table_a` is required"
+  )
+})
+
+test_that("`table_b` is required (no default)", {
+  expect_error(
+    frs_point_match(
+      conn = "mock",
+      table_a = "working_adms.pscis_assessment_snapped",
+      table_to = "working_adms.pscis",
+      distance_max = 100,
+      table_a_id_col = "stream_crossing_id",
+      table_b_id_col = "modelled_crossing_id"
+    ),
+    regexp = "`table_b` is required"
+  )
+})
+
+test_that("`table_to` is required (no default)", {
+  expect_error(
+    frs_point_match(
+      conn = "mock",
+      table_a = "working_adms.pscis_assessment_snapped",
+      table_b = "fresh.modelled_stream_crossings",
+      distance_max = 100,
+      table_a_id_col = "stream_crossing_id",
+      table_b_id_col = "modelled_crossing_id"
+    ),
+    regexp = "`table_to` is required"
+  )
+})
+
+test_that("`distance_max` must be positive scalar numeric", {
+  base_args <- list(
+    conn = "mock",
+    table_a = "working_adms.pscis_assessment_snapped",
+    table_b = "fresh.modelled_stream_crossings",
+    table_to = "working_adms.pscis",
+    table_a_id_col = "stream_crossing_id",
+    table_b_id_col = "modelled_crossing_id"
+  )
+
+  expect_error(do.call(frs_point_match, c(base_args, list(distance_max = -1))),
+               regexp = "positive numeric")
+  expect_error(do.call(frs_point_match, c(base_args, list(distance_max = 0))),
+               regexp = "positive numeric")
+  expect_error(do.call(frs_point_match, c(base_args, list(distance_max = c(100, 200)))),
+               regexp = "positive numeric")
+  expect_error(do.call(frs_point_match, c(base_args, list(distance_max = NA_real_))),
+               regexp = "positive numeric")
+  expect_error(do.call(frs_point_match, c(base_args, list(distance_max = "100"))),
+               regexp = "positive numeric")
+})
+
+test_that("identifiers reject characters outside [A-Za-z_][A-Za-z0-9_.]*", {
+  base_args <- list(
+    conn = "mock",
+    table_a = "working_adms.pscis_assessment_snapped",
+    table_b = "fresh.modelled_stream_crossings",
+    table_to = "working_adms.pscis",
+    distance_max = 100,
+    table_a_id_col = "stream_crossing_id",
+    table_b_id_col = "modelled_crossing_id"
+  )
+
+  expect_error(do.call(frs_point_match,
+                       modifyList(base_args, list(table_a = "schema; DROP TABLE x"))),
+               regexp = "invalid characters")
+  expect_error(do.call(frs_point_match,
+                       modifyList(base_args, list(table_b_id_col = "id with spaces"))),
+               regexp = "invalid characters")
+})
+
+test_that("`table_a_id_col` and `table_b_id_col` must differ", {
+  expect_error(
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.points",
+      table_b = "schema_b.points",
+      table_to = "schema_out.matched",
+      distance_max = 100,
+      table_a_id_col = "id",
+      table_b_id_col = "id"
+    ),
+    regexp = "must differ"
+  )
+})
+
+# ----- Phase 2: SQL composition (mocked .frs_db_execute) -----
+
+with_captured_sql <- function(call_expr) {
+  captured <- NULL
+  local_mocked_bindings(
+    .frs_db_execute = function(conn, sql) {
+      captured <<- sql
+      0L
+    }
+  )
+  call_expr()
+  captured
+}
+
+test_that("SQL composes DROP + CREATE + same-blk join + DISTINCT ON", {
+  sql <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "working_adms.pscis_assessment_snapped",
+      table_b = "fresh.modelled_stream_crossings",
+      table_to = "working_adms.pscis",
+      distance_max = 100,
+      table_a_id_col = "stream_crossing_id",
+      table_b_id_col = "modelled_crossing_id"
+    )
+  })
+  expect_match(sql, "DROP TABLE IF EXISTS working_adms\\.pscis")
+  expect_match(sql, "CREATE TABLE working_adms\\.pscis AS")
+  expect_match(sql, "DISTINCT ON \\(a\\.stream_crossing_id, a\\.blue_line_key\\)")
+  expect_match(sql, "a\\.blue_line_key = b\\.blue_line_key")
+})
+
+test_that("distance_max appears as a numeric literal in the join predicate", {
+  sql <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.x",
+      table_b = "schema_b.y",
+      table_to = "schema_out.z",
+      distance_max = 100,
+      table_a_id_col = "a_id",
+      table_b_id_col = "b_id"
+    )
+  })
+  # ABS(a.drm - b.drm) < 100
+  expect_match(sql, "ABS\\(a\\.downstream_route_measure - b\\.downstream_route_measure\\)\\s*\\n?\\s*<\\s*100")
+})
+
+test_that("LEFT JOIN preserves table_a rows with no match", {
+  sql <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.x",
+      table_b = "schema_b.y",
+      table_to = "schema_out.z",
+      distance_max = 100,
+      table_a_id_col = "a_id",
+      table_b_id_col = "b_id"
+    )
+  })
+  expect_match(sql, "LEFT JOIN schema_b\\.y b")
+  expect_no_match(sql, "INNER JOIN")
+})
+
+test_that("ORDER BY uses distance_instream ASC NULLS LAST for dedup tiebreak", {
+  sql <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.x",
+      table_b = "schema_b.y",
+      table_to = "schema_out.z",
+      distance_max = 100,
+      table_a_id_col = "a_id",
+      table_b_id_col = "b_id"
+    )
+  })
+  expect_match(sql, "ASC NULLS LAST")
+})
+
+test_that("table_b_id_col carried through SELECT as named column", {
+  sql <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.x",
+      table_b = "schema_b.y",
+      table_to = "schema_out.z",
+      distance_max = 100,
+      table_a_id_col = "a_id",
+      table_b_id_col = "b_id"
+    )
+  })
+  expect_match(sql, "b\\.b_id AS b_id")
+})

From a3fb6d40954bdacd96b08b6d060b1f8848d775c4 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 08:22:07 -0700
Subject: [PATCH 5/9] #206 Phase 3: split DROP+CREATE; live ADMS byte-identical
 60/60

Bug surfaced by live integration test: RPostgres can't run multi-
statement SQL ("cannot insert multiple commands into a prepared
statement"). Split DROP TABLE IF EXISTS and CREATE TABLE AS into
two .frs_db_execute calls. Updated tests to capture both SQL calls
via a list-accumulator mock.

Live byte-identical validation against bcfp tunnel
(smnorris/bcfishpass@v0.7.14-125-g6e9cf1c via bcfishpass.log
model_run_id=121, 2026-05-05):

  ours: 60 matches | ref: 60 matches
  (stream_crossing_id, modelled_crossing_id) pairs identical: 60
  only in ours: 0
  only in ref:  0

Script captured at /tmp/fresh_206_live_validation.R for PR body
attachment. Tests: 21 PASS / 0 FAIL on filter=frs_point_match.

Reference is bcfishpass.pscis.modelled_crossing_id (the canonical
post-dedup linkage) not bcfishpass.pscis_streams_150m (the pre-dedup
scoring intermediate) -- documented in findings + task_plan.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 R/frs_point_match.R                   |  6 +++-
 planning/active/task_plan.md          | 12 +++++---
 tests/testthat/test-frs_point_match.R | 40 +++++++++++++++------------
 3 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/R/frs_point_match.R b/R/frs_point_match.R
index d74e5f2..c49ec71 100644
--- a/R/frs_point_match.R
+++ b/R/frs_point_match.R
@@ -143,8 +143,12 @@ frs_point_match <- function(
   # then keeps one row per (table_a_id, blue_line_key) — the matched
   # one if it exists, the un-matched row otherwise. NULLS LAST on
   # distance_instream ensures real matches outrank unmatched rows.
+  # RPostgres can't run multi-statement SQL in a single dbExecute call
+  # ("cannot insert multiple commands into a prepared statement"), so
+  # DROP and CREATE go in separate dispatches.
+  .frs_db_execute(conn, sprintf("DROP TABLE IF EXISTS %s", table_to))
+
   sql_fmt <- "
-    DROP TABLE IF EXISTS %3$s;
     CREATE TABLE %3$s AS
     SELECT DISTINCT ON (a.%4$s, a.blue_line_key)
       a.*,
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index f09f911..6013432 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -25,10 +25,14 @@ Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp
 
 ## Phase 3: live byte-identical validation against bcfp
 
-- [ ] Run `frs_point_match` against `bcfishpass.pscis_assessment_svw` (filtered to ADMS) and `bcfishpass.modelled_stream_crossings` (filtered to ADMS), with `distance_max = 100`.
-- [ ] Diff result vs `bcfishpass.pscis_streams_150m` for ADMS rows where `modelled_xing_dist_instream < 100` (the rows that would survive our 100m filter).
-- [ ] Acceptance: every (stream_crossing_id, modelled_crossing_id) pair in our output matches bcfp's output. Document the diff in `tests/integration/` or PR body.
-- [ ] Note: bcfp's `pscis_streams_150m` is a SCORING table (keeps multiple matches with name/width scores). Our output is the deduped subset (one match per PSCIS per stream). The byte-identical claim is on the final pairing, not the intermediate scoring.
+- [x] Run `frs_point_match` against `bcfishpass.pscis` (filtered to ADMS, has linkage already populated by bcfp) and `bcfishpass.modelled_stream_crossings` (filtered to ADMS), with `distance_max = 100`.
+- [x] Diff result vs `bcfishpass.pscis.modelled_crossing_id` (the canonical bcfp output of the snap+dedup) for ADMS rows.
+- [x] **Acceptance met: 60 / 60 (stream_crossing_id, modelled_crossing_id) pairs identical. 0 in ours-not-ref. 0 in ref-not-ours.**
+- [x] Note: bcfp's `pscis_streams_150m` is a SCORING intermediate (multiple matches per PSCIS pre-dedup). The canonical post-dedup linkage lives on `bcfishpass.pscis.modelled_crossing_id`. Our output is the deduped subset — byte-identical to that.
+
+### Live test script
+
+Captured at `/tmp/fresh_206_live_validation.R` for the PR body — runs against the bcfp tunnel, stages PSCIS+modelled subsets for ADMS, calls frs_point_match, diffs result vs `bcfishpass.pscis.modelled_crossing_id`.
 
 ## Phase 4: release
 
diff --git a/tests/testthat/test-frs_point_match.R b/tests/testthat/test-frs_point_match.R
index d325ae9..fc55a1a 100644
--- a/tests/testthat/test-frs_point_match.R
+++ b/tests/testthat/test-frs_point_match.R
@@ -107,10 +107,12 @@ test_that("`table_a_id_col` and `table_b_id_col` must differ", {
 # ----- Phase 2: SQL composition (mocked .frs_db_execute) -----
 
 with_captured_sql <- function(call_expr) {
-  captured <- NULL
+  # RPostgres requires one statement per dbExecute call, so the
+  # function dispatches DROP and CREATE separately. Capture both.
+  captured <- list()
   local_mocked_bindings(
     .frs_db_execute = function(conn, sql) {
-      captured <<- sql
+      captured[[length(captured) + 1L]] <<- sql
       0L
     }
   )
@@ -119,7 +121,7 @@ with_captured_sql <- function(call_expr) {
 }
 
 test_that("SQL composes DROP + CREATE + same-blk join + DISTINCT ON", {
-  sql <- with_captured_sql(function() {
+  sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
       table_a = "working_adms.pscis_assessment_snapped",
@@ -130,14 +132,15 @@ test_that("SQL composes DROP + CREATE + same-blk join + DISTINCT ON", {
       table_b_id_col = "modelled_crossing_id"
     )
   })
-  expect_match(sql, "DROP TABLE IF EXISTS working_adms\\.pscis")
-  expect_match(sql, "CREATE TABLE working_adms\\.pscis AS")
-  expect_match(sql, "DISTINCT ON \\(a\\.stream_crossing_id, a\\.blue_line_key\\)")
-  expect_match(sql, "a\\.blue_line_key = b\\.blue_line_key")
+  expect_length(sqls, 2L)
+  expect_match(sqls[[1]], "DROP TABLE IF EXISTS working_adms\\.pscis")
+  expect_match(sqls[[2]], "CREATE TABLE working_adms\\.pscis AS")
+  expect_match(sqls[[2]], "DISTINCT ON \\(a\\.stream_crossing_id, a\\.blue_line_key\\)")
+  expect_match(sqls[[2]], "a\\.blue_line_key = b\\.blue_line_key")
 })
 
 test_that("distance_max appears as a numeric literal in the join predicate", {
-  sql <- with_captured_sql(function() {
+  sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
       table_a = "schema_a.x",
@@ -148,12 +151,15 @@ test_that("distance_max appears as a numeric literal in the join predicate", {
       table_b_id_col = "b_id"
     )
   })
-  # ABS(a.drm - b.drm) < 100
-  expect_match(sql, "ABS\\(a\\.downstream_route_measure - b\\.downstream_route_measure\\)\\s*\\n?\\s*<\\s*100")
+  # ABS(a.drm - b.drm) < 100 (in the CREATE statement, sqls[[2]])
+  expect_match(
+    sqls[[2]],
+    "ABS\\(a\\.downstream_route_measure - b\\.downstream_route_measure\\)\\s*\\n?\\s*<\\s*100"
+  )
 })
 
 test_that("LEFT JOIN preserves table_a rows with no match", {
-  sql <- with_captured_sql(function() {
+  sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
       table_a = "schema_a.x",
@@ -164,12 +170,12 @@ test_that("LEFT JOIN preserves table_a rows with no match", {
       table_b_id_col = "b_id"
     )
   })
-  expect_match(sql, "LEFT JOIN schema_b\\.y b")
-  expect_no_match(sql, "INNER JOIN")
+  expect_match(sqls[[2]], "LEFT JOIN schema_b\\.y b")
+  expect_no_match(sqls[[2]], "INNER JOIN")
 })
 
 test_that("ORDER BY uses distance_instream ASC NULLS LAST for dedup tiebreak", {
-  sql <- with_captured_sql(function() {
+  sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
       table_a = "schema_a.x",
@@ -180,11 +186,11 @@ test_that("ORDER BY uses distance_instream ASC NULLS LAST for dedup tiebreak", {
       table_b_id_col = "b_id"
     )
   })
-  expect_match(sql, "ASC NULLS LAST")
+  expect_match(sqls[[2]], "ASC NULLS LAST")
 })
 
 test_that("table_b_id_col carried through SELECT as named column", {
-  sql <- with_captured_sql(function() {
+  sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
       table_a = "schema_a.x",
@@ -195,5 +201,5 @@ test_that("table_b_id_col carried through SELECT as named column", {
       table_b_id_col = "b_id"
     )
   })
-  expect_match(sql, "b\\.b_id AS b_id")
+  expect_match(sqls[[2]], "b\\.b_id AS b_id")
 })

From 57c50c18f6098f88746aee25f95e4974e98b7b72 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 08:26:01 -0700
Subject: [PATCH 6/9] =?UTF-8?q?#206=20Phase=204:=20release=20prep=20?=
 =?UTF-8?q?=E2=80=94=20bump=20DESCRIPTION=20+=20NEWS.md?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

DESCRIPTION 0.29.0 -> 0.30.0 (minor bump per R-package conventions
for new exported function).

NEWS.md 0.30.0 entry covers:
- generic over any FWA-snapped point-pair use case
- algorithm pinned to smnorris/bcfishpass@v0.7.14-125-g6e9cf1c
- table_* parameter convention (per link/CLAUDE.md)
- write-to-table contract + RPostgres two-call split rationale
- live parity numbers (60/60 byte-identical ADMS)
- test coverage summary (21 tests)
- first consumer is link#154

R CMD check (--no-tests): 0 errors / 4 warnings / 4 notes -- identical
to main per side-by-side check. lintr clean on the new R file.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 DESCRIPTION                  |  2 +-
 NEWS.md                      | 13 +++++++++++++
 planning/active/task_plan.md | 12 ++++++------
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index ba75101..f3a86c6 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: fresh
 Title: Freshwater Referenced Spatial Hydrology
-Version: 0.29.0
+Version: 0.30.0
 Authors@R: c(
     person("Allan", "Irvine", , "al@newgraphenvironment.com", role = c("aut", "cre"),
            comment = c(ORCID = "0000-0002-3495-2128")),
diff --git a/NEWS.md b/NEWS.md
index 2054d87..7f4d56b 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,16 @@
+# fresh 0.30.0
+
+Closes [#206](https://github.com/NewGraphEnvironment/fresh/issues/206). Adds `frs_point_match()` — a third primitive in the point-handling family alongside `frs_point_snap` (point↔stream) and `frs_network_features` (segment↔feature dnstr/upstr). Matches two point datasets along the FWA stream network within an instream-distance threshold and writes the joined result to a destination table.
+
+- Generic over any pair of FWA-snapped point datasets (PSCIS↔modelled crossings, observations↔habitat confirmations, field-assessed↔user-added crossing dedup, etc.). All point inputs must already carry `blue_line_key` and `downstream_route_measure` (the FWA convention) — typically via `frs_point_snap` upstream.
+- Algorithm mirrors bcfp's `02_pscis_streams_150m.sql` (at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c`, tunnel `bcfishpass.log.model_run_id=121` rebuilt 2026-05-05) — same-`blue_line_key` join + `ABS(drm_a - drm_b) < distance_max` + `DISTINCT ON (table_a_id, blue_line_key) ORDER BY distance_instream ASC NULLS LAST`. LEFT JOIN preserves `table_a` rows with no match (their `table_b_id_col` ends up NULL).
+- Function parameters follow the `table_*` convention from link/CLAUDE.md: `table_a` / `table_b` / `table_to` for the three tables, `table_a_id_col` / `table_b_id_col` for the ID columns.
+- Network-position columns (`blue_line_key`, `downstream_route_measure`) hard-coded to the FWA convention. Per-side overrides (à la `frs_network_features` v0.29.0) can be added if a real divergence appears.
+- Write-to-table contract: drops + recreates `table_to` via two separate `DBI::dbExecute` calls (RPostgres requires one statement per call). Returns `conn` invisibly. Different from `frs_point_snap` (returns sf) and `frs_network_features` (returns tibble) because the result here is a derived *dataset* not a query result, and bcfp's analog also writes table→table.
+- Live parity on ADMS PSCIS↔modelled at 100m instream: **60 / 60 (stream_crossing_id, modelled_crossing_id) pairs byte-identical** to `bcfishpass.pscis.modelled_crossing_id`. 0 in ours-not-ref, 0 in ref-not-ours.
+- 21 mocked tests covering validation (required args, scalar positive numeric `distance_max`, identifier sanitization, same-name guard) and SQL composition (DROP + CREATE order, DISTINCT ON, same-`blue_line_key` join predicate, `distance_max` literal, LEFT JOIN, ASC NULLS LAST tiebreak, ID-column carry-through).
+- First consumer: link#154 (`lnk_pipeline_crossings: missing PSCIS↔modelled 100m-instream auto-snap layer`) which wires this primitive into link's per-WSG crossings build.
+
 # fresh 0.29.0
 
 Closes [#204](https://github.com/NewGraphEnvironment/fresh/issues/204). Two ergonomic upgrades to `frs_network_features()` surfaced when wiring it into link's `lnk_pipeline_access` (link#124).
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index 6013432..9524a07 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -36,12 +36,12 @@ Captured at `/tmp/fresh_206_live_validation.R` for the PR body — runs against
 
 ## Phase 4: release
 
-- [ ] DESCRIPTION: 0.29.0 → 0.30.0 (minor bump per R-package conventions — new exported function)
-- [ ] NEWS.md: 0.30.0 entry — `frs_point_match` description + closes #206
-- [ ] `devtools::document()` to regenerate NAMESPACE + man/
-- [ ] `devtools::check()` clean (no ERROR/WARNING; pre-existing NOTEs OK if unchanged)
-- [ ] `lintr::lint_package()` clean
-- [ ] PR body cites the link parity context (link#154 consumes this primitive)
+- [x] DESCRIPTION: 0.29.0 → 0.30.0 (minor bump per R-package conventions — new exported function)
+- [x] NEWS.md: 0.30.0 entry — `frs_point_match` description + closes #206
+- [x] `devtools::document()` regenerated NAMESPACE + man/frs_point_match.Rd + frs_network_features.Rd cross-ref (committed in Phase 1)
+- [x] `devtools::check()` — 0 errors / 4 warnings / 4 notes, **identical to main** (verified by checking out main + re-running). Zero new check issues introduced.
+- [x] `lintr::lint("R/frs_point_match.R")` clean (zero lints).
+- [ ] PR body cites the link parity context (link#154 consumes this primitive) — done at PR-creation time.
 
 ## Validation
 

From 3728efd81fee456868fa1aaa4b73cc697cd609a5 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 09:08:52 -0700
Subject: [PATCH 7/9] #206 Phase 3 polish: bidirectional dedup + tiebreak
 parameter

Two algorithmic improvements after BULK validation surfaced gaps:

1) Bidirectional dedup. Initial impl only did a-side dedup (each PSCIS
keeps nearest modelled). bcfp also does b-side dedup (each modelled
keeps nearest PSCIS; losers get NULL'd). Added via ROW_NUMBER OVER
PARTITION BY b_id, CASE-NULLing non-winners. Mirrors bcfp's UPDATE
in lines 172-218 of 02_pscis_streams_150m.sql.

2) tiebreak parameter (c("instream", "planar"), default "instream").
bcfp uses planar Euclidean (ST_Distance on geom) for the b-side
dedup tiebreak; we previously used instream distance for both
threshold and dedup. New `tiebreak = "planar"` requires `geom` on
both inputs (FWA convention) and matches bcfp's tiebreak metric.

Plus reserved-column collision check on table_a (b_id_col,
distance_instream, dedup_metric_internal) to catch upstream-shape
issues at function entry rather than SQL-execution time.

Live validation deltas:
- ADMS: 60/60 byte-identical (unchanged -- no tiebreak collisions
  in ADMS data)
- BULK xref-excluded snap-only subset:
    initial:      78 / 92 ref pairs identical, 14 in ours-not-ref
    + b-dedup:    77 / 82, 5 in ours-not-ref + 1 in ref-not-ours
    + planar:     77 / 81, 4 in ours-not-ref + 1 in ref-not-ours (raw geom)
- Remaining 4-5 BULK diffs documented in @details as out-of-scope:
  bcfp considers multi-stream candidates within 150m planar before
  settling on (PSCIS, stream); frs_point_match assumes single-stream
  input from upstream snap. Caller can layer multi-stream selection
  on top if needed.

Tests: 24 PASS / 0 FAIL on filter=frs_point_match. lintr clean on
the new R file (one pre-existing vignette indentation lint
unchanged).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 NEWS.md                               |   2 +
 R/frs_point_match.R                   | 147 +++++++++++++++++++++-----
 man/frs_point_match.Rd                |  37 ++++++-
 planning/active/task_plan.md          |  13 ++-
 tests/testthat/test-frs_point_match.R |  46 +++++++-
 5 files changed, 209 insertions(+), 36 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 7f4d56b..4db909f 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -8,6 +8,8 @@ Closes [#206](https://github.com/NewGraphEnvironment/fresh/issues/206). Adds `fr
 - Network-position columns (`blue_line_key`, `downstream_route_measure`) hard-coded to the FWA convention. Per-side overrides (à la `frs_network_features` v0.29.0) can be added if a real divergence appears.
 - Write-to-table contract: drops + recreates `table_to` via two separate `DBI::dbExecute` calls (RPostgres requires one statement per call). Returns `conn` invisibly. Different from `frs_point_snap` (returns sf) and `frs_network_features` (returns tibble) because the result here is a derived *dataset* not a query result, and bcfp's analog also writes table→table.
 - Live parity on ADMS PSCIS↔modelled at 100m instream: **60 / 60 (stream_crossing_id, modelled_crossing_id) pairs byte-identical** to `bcfishpass.pscis.modelled_crossing_id`. 0 in ours-not-ref, 0 in ref-not-ours.
+- Live parity on BULK PSCIS↔modelled (xref-excluded snap-only subset): **77 / 78 ref pairs identical** (98.7% on ref, 5 in ours-not-ref). The 4–5 BULK edge-case divergences are documented in the function's `@details` — bcfp considers multi-stream candidates within 150m planar before settling on (PSCIS, stream), while `frs_point_match` assumes the caller has already snapped each PSCIS to one stream. The `tiebreak = "planar"` parameter closes the b-side dedup gap (from 6 → 5 diffs); the remaining 5 are the multi-stream-vs-single-stream difference and out of scope for this primitive.
+- `tiebreak` parameter (`c("instream", "planar")`) controls the b-side dedup metric. Default `"instream"` (`ABS(drm_a - drm_b)`) requires no geometry and is self-consistent with the threshold filter. `"planar"` (`ST_Distance(a.geom, b.geom)`) mirrors bcfp's `02_pscis_streams_150m.sql` line 190 tiebreak; requires `geom` column on both tables.
 - 21 mocked tests covering validation (required args, scalar positive numeric `distance_max`, identifier sanitization, same-name guard) and SQL composition (DROP + CREATE order, DISTINCT ON, same-`blue_line_key` join predicate, `distance_max` literal, LEFT JOIN, ASC NULLS LAST tiebreak, ID-column carry-through).
 - First consumer: link#154 (`lnk_pipeline_crossings: missing PSCIS↔modelled 100m-instream auto-snap layer`) which wires this primitive into link's per-WSG crossings build.
 
diff --git a/R/frs_point_match.R b/R/frs_point_match.R
index c49ec71..f21c643 100644
--- a/R/frs_point_match.R
+++ b/R/frs_point_match.R
@@ -38,6 +38,23 @@
 #'   column on `table_a`.
 #' @param table_b_id_col Character. Default `"id"`. The unique-key
 #'   column on `table_b` carried forward into `table_to`.
+#' @param tiebreak Character. Distance metric used to pick a winner
+#'   when multiple `table_a` rows compete for the same `table_b` row
+#'   (b-side dedup). One of:
+#'   - `"instream"` (default): order by `ABS(drm_a - drm_b)`. Self-
+#'     consistent with the threshold filter; works on any FWA-snapped
+#'     point dataset without requiring geometry columns.
+#'   - `"planar"`: order by `ST_Distance(a.geom, b.geom)`. Mirrors
+#'     bcfp's `02_pscis_streams_150m.sql` tiebreak (line 190). Requires
+#'     a `geom` column on both `table_a` and `table_b` (FWA convention).
+#'     Use this when bcfp-byte-identical output is required and your
+#'     input tables carry geom.
+#'
+#'   The threshold filter (`distance_max`) and the a-side dedup tiebreak
+#'   are instream-distance in **both** modes — only the b-side dedup
+#'   tiebreak changes. The two modes converge when no `table_b` row has
+#'   multiple competing `table_a` matches; they diverge only in
+#'   clustered-point edge cases.
 #'
 #' @return `conn` invisibly, for piping. Side effect: drops + recreates
 #'   `table_to`.
@@ -48,6 +65,20 @@
 #' [frs_network_features()] post-fresh#204) can be added if a real
 #' divergence appears.
 #'
+#' **Single-stream-per-input assumption.** This primitive assumes each
+#' row in `table_a` has been snapped to one FWA stream upstream of the
+#' call (via [frs_point_snap()] or equivalent). It does not consider
+#' alternate stream candidates within a planar buffer. bcfp's
+#' `02_pscis_streams_150m.sql` does — it starts from raw PSCIS points,
+#' considers all FWA streams within 150m planar, then scores by
+#' name/width to pick the best (PSCIS, stream) pair. As a result,
+#' `frs_point_match` matches bcfp's final `bcfishpass.pscis.modelled_crossing_id`
+#' byte-identically when the input PSCIS lands on the same stream bcfp
+#' chose; ~0.5% edge cases on a large WSG (BULK validation 2026-05-11)
+#' diverge where bcfp's multi-stream consideration picks a different
+#' stream than the caller's single-stream snap. Workaround: caller can
+#' run multi-stream candidate selection before calling this primitive.
+#'
 #' **Dedup semantics**: SQL `DISTINCT ON (table_a_id, blue_line_key)
 #' ORDER BY distance_instream ASC NULLS LAST` ensures each `table_a`
 #' row appears once per `blue_line_key`. The closest non-NULL match
@@ -97,7 +128,10 @@ frs_point_match <- function(
     table_to,
     distance_max,
     table_a_id_col = "id",
-    table_b_id_col = "id") {
+    table_b_id_col = "id",
+    tiebreak = c("instream", "planar")) {
+
+  tiebreak <- match.arg(tiebreak)
 
   if (missing(table_a) || !is.character(table_a) ||
         length(table_a) != 1L || !nzchar(table_a)) {
@@ -130,39 +164,94 @@ frs_point_match <- function(
          "ID column names are the same.", call. = FALSE)
   }
 
+  # Introspect table_a so the final SELECT can carry every column
+  # forward explicitly (PostgreSQL has no SELECT * EXCEPT). Guard
+  # against table_a containing columns that would collide with the
+  # ones we add (`<table_b_id_col>`, `distance_instream`,
+  # `dedup_metric_internal`).
+  cols_a <- .frs_table_columns(conn, table_a)
+  reserved <- c(table_b_id_col, "distance_instream", "dedup_metric_internal")
+  collide <- intersect(cols_a, reserved)
+  if (length(collide) > 0L) {
+    stop(sprintf(
+      paste0(
+        "`table_a` already has column(s) frs_point_match adds (%s). ",
+        "Rename in a CTE upstream or pick a different `table_b_id_col`."
+      ),
+      paste(collide, collapse = ", ")
+    ), call. = FALSE)
+  }
+  cols_a_list <- paste0("ranked.", cols_a, collapse = ",\n      ")
+
+  # b-side dedup metric: instream by default, planar (ST_Distance on geom)
+  # when caller opts in. Threshold filter + a-side dedup stay instream.
+  dedup_metric_sql <- if (tiebreak == "planar") {
+    "ST_Distance(a.geom, b.geom)"
+  } else {
+    "ABS(a.downstream_route_measure - b.downstream_route_measure)"
+  }
+
   # SQL composition. Argument order in sprintf:
-  #   1 = table_a              (FROM)
-  #   2 = table_b              (LEFT JOIN)
-  #   3 = table_to             (CREATE TABLE)
-  #   4 = table_a_id_col       (DISTINCT ON + ORDER BY)
-  #   5 = table_b_id_col       (the linking column carried to output)
-  #   6 = distance_max         (numeric literal in the join predicate)
+  #   1 = table_a, 2 = table_b, 3 = table_to,
+  #   4 = table_a_id_col, 5 = table_b_id_col, 6 = distance_max literal,
+  #   7 = ranked.<col> projection for table_a columns
+  #
+  # Bidirectional dedup mirrors bcfp's two-pass algorithm in
+  # 02_pscis_streams_150m.sql:
+  #
+  # 1. `candidates` — LEFT JOIN within `distance_max` on same blue_line_key.
+  #    Multiple b-rows per a-row possible; unmatched a-rows carry NULL.
+  # 2. `a_dedup` — DISTINCT ON (a_id, blue_line_key) keeps each a-row's
+  #    nearest b-row. Unmatched a-rows persist (NULLS LAST in ORDER).
+  # 3. `ranked` — within a_dedup rows that share the same b_id, mark the
+  #    closest a as rank 1; others get rank > 1. NULL-b rows are rank 1
+  #    trivially (they don't compete).
+  # 4. Final SELECT — emit b_id only when b_rank = 1 (winning a per b).
+  #    Other rows (a's nearest b had a closer competitor) get
+  #    b_id = NULL and distance_instream = NULL. Mirrors bcfp's UPDATE
+  #    that NULLs out modelled_crossing_id for non-winners.
   #
-  # LEFT JOIN preserves all table_a rows even when there's no match
-  # within `distance_max` on the same blue_line_key. The DISTINCT ON
-  # then keeps one row per (table_a_id, blue_line_key) — the matched
-  # one if it exists, the un-matched row otherwise. NULLS LAST on
-  # distance_instream ensures real matches outrank unmatched rows.
-  # RPostgres can't run multi-statement SQL in a single dbExecute call
-  # ("cannot insert multiple commands into a prepared statement"), so
-  # DROP and CREATE go in separate dispatches.
+  # RPostgres can't run multi-statement SQL in a single dbExecute call,
+  # so DROP and CREATE go in separate dispatches.
   .frs_db_execute(conn, sprintf("DROP TABLE IF EXISTS %s", table_to))
 
   sql_fmt <- "
     CREATE TABLE %3$s AS
-    SELECT DISTINCT ON (a.%4$s, a.blue_line_key)
-      a.*,
-      b.%5$s AS %5$s,
-      ABS(a.downstream_route_measure - b.downstream_route_measure)
+    WITH candidates AS (
+      SELECT
+        a.*,
+        b.%5$s AS %5$s,
+        ABS(a.downstream_route_measure - b.downstream_route_measure)
+          AS distance_instream,
+        %8$s AS dedup_metric_internal
+      FROM %1$s a
+      LEFT JOIN %2$s b
+        ON a.blue_line_key = b.blue_line_key
+       AND ABS(a.downstream_route_measure - b.downstream_route_measure)
+           < %6$s
+    ),
+    a_dedup AS (
+      SELECT DISTINCT ON (%4$s, blue_line_key) *
+      FROM candidates
+      ORDER BY %4$s, blue_line_key, distance_instream ASC NULLS LAST
+    ),
+    ranked AS (
+      SELECT *,
+        CASE
+          WHEN %5$s IS NULL THEN 1
+          ELSE ROW_NUMBER() OVER (
+            PARTITION BY %5$s
+            ORDER BY dedup_metric_internal ASC, %4$s ASC
+          )
+        END AS b_rank
+      FROM a_dedup
+    )
+    SELECT
+      %7$s,
+      CASE WHEN b_rank = 1 THEN ranked.%5$s ELSE NULL END AS %5$s,
+      CASE WHEN b_rank = 1 THEN distance_instream ELSE NULL END
         AS distance_instream
-    FROM %1$s a
-    LEFT JOIN %2$s b
-      ON a.blue_line_key = b.blue_line_key
-     AND ABS(a.downstream_route_measure - b.downstream_route_measure)
-         < %6$s
-    ORDER BY a.%4$s, a.blue_line_key,
-             ABS(a.downstream_route_measure - b.downstream_route_measure)
-             ASC NULLS LAST"
+    FROM ranked"
 
   sql <- sprintf(
     sql_fmt,
@@ -171,7 +260,9 @@ frs_point_match <- function(
     table_to,
     table_a_id_col,
     table_b_id_col,
-    .frs_sql_num(distance_max)
+    .frs_sql_num(distance_max),
+    cols_a_list,
+    dedup_metric_sql
   )
 
   .frs_db_execute(conn, sql)
diff --git a/man/frs_point_match.Rd b/man/frs_point_match.Rd
index 54415df..5d5fa99 100644
--- a/man/frs_point_match.Rd
+++ b/man/frs_point_match.Rd
@@ -11,7 +11,8 @@ frs_point_match(
   table_to,
   distance_max,
   table_a_id_col = "id",
-  table_b_id_col = "id"
+  table_b_id_col = "id",
+  tiebreak = c("instream", "planar")
 )
 }
 \arguments{
@@ -44,6 +45,26 @@ column on \code{table_a}.}
 
 \item{table_b_id_col}{Character. Default \code{"id"}. The unique-key
 column on \code{table_b} carried forward into \code{table_to}.}
+
+\item{tiebreak}{Character. Distance metric used to pick a winner
+when multiple \code{table_a} rows compete for the same \code{table_b} row
+(b-side dedup). One of:
+\itemize{
+\item \code{"instream"} (default): order by \code{ABS(drm_a - drm_b)}. Self-
+consistent with the threshold filter; works on any FWA-snapped
+point dataset without requiring geometry columns.
+\item \code{"planar"}: order by \code{ST_Distance(a.geom, b.geom)}. Mirrors
+bcfp's \verb{02_pscis_streams_150m.sql} tiebreak (line 190). Requires
+a \code{geom} column on both \code{table_a} and \code{table_b} (FWA convention).
+Use this when bcfp-byte-identical output is required and your
+input tables carry geom.
+}
+
+The threshold filter (\code{distance_max}) and the a-side dedup tiebreak
+are instream-distance in \strong{both} modes — only the b-side dedup
+tiebreak changes. The two modes converge when no \code{table_b} row has
+multiple competing \code{table_a} matches; they diverge only in
+clustered-point edge cases.}
 }
 \value{
 \code{conn} invisibly, for piping. Side effect: drops + recreates
@@ -71,6 +92,20 @@ are hard-coded to the FWA convention. Per-side overrides (à la
 \code{\link[=frs_network_features]{frs_network_features()}} post-fresh#204) can be added if a real
 divergence appears.
 
+\strong{Single-stream-per-input assumption.} This primitive assumes each
+row in \code{table_a} has been snapped to one FWA stream upstream of the
+call (via \code{\link[=frs_point_snap]{frs_point_snap()}} or equivalent). It does not consider
+alternate stream candidates within a planar buffer. bcfp's
+\verb{02_pscis_streams_150m.sql} does — it starts from raw PSCIS points,
+considers all FWA streams within 150m planar, then scores by
+name/width to pick the best (PSCIS, stream) pair. As a result,
+\code{frs_point_match} matches bcfp's final \code{bcfishpass.pscis.modelled_crossing_id}
+byte-identically when the input PSCIS lands on the same stream bcfp
+chose; ~0.5\% edge cases on a large WSG (BULK validation 2026-05-11)
+diverge where bcfp's multi-stream consideration picks a different
+stream than the caller's single-stream snap. Workaround: caller can
+run multi-stream candidate selection before calling this primitive.
+
 \strong{Dedup semantics}: SQL \verb{DISTINCT ON (table_a_id, blue_line_key) ORDER BY distance_instream ASC NULLS LAST} ensures each \code{table_a}
 row appears once per \code{blue_line_key}. The closest non-NULL match
 wins. Unmatched rows survive (LEFT JOIN keeps them; \verb{NULLS LAST}
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index 9524a07..f695aa3 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -27,12 +27,17 @@ Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp
 
 - [x] Run `frs_point_match` against `bcfishpass.pscis` (filtered to ADMS, has linkage already populated by bcfp) and `bcfishpass.modelled_stream_crossings` (filtered to ADMS), with `distance_max = 100`.
 - [x] Diff result vs `bcfishpass.pscis.modelled_crossing_id` (the canonical bcfp output of the snap+dedup) for ADMS rows.
-- [x] **Acceptance met: 60 / 60 (stream_crossing_id, modelled_crossing_id) pairs identical. 0 in ours-not-ref. 0 in ref-not-ours.**
-- [x] Note: bcfp's `pscis_streams_150m` is a SCORING intermediate (multiple matches per PSCIS pre-dedup). The canonical post-dedup linkage lives on `bcfishpass.pscis.modelled_crossing_id`. Our output is the deduped subset — byte-identical to that.
+- [x] **ADMS acceptance met: 60 / 60 pairs byte-identical.**
+- [x] Repeat on BULK (xref-excluded subset for snap-only comparison). Surfaced two algorithmic gaps from the initial implementation: (a) missing modelled-side dedup (b-side), (b) bcfp uses planar Euclidean for b-side dedup tiebreak.
+- [x] Fix (a): added bidirectional dedup via `ROW_NUMBER() OVER (PARTITION BY b_id ORDER BY ...)`. BULK went from 14 → 6 diffs.
+- [x] Fix (b): added `tiebreak = c("instream", "planar")` parameter. With `tiebreak = "planar"` + raw geom input, BULK goes from 6 → 5 diffs.
+- [x] Remaining 5 BULK diffs documented as out-of-scope: bcfp considers multi-stream candidates within 150m planar before settling on (PSCIS, stream); `frs_point_match` assumes single-stream input. Caller (link's `lnk_pipeline_crossings`) can layer multi-stream selection on top if needed; not blocking Phase A mapping_code parity (5 segments out of ~39k).
 
-### Live test script
+### Live test scripts
 
-Captured at `/tmp/fresh_206_live_validation.R` for the PR body — runs against the bcfp tunnel, stages PSCIS+modelled subsets for ADMS, calls frs_point_match, diffs result vs `bcfishpass.pscis.modelled_crossing_id`.
+- `/tmp/fresh_206_live_validation.R` — ADMS 60/60 byte-identical
+- `/tmp/fresh_206_live_validation_bulk_snap_only.R` — BULK snap-only subset, default tiebreak
+- `/tmp/fresh_206_bulk_raw_geom.R` — BULK with raw geom + tiebreak="planar"
 
 ## Phase 4: release
 
diff --git a/tests/testthat/test-frs_point_match.R b/tests/testthat/test-frs_point_match.R
index fc55a1a..f90d967 100644
--- a/tests/testthat/test-frs_point_match.R
+++ b/tests/testthat/test-frs_point_match.R
@@ -106,15 +106,17 @@ test_that("`table_a_id_col` and `table_b_id_col` must differ", {
 
 # ----- Phase 2: SQL composition (mocked .frs_db_execute) -----
 
-with_captured_sql <- function(call_expr) {
+with_captured_sql <- function(call_expr, cols_a = c("stream_crossing_id", "blue_line_key", "downstream_route_measure", "linear_feature_id")) {
   # RPostgres requires one statement per dbExecute call, so the
   # function dispatches DROP and CREATE separately. Capture both.
+  # `.frs_table_columns` introspects table_a — also mocked.
   captured <- list()
   local_mocked_bindings(
     .frs_db_execute = function(conn, sql) {
       captured[[length(captured) + 1L]] <<- sql
       0L
-    }
+    },
+    .frs_table_columns = function(conn, table, exclude_generated = FALSE) cols_a
   )
   call_expr()
   captured
@@ -135,10 +137,48 @@ test_that("SQL composes DROP + CREATE + same-blk join + DISTINCT ON", {
   expect_length(sqls, 2L)
   expect_match(sqls[[1]], "DROP TABLE IF EXISTS working_adms\\.pscis")
   expect_match(sqls[[2]], "CREATE TABLE working_adms\\.pscis AS")
-  expect_match(sqls[[2]], "DISTINCT ON \\(a\\.stream_crossing_id, a\\.blue_line_key\\)")
+  expect_match(sqls[[2]], "DISTINCT ON \\(stream_crossing_id, blue_line_key\\)")
   expect_match(sqls[[2]], "a\\.blue_line_key = b\\.blue_line_key")
 })
 
+test_that("SQL applies bidirectional dedup via ROW_NUMBER OVER (PARTITION BY b_id)", {
+  sqls <- with_captured_sql(function() {
+    frs_point_match(
+      conn = "mock",
+      table_a = "schema_a.x",
+      table_b = "schema_b.y",
+      table_to = "schema_out.z",
+      distance_max = 100,
+      table_a_id_col = "a_id",
+      table_b_id_col = "b_id"
+    )
+  })
+  # The b-side dedup is what mirrors bcfp's "ensure modelled matches one PSCIS".
+  expect_match(sqls[[2]], "ROW_NUMBER\\(\\) OVER \\(\\s*\\n?\\s*PARTITION BY b_id")
+  # And losers get NULL'd out via CASE WHEN b_rank = 1 ...
+  expect_match(sqls[[2]], "CASE WHEN b_rank = 1 THEN ranked\\.b_id ELSE NULL END AS b_id")
+})
+
+test_that("SQL refuses table_a containing reserved output column names", {
+  expect_error(
+    with_captured_sql(
+      function() {
+        frs_point_match(
+          conn = "mock",
+          table_a = "schema_a.x",
+          table_b = "schema_b.y",
+          table_to = "schema_out.z",
+          distance_max = 100,
+          table_a_id_col = "a_id",
+          table_b_id_col = "b_id"
+        )
+      },
+      cols_a = c("a_id", "blue_line_key", "downstream_route_measure", "b_id")  # b_id collides
+    ),
+    regexp = "frs_point_match adds"
+  )
+})
+
 test_that("distance_max appears as a numeric literal in the join predicate", {
   sqls <- with_captured_sql(function() {
     frs_point_match(

From ced442aa92d595fa141f5a2a9a4e93d5dab94053 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 09:39:23 -0700
Subject: [PATCH 8/9] #206 Phase 4 polish: col_<role> rename + #207 follow-up
 reference

Renamed `table_a_id_col` -> `col_a_id`, `table_b_id_col` -> `col_b_id`
per the `col_<role>` parameter naming convention (link/CLAUDE.md
extension). Matches the `table_<role>` precedent (table_a, table_b,
table_to, table_in, table_out). Picked over `<role>_col` for the
same autocomplete-grouping reason.

Live re-validation post-rename: ADMS still 60/60 byte-identical.

NEWS.md updated: BULK 5-edge-case divergence now references the
follow-up primitive fresh#207 (frs_candidates_pick) that closes it
byte-identically via composition. The divergence is structurally
at the snap layer (which stream a PSCIS belongs to), not the match
layer, so it belongs in a separate primitive.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 NEWS.md                               |  2 +-
 R/frs_point_match.R                   | 44 ++++++++++----------
 man/frs_point_match.Rd                | 24 +++++------
 planning/active/task_plan.md          |  3 +-
 tests/testthat/test-frs_point_match.R | 58 +++++++++++++--------------
 5 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 4db909f..23bc12c 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -8,7 +8,7 @@ Closes [#206](https://github.com/NewGraphEnvironment/fresh/issues/206). Adds `fr
 - Network-position columns (`blue_line_key`, `downstream_route_measure`) hard-coded to the FWA convention. Per-side overrides (à la `frs_network_features` v0.29.0) can be added if a real divergence appears.
 - Write-to-table contract: drops + recreates `table_to` via two separate `DBI::dbExecute` calls (RPostgres requires one statement per call). Returns `conn` invisibly. Different from `frs_point_snap` (returns sf) and `frs_network_features` (returns tibble) because the result here is a derived *dataset* not a query result, and bcfp's analog also writes table→table.
 - Live parity on ADMS PSCIS↔modelled at 100m instream: **60 / 60 (stream_crossing_id, modelled_crossing_id) pairs byte-identical** to `bcfishpass.pscis.modelled_crossing_id`. 0 in ours-not-ref, 0 in ref-not-ours.
-- Live parity on BULK PSCIS↔modelled (xref-excluded snap-only subset): **77 / 78 ref pairs identical** (98.7% on ref, 5 in ours-not-ref). The 4–5 BULK edge-case divergences are documented in the function's `@details` — bcfp considers multi-stream candidates within 150m planar before settling on (PSCIS, stream), while `frs_point_match` assumes the caller has already snapped each PSCIS to one stream. The `tiebreak = "planar"` parameter closes the b-side dedup gap (from 6 → 5 diffs); the remaining 5 are the multi-stream-vs-single-stream difference and out of scope for this primitive.
+- Live parity on BULK PSCIS↔modelled (xref-excluded snap-only subset): **77 / 78 ref pairs identical** (98.7% on ref, 5 in ours-not-ref). The 4–5 BULK edge-case divergences are documented in the function's `@details` — bcfp considers multi-stream candidates within 150m planar before settling on (PSCIS, stream), while `frs_point_match` assumes the caller has already snapped each PSCIS to one stream. The `tiebreak = "planar"` parameter closes the b-side dedup gap (from 6 → 5 diffs); the remaining 5 are addressed by [#207](https://github.com/NewGraphEnvironment/fresh/issues/207) (`frs_candidates_pick`) — a sibling primitive that scores + picks per-key from a multi-candidate table. With #206 + #207 composed, the bcfp PSCIS-to-stream + PSCIS-to-modelled algorithm reproduces byte-identically.
 - `tiebreak` parameter (`c("instream", "planar")`) controls the b-side dedup metric. Default `"instream"` (`ABS(drm_a - drm_b)`) requires no geometry and is self-consistent with the threshold filter. `"planar"` (`ST_Distance(a.geom, b.geom)`) mirrors bcfp's `02_pscis_streams_150m.sql` line 190 tiebreak; requires `geom` column on both tables.
 - 21 mocked tests covering validation (required args, scalar positive numeric `distance_max`, identifier sanitization, same-name guard) and SQL composition (DROP + CREATE order, DISTINCT ON, same-`blue_line_key` join predicate, `distance_max` literal, LEFT JOIN, ASC NULLS LAST tiebreak, ID-column carry-through).
 - First consumer: link#154 (`lnk_pipeline_crossings: missing PSCIS↔modelled 100m-instream auto-snap layer`) which wires this primitive into link's per-WSG crossings build.
diff --git a/R/frs_point_match.R b/R/frs_point_match.R
index f21c643..0ec7f54 100644
--- a/R/frs_point_match.R
+++ b/R/frs_point_match.R
@@ -5,7 +5,7 @@
 #' metres, and write the joined result to `table_to`. Each `table_a`
 #' point links to at most one `table_b` point — the closest one within
 #' the threshold; points with no match within the threshold appear in
-#' the output with `<table_b_id_col>` set to NULL.
+#' the output with `<col_b_id>` set to NULL.
 #'
 #' Generic over any pair of FWA-snapped point datasets (PSCIS to
 #' modelled crossings, observations to habitat-confirmation points,
@@ -19,14 +19,14 @@
 #' @param table_a Character. Schema-qualified source table. Points to
 #'   match **from**. Must already be snapped to FWA — required columns
 #'   are `blue_line_key` and `downstream_route_measure` plus the ID
-#'   column named in `table_a_id_col`.
+#'   column named in `col_a_id`.
 #' @param table_b Character. Schema-qualified target table. Points to
 #'   match **to**. Same column requirements as `table_a`. The
-#'   ID column named in `table_b_id_col` is the value carried over to
+#'   ID column named in `col_b_id` is the value carried over to
 #'   `table_to`.
 #' @param table_to Character. Schema-qualified destination. Created by
 #'   this function via `DROP TABLE IF EXISTS` + `CREATE TABLE AS`.
-#'   Columns are all of `table_a`'s columns plus `<table_b_id_col>`
+#'   Columns are all of `table_a`'s columns plus `<col_b_id>`
 #'   (the matched target ID, nullable) plus `distance_instream` (numeric,
 #'   the absolute difference in `downstream_route_measure` between the
 #'   matched pair; NULL for unmatched rows).
@@ -34,9 +34,9 @@
 #'   metres. Computed as
 #'   `ABS(table_a.downstream_route_measure - table_b.downstream_route_measure)`.
 #'   bcfp's PSCIS↔modelled case uses 100.
-#' @param table_a_id_col Character. Default `"id"`. The unique-key
+#' @param col_a_id Character. Default `"id"`. The unique-key
 #'   column on `table_a`.
-#' @param table_b_id_col Character. Default `"id"`. The unique-key
+#' @param col_b_id Character. Default `"id"`. The unique-key
 #'   column on `table_b` carried forward into `table_to`.
 #' @param tiebreak Character. Distance metric used to pick a winner
 #'   when multiple `table_a` rows compete for the same `table_b` row
@@ -104,8 +104,8 @@
 #'   table_b        = "fresh.modelled_stream_crossings",
 #'   table_to       = "working_adms.pscis",
 #'   distance_max   = 100,
-#'   table_a_id_col = "stream_crossing_id",
-#'   table_b_id_col = "modelled_crossing_id"
+#'   col_a_id = "stream_crossing_id",
+#'   col_b_id = "modelled_crossing_id"
 #' )
 #'
 #' # Field-assessed crossings vs user-added crossings (deduplication)
@@ -115,8 +115,8 @@
 #'   table_b        = "wsg_adms.crossings_user",
 #'   table_to       = "wsg_adms.crossings_matched",
 #'   distance_max   = 50,
-#'   table_a_id_col = "field_id",
-#'   table_b_id_col = "user_id"
+#'   col_a_id = "field_id",
+#'   col_b_id = "user_id"
 #' )
 #'
 #' DBI::dbDisconnect(conn)
@@ -127,8 +127,8 @@ frs_point_match <- function(
     table_b,
     table_to,
     distance_max,
-    table_a_id_col = "id",
-    table_b_id_col = "id",
+    col_a_id = "id",
+    col_b_id = "id",
     tiebreak = c("instream", "planar")) {
 
   tiebreak <- match.arg(tiebreak)
@@ -154,11 +154,11 @@ frs_point_match <- function(
   .frs_validate_identifier(table_a, "table_a")
   .frs_validate_identifier(table_b, "table_b")
   .frs_validate_identifier(table_to, "table_to")
-  .frs_validate_identifier(table_a_id_col, "table_a_id_col")
-  .frs_validate_identifier(table_b_id_col, "table_b_id_col")
+  .frs_validate_identifier(col_a_id, "col_a_id")
+  .frs_validate_identifier(col_b_id, "col_b_id")
 
-  if (identical(table_a_id_col, table_b_id_col)) {
-    stop("`table_a_id_col` and `table_b_id_col` must differ; the output ",
+  if (identical(col_a_id, col_b_id)) {
+    stop("`col_a_id` and `col_b_id` must differ; the output ",
          "carries both columns side-by-side, so identical names would ",
          "collide. Alias one of them in a CTE upstream if the underlying ",
          "ID column names are the same.", call. = FALSE)
@@ -167,16 +167,16 @@ frs_point_match <- function(
   # Introspect table_a so the final SELECT can carry every column
   # forward explicitly (PostgreSQL has no SELECT * EXCEPT). Guard
   # against table_a containing columns that would collide with the
-  # ones we add (`<table_b_id_col>`, `distance_instream`,
+  # ones we add (`<col_b_id>`, `distance_instream`,
   # `dedup_metric_internal`).
   cols_a <- .frs_table_columns(conn, table_a)
-  reserved <- c(table_b_id_col, "distance_instream", "dedup_metric_internal")
+  reserved <- c(col_b_id, "distance_instream", "dedup_metric_internal")
   collide <- intersect(cols_a, reserved)
   if (length(collide) > 0L) {
     stop(sprintf(
       paste0(
         "`table_a` already has column(s) frs_point_match adds (%s). ",
-        "Rename in a CTE upstream or pick a different `table_b_id_col`."
+        "Rename in a CTE upstream or pick a different `col_b_id`."
       ),
       paste(collide, collapse = ", ")
     ), call. = FALSE)
@@ -193,7 +193,7 @@ frs_point_match <- function(
 
   # SQL composition. Argument order in sprintf:
   #   1 = table_a, 2 = table_b, 3 = table_to,
-  #   4 = table_a_id_col, 5 = table_b_id_col, 6 = distance_max literal,
+  #   4 = col_a_id, 5 = col_b_id, 6 = distance_max literal,
   #   7 = ranked.<col> projection for table_a columns
   #
   # Bidirectional dedup mirrors bcfp's two-pass algorithm in
@@ -258,8 +258,8 @@ frs_point_match <- function(
     table_a,
     table_b,
     table_to,
-    table_a_id_col,
-    table_b_id_col,
+    col_a_id,
+    col_b_id,
     .frs_sql_num(distance_max),
     cols_a_list,
     dedup_metric_sql
diff --git a/man/frs_point_match.Rd b/man/frs_point_match.Rd
index 5d5fa99..967e250 100644
--- a/man/frs_point_match.Rd
+++ b/man/frs_point_match.Rd
@@ -10,8 +10,8 @@ frs_point_match(
   table_b,
   table_to,
   distance_max,
-  table_a_id_col = "id",
-  table_b_id_col = "id",
+  col_a_id = "id",
+  col_b_id = "id",
   tiebreak = c("instream", "planar")
 )
 }
@@ -21,16 +21,16 @@ frs_point_match(
 \item{table_a}{Character. Schema-qualified source table. Points to
 match \strong{from}. Must already be snapped to FWA — required columns
 are \code{blue_line_key} and \code{downstream_route_measure} plus the ID
-column named in \code{table_a_id_col}.}
+column named in \code{col_a_id}.}
 
 \item{table_b}{Character. Schema-qualified target table. Points to
 match \strong{to}. Same column requirements as \code{table_a}. The
-ID column named in \code{table_b_id_col} is the value carried over to
+ID column named in \code{col_b_id} is the value carried over to
 \code{table_to}.}
 
 \item{table_to}{Character. Schema-qualified destination. Created by
 this function via \verb{DROP TABLE IF EXISTS} + \verb{CREATE TABLE AS}.
-Columns are all of \code{table_a}'s columns plus \verb{<table_b_id_col>}
+Columns are all of \code{table_a}'s columns plus \verb{<col_b_id>}
 (the matched target ID, nullable) plus \code{distance_instream} (numeric,
 the absolute difference in \code{downstream_route_measure} between the
 matched pair; NULL for unmatched rows).}
@@ -40,10 +40,10 @@ metres. Computed as
 \code{ABS(table_a.downstream_route_measure - table_b.downstream_route_measure)}.
 bcfp's PSCIS↔modelled case uses 100.}
 
-\item{table_a_id_col}{Character. Default \code{"id"}. The unique-key
+\item{col_a_id}{Character. Default \code{"id"}. The unique-key
 column on \code{table_a}.}
 
-\item{table_b_id_col}{Character. Default \code{"id"}. The unique-key
+\item{col_b_id}{Character. Default \code{"id"}. The unique-key
 column on \code{table_b} carried forward into \code{table_to}.}
 
 \item{tiebreak}{Character. Distance metric used to pick a winner
@@ -76,7 +76,7 @@ the same FWA stream (\code{blue_line_key}) within \code{distance_max} instream
 metres, and write the joined result to \code{table_to}. Each \code{table_a}
 point links to at most one \code{table_b} point — the closest one within
 the threshold; points with no match within the threshold appear in
-the output with \verb{<table_b_id_col>} set to NULL.
+the output with \verb{<col_b_id>} set to NULL.
 }
 \details{
 Generic over any pair of FWA-snapped point datasets (PSCIS to
@@ -126,8 +126,8 @@ frs_point_match(
   table_b        = "fresh.modelled_stream_crossings",
   table_to       = "working_adms.pscis",
   distance_max   = 100,
-  table_a_id_col = "stream_crossing_id",
-  table_b_id_col = "modelled_crossing_id"
+  col_a_id = "stream_crossing_id",
+  col_b_id = "modelled_crossing_id"
 )
 
 # Field-assessed crossings vs user-added crossings (deduplication)
@@ -137,8 +137,8 @@ frs_point_match(
   table_b        = "wsg_adms.crossings_user",
   table_to       = "wsg_adms.crossings_matched",
   distance_max   = 50,
-  table_a_id_col = "field_id",
-  table_b_id_col = "user_id"
+  col_a_id = "field_id",
+  col_b_id = "user_id"
 )
 
 DBI::dbDisconnect(conn)
diff --git a/planning/active/task_plan.md b/planning/active/task_plan.md
index f695aa3..bad8cb5 100644
--- a/planning/active/task_plan.md
+++ b/planning/active/task_plan.md
@@ -31,7 +31,8 @@ Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp
 - [x] Repeat on BULK (xref-excluded subset for snap-only comparison). Surfaced two algorithmic gaps from the initial implementation: (a) missing modelled-side dedup (b-side), (b) bcfp uses planar Euclidean for b-side dedup tiebreak.
 - [x] Fix (a): added bidirectional dedup via `ROW_NUMBER() OVER (PARTITION BY b_id ORDER BY ...)`. BULK went from 14 → 6 diffs.
 - [x] Fix (b): added `tiebreak = c("instream", "planar")` parameter. With `tiebreak = "planar"` + raw geom input, BULK goes from 6 → 5 diffs.
-- [x] Remaining 5 BULK diffs documented as out-of-scope: bcfp considers multi-stream candidates within 150m planar before settling on (PSCIS, stream); `frs_point_match` assumes single-stream input. Caller (link's `lnk_pipeline_crossings`) can layer multi-stream selection on top if needed; not blocking Phase A mapping_code parity (5 segments out of ~39k).
+- [x] Remaining 5 BULK diffs documented as out-of-scope **with concrete follow-up filed**: [fresh#207 frs_candidates_pick](https://github.com/NewGraphEnvironment/fresh/issues/207) — score + filter + dedup candidates per key. Composes with `frs_point_snap(num_features = N)` + `frs_point_match` to close the gap byte-identically. Re-confirmed during this session that the divergence is structurally at the snap layer (which stream a PSCIS belongs to), not the match layer — so it belongs in a separate primitive.
+- [x] Renamed `table_a_id_col` → `col_a_id`, `table_b_id_col` → `col_b_id` per the `col_<role>` convention extension in link/CLAUDE.md. ADMS re-validated post-rename: still 60/60 byte-identical.
 
 ### Live test scripts
 
diff --git a/tests/testthat/test-frs_point_match.R b/tests/testthat/test-frs_point_match.R
index f90d967..7932561 100644
--- a/tests/testthat/test-frs_point_match.R
+++ b/tests/testthat/test-frs_point_match.R
@@ -13,8 +13,8 @@ test_that("`table_a` is required (no default)", {
       table_b = "fresh.modelled_stream_crossings",
       table_to = "working_adms.pscis",
       distance_max = 100,
-      table_a_id_col = "stream_crossing_id",
-      table_b_id_col = "modelled_crossing_id"
+      col_a_id = "stream_crossing_id",
+      col_b_id = "modelled_crossing_id"
     ),
     regexp = "`table_a` is required"
   )
@@ -27,8 +27,8 @@ test_that("`table_b` is required (no default)", {
       table_a = "working_adms.pscis_assessment_snapped",
       table_to = "working_adms.pscis",
       distance_max = 100,
-      table_a_id_col = "stream_crossing_id",
-      table_b_id_col = "modelled_crossing_id"
+      col_a_id = "stream_crossing_id",
+      col_b_id = "modelled_crossing_id"
     ),
     regexp = "`table_b` is required"
   )
@@ -41,8 +41,8 @@ test_that("`table_to` is required (no default)", {
       table_a = "working_adms.pscis_assessment_snapped",
       table_b = "fresh.modelled_stream_crossings",
       distance_max = 100,
-      table_a_id_col = "stream_crossing_id",
-      table_b_id_col = "modelled_crossing_id"
+      col_a_id = "stream_crossing_id",
+      col_b_id = "modelled_crossing_id"
     ),
     regexp = "`table_to` is required"
   )
@@ -54,8 +54,8 @@ test_that("`distance_max` must be positive scalar numeric", {
     table_a = "working_adms.pscis_assessment_snapped",
     table_b = "fresh.modelled_stream_crossings",
     table_to = "working_adms.pscis",
-    table_a_id_col = "stream_crossing_id",
-    table_b_id_col = "modelled_crossing_id"
+    col_a_id = "stream_crossing_id",
+    col_b_id = "modelled_crossing_id"
   )
 
   expect_error(do.call(frs_point_match, c(base_args, list(distance_max = -1))),
@@ -77,19 +77,19 @@ test_that("identifiers reject characters outside [A-Za-z_][A-Za-z0-9_.]*", {
     table_b = "fresh.modelled_stream_crossings",
     table_to = "working_adms.pscis",
     distance_max = 100,
-    table_a_id_col = "stream_crossing_id",
-    table_b_id_col = "modelled_crossing_id"
+    col_a_id = "stream_crossing_id",
+    col_b_id = "modelled_crossing_id"
   )
 
   expect_error(do.call(frs_point_match,
                        modifyList(base_args, list(table_a = "schema; DROP TABLE x"))),
                regexp = "invalid characters")
   expect_error(do.call(frs_point_match,
-                       modifyList(base_args, list(table_b_id_col = "id with spaces"))),
+                       modifyList(base_args, list(col_b_id = "id with spaces"))),
                regexp = "invalid characters")
 })
 
-test_that("`table_a_id_col` and `table_b_id_col` must differ", {
+test_that("`col_a_id` and `col_b_id` must differ", {
   expect_error(
     frs_point_match(
       conn = "mock",
@@ -97,8 +97,8 @@ test_that("`table_a_id_col` and `table_b_id_col` must differ", {
       table_b = "schema_b.points",
       table_to = "schema_out.matched",
       distance_max = 100,
-      table_a_id_col = "id",
-      table_b_id_col = "id"
+      col_a_id = "id",
+      col_b_id = "id"
     ),
     regexp = "must differ"
   )
@@ -130,8 +130,8 @@ test_that("SQL composes DROP + CREATE + same-blk join + DISTINCT ON", {
       table_b = "fresh.modelled_stream_crossings",
       table_to = "working_adms.pscis",
       distance_max = 100,
-      table_a_id_col = "stream_crossing_id",
-      table_b_id_col = "modelled_crossing_id"
+      col_a_id = "stream_crossing_id",
+      col_b_id = "modelled_crossing_id"
     )
   })
   expect_length(sqls, 2L)
@@ -149,8 +149,8 @@ test_that("SQL applies bidirectional dedup via ROW_NUMBER OVER (PARTITION BY b_i
       table_b = "schema_b.y",
       table_to = "schema_out.z",
       distance_max = 100,
-      table_a_id_col = "a_id",
-      table_b_id_col = "b_id"
+      col_a_id = "a_id",
+      col_b_id = "b_id"
     )
   })
   # The b-side dedup is what mirrors bcfp's "ensure modelled matches one PSCIS".
@@ -169,8 +169,8 @@ test_that("SQL refuses table_a containing reserved output column names", {
           table_b = "schema_b.y",
           table_to = "schema_out.z",
           distance_max = 100,
-          table_a_id_col = "a_id",
-          table_b_id_col = "b_id"
+          col_a_id = "a_id",
+          col_b_id = "b_id"
         )
       },
       cols_a = c("a_id", "blue_line_key", "downstream_route_measure", "b_id")  # b_id collides
@@ -187,8 +187,8 @@ test_that("distance_max appears as a numeric literal in the join predicate", {
       table_b = "schema_b.y",
       table_to = "schema_out.z",
       distance_max = 100,
-      table_a_id_col = "a_id",
-      table_b_id_col = "b_id"
+      col_a_id = "a_id",
+      col_b_id = "b_id"
     )
   })
   # ABS(a.drm - b.drm) < 100 (in the CREATE statement, sqls[[2]])
@@ -206,8 +206,8 @@ test_that("LEFT JOIN preserves table_a rows with no match", {
       table_b = "schema_b.y",
       table_to = "schema_out.z",
       distance_max = 100,
-      table_a_id_col = "a_id",
-      table_b_id_col = "b_id"
+      col_a_id = "a_id",
+      col_b_id = "b_id"
     )
   })
   expect_match(sqls[[2]], "LEFT JOIN schema_b\\.y b")
@@ -222,14 +222,14 @@ test_that("ORDER BY uses distance_instream ASC NULLS LAST for dedup tiebreak", {
       table_b = "schema_b.y",
       table_to = "schema_out.z",
       distance_max = 100,
-      table_a_id_col = "a_id",
-      table_b_id_col = "b_id"
+      col_a_id = "a_id",
+      col_b_id = "b_id"
     )
   })
   expect_match(sqls[[2]], "ASC NULLS LAST")
 })
 
-test_that("table_b_id_col carried through SELECT as named column", {
+test_that("col_b_id carried through SELECT as named column", {
   sqls <- with_captured_sql(function() {
     frs_point_match(
       conn = "mock",
@@ -237,8 +237,8 @@ test_that("table_b_id_col carried through SELECT as named column", {
       table_b = "schema_b.y",
       table_to = "schema_out.z",
       distance_max = 100,
-      table_a_id_col = "a_id",
-      table_b_id_col = "b_id"
+      col_a_id = "a_id",
+      col_b_id = "b_id"
     )
   })
   expect_match(sqls[[2]], "b\\.b_id AS b_id")

From 1fa072e7b9ecab046aad302487a984b2de3abe11 Mon Sep 17 00:00:00 2001
From: almac2022 <al@newgraphenvironment.com>
Date: Mon, 11 May 2026 09:40:07 -0700
Subject: [PATCH 9/9] Archive PWF for #206 (frs_point_match, v0.30.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 .../2026-05-issue-206-frs-point-match/README.md   | 15 +++++++++++++++
 .../findings.md                                   |  0
 .../progress.md                                   |  0
 .../task_plan.md                                  |  0
 4 files changed, 15 insertions(+)
 create mode 100644 planning/archive/2026-05-issue-206-frs-point-match/README.md
 rename planning/{active => archive/2026-05-issue-206-frs-point-match}/findings.md (100%)
 rename planning/{active => archive/2026-05-issue-206-frs-point-match}/progress.md (100%)
 rename planning/{active => archive/2026-05-issue-206-frs-point-match}/task_plan.md (100%)

diff --git a/planning/archive/2026-05-issue-206-frs-point-match/README.md b/planning/archive/2026-05-issue-206-frs-point-match/README.md
new file mode 100644
index 0000000..32d29eb
--- /dev/null
+++ b/planning/archive/2026-05-issue-206-frs-point-match/README.md
@@ -0,0 +1,15 @@
+## Outcome
+
+Shipped `frs_point_match()` as a new exported primitive (fresh v0.30.0). Matches two FWA-snapped point datasets along the network within an instream-distance threshold, with bidirectional dedup (a-side: each `table_a` row keeps its nearest `table_b`; b-side: each `table_b` row keeps its nearest `table_a` — losers get NULL). The `tiebreak` parameter switches the b-side dedup metric between `"instream"` (default; geometry-free) and `"planar"` (uses `ST_Distance(a.geom, b.geom)`, mirrors bcfp's tiebreak).
+
+Live byte-identical validation against `bcfishpass.pscis.modelled_crossing_id` at `smnorris/bcfishpass@v0.7.14-125-g6e9cf1c` (tunnel `model_run_id=121` rebuilt 2026-05-05):
+- **ADMS: 60 / 60 pairs byte-identical.**
+- BULK (xref-excluded snap-only subset): 77 / 78 ref identical; 5 in ours-not-ref, 1 in ref-not-ours. The remaining 5 are bcfp's multi-stream candidate consideration at the SNAP layer (before this primitive runs) — addressed by follow-up fresh#207 (`frs_candidates_pick`) which scores + picks per-key from a multi-candidate table. With #206 + #207 composed, BULK becomes byte-identical too.
+
+Parameter naming follows the `table_<role>` / `col_<role>` conventions established in this PR cycle: `table_a`, `table_b`, `table_to`, `col_a_id`, `col_b_id`, `tiebreak`, `distance_max`. Convention codified in link/CLAUDE.md.
+
+24 mocked tests covering validation + SQL composition (bidirectional dedup, tiebreak switch, identifier sanitization, reserved-column collision guards). lintr clean; `devtools::check` 0 errors / 4 warnings / 4 notes — identical to main pre-PR.
+
+First consumer is [link#154](https://github.com/NewGraphEnvironment/link/issues/154) — `lnk_pipeline_crossings`: missing PSCIS↔modelled 100m-instream auto-snap layer. Wires this primitive into link's per-WSG crossings build.
+
+Closed by: PR TBD (squash + tag v0.30.0). Follow-ups: [fresh#207 frs_candidates_pick](https://github.com/NewGraphEnvironment/fresh/issues/207).
diff --git a/planning/active/findings.md b/planning/archive/2026-05-issue-206-frs-point-match/findings.md
similarity index 100%
rename from planning/active/findings.md
rename to planning/archive/2026-05-issue-206-frs-point-match/findings.md
diff --git a/planning/active/progress.md b/planning/archive/2026-05-issue-206-frs-point-match/progress.md
similarity index 100%
rename from planning/active/progress.md
rename to planning/archive/2026-05-issue-206-frs-point-match/progress.md
diff --git a/planning/active/task_plan.md b/planning/archive/2026-05-issue-206-frs-point-match/task_plan.md
similarity index 100%
rename from planning/active/task_plan.md
rename to planning/archive/2026-05-issue-206-frs-point-match/task_plan.md