Roman DC2 + HLWAS selection-function products #53

Open
psferguson wants to merge 57 commits into main from roman_hlwas

@psferguson

Collaborator

Roman DC2 + HLWAS selection-function products

Stacked on roman_multisurvey (the unified StreamInjector / namespacing refactor); retarget to main after that merges.

Summary

Adds the full Roman catalog-level injection product suite to streamobs, derived from the Roman–Rubin DC2 image simulations (Troxel et al. 2023). This lets streams be injected and "observed" through Roman's depth, photometric errors, and star/galaxy selection — alongside LSST/DES via the unified multi-survey injector — and documents the derivation as a reusable methodology so LSST/DES products can later be re-derived self-consistently.

Surveys added

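| release | bands | depth maps |
| --- | --- | --- |
| roman_dc2 | F106, F129, F158 | truth-anchored S/N=5 maglim (reference HLIS mock) |
| roman_hlwas_wide | F158 | exposure-time–scaled quasi-depth |
| roman_hlwas_medium | F158 | exposure-time–scaled quasi-depth |
| roman_hlwas_all | F106, F158 | exposure-time–scaled quasi-depth |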

(F129 is not in the HLWAS exposure-time maps, and F184 is excluded because of a known chromatic-calibration issue. Wide/medium are F158-only by survey design.)

What's included

  • Detection + classification efficiency vs maglim, derived on true stars at S/N>5 (the S/N cut is baked into the curves, not re-applied at injection).
  • F158 size-envelope star classifier + single-band F158 detection.
  • Two-curve photometric-error model (catalog = reported SExtractor error driving the S/N cut; sample = truth-based scatter driving the noise draw), with a tracked afterburner (config/surveys/roman_photoerror_corrections.yaml) to clean the measured faint end reproducibly.
  • Quasi-depth maps for the HLWAS tiers: depth = 26.375 + 1.25·log10(t/770 s), anchored to the DC2 truth-anchored F158 median and the DC2 HLIS reference exposure (Troxel §3.1) — keeps HLWAS on the same depth convention as the DC2-derived tables.
  • Galaxy-misclassification curve for compact galaxies (size_true < 0.3″), via a positional Roman↔LSST-DC2 match and cosmoDC2 size_true (joined by cosmodc2_id). 100% size coverage of matched galaxies.
  • Configs for all Roman releases; a single shared roman_star_classifier used by both the product generator and the misclassification script (no drift).
  • Docs: a survey-agnostic selection-function methodology technote + thin per-release data sheets, and a rendered multi-survey CMD example (LSST g−r/r, Roman F106−F158/F158, cross-survey g−F158/F158; true & fully-observed).
  • Data archive tooling: bin/build_data_archive.py
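
The quasi-depth scaling quoted in the list above can be illustrated with a minimal sketch. The anchor magnitude (26.375) and reference exposure (770 s) come from this description; the function name `quasi_depth` and its signature are hypothetical and are not the code in this PR. The 1.25 coefficient equals 2.5·log10 of a square root, which is consistent with noise scaling as the square root of exposure time.

```python
import math

# Values quoted in the PR description: DC2 truth-anchored F158 median depth
# and the DC2 HLIS reference exposure time.
ANCHOR_MAG = 26.375
T_REF_S = 770.0


def quasi_depth(t_exp_s, anchor_mag=ANCHOR_MAG, t_ref_s=T_REF_S):
    """Quasi-depth (mag) for an exposure time in seconds.

    Hypothetical helper: depth = anchor + 1.25 * log10(t / t_ref).
    """
    if t_exp_s <= 0:
        raise ValueError("exposure time must be positive")
    return anchor_mag + 1.25 * math.log10(t_exp_s / t_ref_s)


# At the reference exposure the depth equals the anchor.
depth_at_reference = quasi_depth(770.0)
# Quadrupling the exposure time deepens the map by 1.25 * log10(4) mag.
depth_at_4x = quasi_depth(4 * 770.0)
```

A per-pixel exposure-time map could be passed through the same expression element by element; pixels with zero exposure would need masking first, since the logarithm is undefined there.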

Conventions

  • Columns: true magnitudes key on the survey name (roman_F158_true, release-independent); observed/error/flag columns key on the full namespace (roman_dc2_F158_obs, roman_dc2_flag_observed).
  • Roman band names are uppercase (F106/F129/F158), matching ugali and the literature.
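
As a sketch of the column convention above, the helpers below build the three kinds of column name. The helper names are hypothetical and only illustrate the naming rule; they are not functions from streamobs.

```python
def true_column(survey, band):
    """True magnitudes key on the survey name only (release-independent)."""
    return f"{survey}_{band}_true"


def namespaced_column(survey, release, band, suffix):
    """Observed/error columns key on the full {name}_{release} namespace."""
    namespace = f"{survey}_{release}"
    return f"{namespace}_{band}_{suffix}"


def flag_column(survey, release):
    """The observed flag also keys on the full namespace."""
    return f"{survey}_{release}_flag_observed"


# Names quoted in the Conventions list.
example_true = true_column("roman", "F158")                     # roman_F158_true
example_obs = namespaced_column("roman", "dc2", "F158", "obs")  # roman_dc2_F158_obs
example_flag = flag_column("roman", "dc2")                      # roman_dc2_flag_observed
```

Because the true-magnitude column omits the release, the same `roman_F158_true` column can be shared by several Roman releases while each release writes its own observed, error, and flag columns.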

Testing

  • New tests/test_roman.py + Roman entries in SURVEY_REGISTRY (incl. a true/obs column-convention round-trip and Vega→AB checks).
  • Full suite: 204 passed, 1 pre-existing unrelated failure (des_yr6 Y-band threshold, also present on the base branch).
  • Docs build clean (myst-nb; all example notebooks render).

Data

Data files are not committed. Runtime products ship via Zenodo (data.zip built by bin/build_data_archive.py). Update BASE_DATA_URL in download_data.py after upload.

Deferred to follow-up PRs

  • Injector consuming the misclassification curve → background distributions + matched-filter maps.
  • Other Roman bands/tiers; real ETC-derived (vs quasi) depth maps.

psferguson and others added 30 commits June 16, 2026 08:25
Merge MultiSurveyInjector into StreamInjector: one class now accepts a
single survey or several (name/Survey, {namespace: spec} dict, or list).
Output columns are always survey-namespaced (<survey>_<band>_true/_obs/_err,
<survey>_flag_observed), even for a single survey. The model emits
<survey>_<band>_true uniformly (single-survey isochrone is just the
one-survey case).

- observed.py: StreamInjector takes one-or-many surveys; shared
  _inject_one_survey/detect_flag now take an explicit survey; public
  complete_data() fills missing geometry/ra-dec/true-mags from the config;
  MultiSurveyInjector removed.
- model.py: complete_catalog never overwrites present values (fills only
  missing rows, per column); add `dist` (scalar/vector) to set distances
  directly without phi1 / a distance_modulus model.
- columns.py docs, scene yaml, demo builder + regenerated notebook updated.
- tests: namespaced columns; new TestCompleteCatalogPermutations.
- plan doc: single class, always-namespace, exact-nstars agreed; flagged
  for removal (migrate to docs) before merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reference band (survey.completeness_band) has its SNR>=5 cut baked into
the survey selection functions, so the per-band loop was double-applying it
(idempotent today, but conceptually double-counted and fragile), and a
special-cased "force" block was needed for the perfect-galstarsep flag
because the detection-efficiency curve does not bake the cut in.

Now the reference-band cut is applied exactly once (to both flags) and the
detection_mag_cut loop defaults to all injected bands except the reference
band, which is skipped. Behaviour-preserving (same bands cut, ref counted
once); removes the double-count and the flag asymmetry.

Document the path from (b) to (a) -- folding the cut into the
detection-efficiency curve itself -- in roman_multisurvey_plan.md, including
which data products must be regenerated (per-survey detection_eff tables)
and how.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t, isochrone masses

Responds to MatthieuPE's review on the roman_multisurvey PR.

API simplification:
- Collapse survey_bands+bands into one `bands` arg (list | {survey: bands} dict)
- Fold _complete_shared into the public complete_data; inject() delegates to it
- Make `survey` a required arg of detect_flag (drop the primary fallback)

Release-everywhere namespacing (Decision 1):
- Add Survey.namespace ({name}_{release}); injector keys surveys by it so the
  same survey at two releases yields distinct, non-colliding columns
- _load_survey accepts a {"survey":, "release":} spec dict; _inject_one_survey
  derives the namespace from survey.namespace
- `primary` is now the primary Survey; namespace string is primary_namespace
- Model: single-survey isochrone path is release-aware; _build_iso strips
  `release` before the ugali factory

SNR cut (Decision 2): the ref-band S/N>=5 cut is baked into both selection-
function curves, so remove the redundant re-application in _inject_one_survey
and correct the comment/docstring.

Isochrone/mass:
- Raise _MASS_STEPS 1000 -> 4000 (convergence check: ~600 vs ~220 distinct
  masses for a 5000-star stream; documented)
- Collapse single/multi isochrone builders into one _build_isochrones path
- sample_multisurvey accepts optional masses and returns (mags, masses);
  complete_catalog exposes a `mass` column and reuses a provided one

Tests: release-namespaced column updates; new multi-survey complete_data,
unified-bands, mandatory-survey, and isochrone-mass tests. 35 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Doc staleness introduced by the PR #47 behavior changes:
- column_convention.md / quickstart.md / multisurvey.md: column examples are now
  release-namespaced ({name}_{release}); state the namespacing rule. Quickstart
  Example 3 (lsst/yr1) no longer errors on copy-paste.
- Correct the false "the dict key is the column namespace" claim in
  multisurvey.md, the StreamInjector.__init__ docstring, and roman_rubin_demo.yaml
  (keys are containers; namespace is re-derived from each Survey).
- Document the {"survey":,"release":} spec-dict input form, the bands-dict
  validation, the mass column / user-supplied masses, _MASS_STEPS, and
  primary/primary_namespace.
- Reword the S/N "applied once" text: the reference-band cut is owned by the
  selection-function curves, not re-applied by the injector.
- Note the multi-survey isochrone requirement that surveys: keys equal the
  injector namespaces.

Code:
- set_completeness now accepts both "classification_eff" and the legacy
  misspelled "classifiction_eff" header, so the correct spelling documented in
  new_survey.md works without breaking the current (misspelled) data package.
  Also fixed the docstring's stale "eff_star" -> "detection_eff".
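
One way to accept both header spellings, as described in the Code note above, is sketched below. This is an illustrative assumption, not the actual `set_completeness` implementation; the helper name and the tuple of accepted headers are hypothetical.

```python
# Correct spelling first, then the legacy misspelling kept for existing data files.
ACCEPTED_CLASSIFICATION_HEADERS = ("classification_eff", "classifiction_eff")


def find_classification_column(header):
    """Return whichever accepted spelling is present in a table header."""
    for name in ACCEPTED_CLASSIFICATION_HEADERS:
        if name in header:
            return name
    raise KeyError(
        "no classification-efficiency column found; expected one of "
        f"{ACCEPTED_CLASSIFICATION_HEADERS}"
    )


# A legacy-style header resolves to the misspelled column name.
legacy_header = ["delta_mag", "detection_eff", "classifiction_eff"]
legacy_name = find_classification_column(legacy_header)
```

Checking the correct spelling first means a table that somehow carries both columns resolves to the documented name.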

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- SplineStreamModel._create_model called an undefined self._create_distance();
  rename to self._create_distance_modulus() (model.py:981). This made every
  spline-stream run (e.g. bin/generate_spline_stream.py) raise AttributeError.
- plotting.plot_inject read bare, non-namespaced columns (flag_observed, r_obs,
  ...) and so failed on the injector's {name}_{release}-namespaced output; derive
  the namespace from survey.namespace and use it for all flag/true/obs columns.

Tests: test_spline_model.py (instantiate + sample, skipped if the spline data
file is absent) and test_plotting.py (plot_inject smoke test on namespaced
output). 37 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The DES survey config is name=des, release=yr6 (namespace des_yr6) and the data
directory is data/surveys/des_yr6/, matching the LSST yr* convention. DES.md told
users to load release='y6' and use a des_y6/ folder, which fails (no des_y6
config). Standardize the release/dir on yr6 in DES.md and the new_survey.md
example. (Data-file basenames in the package remain des_y6_*; that's a separate
data-package rename.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Notes claimed magnitudes (like velocities) overwrite whole columns. In fact
phi1/phi2/dist, magnitudes, and the shared mass column fill only missing rows
(the preserve-existing contract); only velocities are recomputed wholesale.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…) and fix test_surveys collection error (pytest >=9)
psferguson and others added 27 commits June 18, 2026 11:49
- config/surveys/roman_hlwas.yaml: survey config keyed to the official
  5-sigma point-source depth convention (DC2-derived maglim map,
  completeness and photo-error tables in F158)
- notebooks/create_streamobs_files_hlwas.ipynb: full derivation from the
  Roman-Rubin DC2 mock (paper-exact flags==0 / S/N>5 / matched cuts,
  truth-duplicate and tile-margin handling, desqr-style depth maps,
  streamobs-format tables)
- notebooks/build_roman_dc2_det_truth.py: builder for the det->truth
  matched catalog the derivation runs on
- docs: new "Roman HLWAS survey files" page documenting the derivation,
  depth conventions (STScI HLWAS medians), and caveats
- rewrite the Roman HLWAS docs page as a self-contained data-section
  style derivation (no notebook references), documenting the matching,
  selection, completeness, photo-error, and depth-normalization choices
- embed diagnostic figures (mag distributions, combined star efficiency
  + galaxy misclassification, photo-error, maglim maps, LSST comparison),
  exported by the derivation notebook into docs/source/_static/roman_hlwas/
- notebook: merge the galaxy misclassification curve onto the star
  detection/classification efficiency plot

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…odel

Validating the reported SExtractor magerr against the truth shows it
underestimates the real scatter of (observed - true) by a flat factor
~1.9-2.0 in all four bands - the apparent excess depth over the official
5-sigma values is not real. Accordingly:

- photo-error table is now built from the truth-based scatter
  ((p84-p16)/2 of mag_auto - truth mag), not the reported magerr
- normalized _5sigps maglim maps removed; maps stay in the measured
  convention and the tables' delta_mag is keyed to the measured F158
  map median, one internal convention throughout
- new error-validation figure (reported vs truth-based, per band) in the
  notebook and docs; docs depth/error sections rewritten accordingly
- config points at the un-normalized maglim map
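
The truth-based scatter statistic named in the list above, (p84-p16)/2 of observed minus true magnitude, can be sketched with the standard library alone. The function name is hypothetical, and the real derivation script may bin by magnitude and use a different percentile routine.

```python
import statistics


def truth_based_scatter(mag_obs, mag_true):
    """Half the 16th-84th percentile width of (observed - true) magnitudes."""
    residuals = [obs - true for obs, true in zip(mag_obs, mag_true)]
    # n=100 yields 99 cut points: index 15 is the 16th percentile,
    # index 83 is the 84th percentile.
    cuts = statistics.quantiles(residuals, n=100, method="inclusive")
    return 0.5 * (cuts[83] - cuts[15])


# Toy input: residuals evenly spaced between 0 and 1 mag.
toy_obs = [i / 100 for i in range(101)]
toy_true = [0.0] * 101
toy_scatter = truth_based_scatter(toy_obs, toy_true)
```

For Gaussian residuals this half-width is close to the standard deviation, while being much less sensitive to outliers such as catastrophic mismatches.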

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… extinction docs

From a full re-read of Troxel+23:
- saturation IS modeled in the mock (clip ~1.1e5 e-, effects brighter
  than mag~17, Fig 7); config saturation 15 -> 17 (delta_saturation -10.1)
- restrict survey bands to f106/f129/f158 (F184 is deep-tier-only in the
  community HLWAS and has a known unresolved chromatic calibration issue)
- docs: detection is two-stage (2.5-sigma/minarea-5 segmentation + S/N>5
  catalog cut), not a single 5-sigma threshold
- docs: flags==0 documented as the observational pure-star-sample
  selection (removes blends), deliberately stricter than the paper's
  relaxed flags<=2 analysis cuts
- docs: official Roman WFI AB zeropoints table (2024-03-01 effective
  areas) + grey calibration offsets; extinction-coefficient section
  (CCM89 absolute values, band ratios validated to ~1% against the truth
  dereddening corrections; amplitudes not validatable due to the mock's
  dust bug)
- caveats: detector-optimistic simple model, unmasked diffraction spikes

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e 3)

A_band/E(B-V) = 1.1495/0.8497/0.6140 for F106/F129/F158 (synphot,
solar Phoenix spectrum), replacing the CCM89 estimates (which agreed to
1-2%). Docs cite the document and cross-reference its zeropoints against
the effective-area ecsv values.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The star-classified sample is galaxy-dominated faintward of F158~25.5
(69% true galaxies at 25.5, 99% at 26.5), inflating the apparent scatter
(0.33 vs 0.15 mag at 25.5). New cell shows the same diagnostics for true
stars passing the star classification.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ename

- all selection-function products now use TRUE stars passing the star
  classification (photo-error table, depth maps; efficiency already did):
  the observationally star-classified sample is galaxy-dominated faintward
  of F158~25.5 and doubles the apparent faint-end scatter
- depth maps truth-anchored: desqr spatial structure from reported errors,
  absolute scale set where the truth-based scatter reaches S/N=5 (medians
  26.19/26.19/25.98/25.26); photo-error model = 0.217 mag at delta_mag=0
  by construction; config delta_saturation re-keyed (-9.0)
- two-panel photo-error figure (reported errors | sigma(true-obs) vs
  reported); single-panel magnitude distributions (color=band,
  linestyle=true/obs)
- derivation notebook converted to scripts/roman/create_streamobs_files_hlwas.py
  and removed from the repo (kept locally, gitignored); roman build scripts
  moved to scripts/roman/
- docs page renamed roman_hlwas -> roman_dc2, figures under
  _static/roman_dc2/, depth/error sections rewritten for the anchored
  convention

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The catalog's S/N>5 selection uses the reported detection-image flux
errors; truth-validating them (scatter of mag_auto - truth_bb_mag for
clean stars, bright-end floor removed in quadrature) gives an error
factor of 1.71, so a true S/N=5 selection is reported det_sn > 8.6
(keeps 81.6% of detections). Applied to every product selection; the
error-validation cell intentionally keeps the raw catalog.
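
The arithmetic behind the threshold above can be checked with a short sketch. It assumes the standard small-error relation between magnitude error and S/N, sigma_m = 2.5 / (ln 10 · S/N), and that reported errors are too small by the quoted factor, so reported S/N is the true S/N times that factor. The function names are hypothetical.

```python
import math


def mag_error_from_snr(snr):
    """Approximate magnitude error for a given flux signal-to-noise ratio."""
    return 2.5 / (math.log(10) * snr)


def reported_snr_threshold(true_snr, error_factor):
    """Reported S/N that corresponds to a given true S/N when the reported
    errors underestimate the real scatter by error_factor."""
    return true_snr * error_factor


# True S/N = 5 corresponds to about 0.217 mag, the value the photo-error
# model takes at delta_mag = 0.
sigma_at_depth = mag_error_from_snr(5.0)
# With the quoted error factor of 1.71 this gives 8.55, which matches the
# quoted det_sn > 8.6 cut after rounding.
det_sn_cut = reported_snr_threshold(5.0, 1.71)
```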

Effect: detection efficiency now falls off at the true depth
(0.82/0.59/0.18 at delta_mag = 0/+0.5/+1) instead of extending ~1 mag
past it; the photo-error table ends at delta ~+0.9 and still evaluates
to sigma = 0.217 at delta_mag = 0; anchored maglims unchanged. Combined
50% completeness now at F158 ~ 26.3.

The notebook copy is regenerated from the script and rerun (both stay
local/gitignored, including the converter helper).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The selection-function products are DC2-derived, so they now live under
the roman 'dc2' release: config/surveys/roman_dc2.yaml with data in
data/surveys/roman_dc2/ (tables + maglim maps moved there; the survey
loads and smoke-tests via SurveyFactory.create_survey('roman', 'dc2')).
config/surveys/roman_hlwas.yaml is a commented placeholder for the real
HLWAS footprint (exptime-scaled maglim maps in the same truth-anchored
convention, reusing the DC2 tables). Script OUT_DIR updated; notebook
regenerated and rerun.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ction

Replace the scalar class_star>0.5 cut with a single-band F158 size envelope as
the production star classifier in create_streamobs_files_hlwas.py: a two-sided
band in log(size) about the per-mag stellar locus, tuned to 0.875 purity
(DES Y6 0<=EXT_XGB<=1), bright-capped by the stellar log-size scatter,
single-peaked, frozen faintward of mag 24, with the upper bound flaring to
0.15" at mag 18. Detection is now treated as single-band F158 (true S/N>5 from
magerr_auto_H158, error factor 1.59) instead of the 4-band det_sn gate; the
envelope replaces class_star in every product selection.

Also wires the two-curve photo-error model into config/surveys/roman_dc2.yaml
(log_photo_error_catalog + log_photo_error_sample) and has the generator write
roman_photoerror_f158_catalog.csv.

Adds the classifier-comparison figure (envelope vs class_star>0.5 vs per-mag
optimized class_star; classifiction_eff + purity-on-secax) and updates
docs/source/roman_dc2.md for the new classifier, H-band detection, the shifted
depths (F158 maglim 25.98 -> 26.38, error factor 1.71 -> 1.59) and a quantified
explanation of the bright-end completeness plateau.

Note: running the generator also (re)writes the survey data products under
data/surveys/roman_dc2/ -- roman_photoerror_f158{,_catalog}.csv,
roman_stellar_efficiency_cutf158.csv, roman_dc2_maglim_f*_nside1024.fits.gz.
Those live beyond a symlink, outside the tracked tree, and are NOT in this
commit; regenerate them by running the script.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Keep the rendered-notebook builders (build_*_nb.py) un-tracked; the
production scripts and their products are the source of truth.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- fixing + testing reproducibility
- testing isochrone model
@MatthieuPE

Collaborator

I added some tests and fixed an issue of reproducibility of injection / sampling

Comment on lines +43 to +44
> `classifiction_eff` (the misspelling is intentional and load-bearing — the loader
> greps that exact string). Keep it when re-deriving products for other surveys.

Collaborator

Haha I did see that before, I guess it was intentional, we could / should solve it


2 participants