Roman multisurvey#47
Open
psferguson wants to merge 31 commits into
Open
Conversation
856d196 to
d28d6da
Compare
This was referenced Jun 16, 2026
MatthieuPE
reviewed
Jun 16, 2026
Collaborator
Author
|
Make sure a unit test exists that tests the complete_data method in observed.py/StreamInjectior for the multi survey case. |
Collaborator
|
Do we really need to do a new instance for multi survey injector, rather than a single one which could take a list of surveys as input |
Merge MultiSurveyInjector into StreamInjector: one class now accepts a
single survey or several (name/Survey, {namespace: spec} dict, or list).
Output columns are always survey-namespaced (<survey>_<band>_true/_obs/_err,
<survey>_flag_observed), even for a single survey. The model emits
<survey>_<band>_true uniformly (single-survey isochrone is just the
one-survey case).
- observed.py: StreamInjector takes one-or-many surveys; shared
_inject_one_survey/detect_flag now take an explicit survey; public
complete_data() fills missing geometry/ra-dec/true-mags from the config;
MultiSurveyInjector removed.
- model.py: complete_catalog never overwrites present values (fills only
missing rows, per column); add `dist` (scalar/vector) to set distances
directly without phi1 / a distance_modulus model.
- columns.py docs, scene yaml, demo builder + regenerated notebook updated.
- tests: namespaced columns; new TestCompleteCatalogPermutations.
- plan doc: single class, always-namespace, exact-nstars agreed; flagged
for removal (migrate to docs) before merge.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reference band (survey.completeness_band) has its SNR>=5 cut baked into the survey selection functions, so the per-band loop was double-applying it (idempotent today, but conceptually double-counted and fragile), and a special-cased "force" block was needed for the perfect-galstarsep flag because the detection-efficiency curve does not bake the cut in. Now the reference-band cut is applied exactly once (to both flags) and the detection_mag_cut loop defaults to all injected bands except the reference band, which is skipped. Behaviour-preserving (same bands cut, ref counted once); removes the double-count and the flag asymmetry. Document the path from (b) to (a) -- folding the cut into the detection-efficiency curve itself -- in roman_multisurvey_plan.md, including which data products must be regenerated (per-survey detection_eff tables) and how. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MatthieuPE
reviewed
Jun 17, 2026
MatthieuPE
requested changes
Jun 17, 2026
Adding test + formating
…t, isochrone masses
Responds to MatthieuPE's review on the roman_multisurvey PR.
API simplification:
- Collapse survey_bands+bands into one `bands` arg (list | {survey: bands} dict)
- Fold _complete_shared into the public complete_data; inject() delegates to it
- Make `survey` a required arg of detect_flag (drop the primary fallback)
Release-everywhere namespacing (Decision 1):
- Add Survey.namespace ({name}_{release}); injector keys surveys by it so the
same survey at two releases yields distinct, non-colliding columns
- _load_survey accepts a {"survey":, "release":} spec dict; _inject_one_survey
derives the namespace from survey.namespace
- `primary` is now the primary Survey; namespace string is primary_namespace
- Model: single-survey isochrone path is release-aware; _build_iso strips
`release` before the ugali factory
SNR cut (Decision 2): the ref-band S/N>=5 cut is baked into both selection-
function curves, so remove the redundant re-application in _inject_one_survey
and correct the comment/docstring.
Isochrone/mass:
- Raise _MASS_STEPS 1000 -> 4000 (convergence check: ~600 vs ~220 distinct
masses for a 5000-star stream; documented)
- Collapse single/multi isochrone builders into one _build_isochrones path
- sample_multisurvey accepts optional masses and returns (mags, masses);
complete_catalog exposes a `mass` column and reuses a provided one
Tests: release-namespaced column updates; new multi-survey complete_data,
unified-bands, mandatory-survey, and isochrone-mass tests. 35 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Doc staleness introduced by the PR #47 behavior changes: - column_convention.md / quickstart.md / multisurvey.md: column examples are now release-namespaced ({name}_{release}); state the namespacing rule. Quickstart Example 3 (lsst/yr1) no longer errors on copy-paste. - Correct the false "the dict key is the column namespace" claim in multisurvey.md, the StreamInjector.__init__ docstring, and roman_rubin_demo.yaml (keys are containers; namespace is re-derived from each Survey). - Document the {"survey":,"release":} spec-dict input form, the bands-dict validation, the mass column / user-supplied masses, _MASS_STEPS, and primary/primary_namespace. - Reword the S/N "applied once" text: the reference-band cut is owned by the selection-function curves, not re-applied by the injector. - Note the multi-survey isochrone requirement that surveys: keys equal the injector namespaces. Code: - set_completeness now accepts both "classification_eff" and the legacy misspelled "classifiction_eff" header, so the correct spelling documented in new_survey.md works without breaking the current (misspelled) data package. Also fixed the docstring's stale "eff_star" -> "detection_eff". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- SplineStreamModel._create_model called an undefined self._create_distance();
rename to self._create_distance_modulus() (model.py:981). This made every
spline-stream run (e.g. bin/generate_spline_stream.py) raise AttributeError.
- plotting.plot_inject read bare, non-namespaced columns (flag_observed, r_obs,
...) and so failed on the injector's {name}_{release}-namespaced output; derive
the namespace from survey.namespace and use it for all flag/true/obs columns.
Tests: test_spline_model.py (instantiate + sample, skipped if the spline data
file is absent) and test_plotting.py (plot_inject smoke test on namespaced
output). 37 passed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The DES survey config is name=des, release=yr6 (namespace des_yr6) and the data directory is data/surveys/des_yr6/, matching the LSST yr* convention. DES.md told users to load release='y6' and use a des_y6/ folder, which fails (no des_y6 config). Standardize the release/dir on yr6 in DES.md and the new_survey.md example. (Data-file basenames in the package remain des_y6_*; that's a separate data-package rename.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Notes claimed magnitudes (like velocities) overwrite whole columns. In fact phi1/phi2/dist, magnitudes, and the shared mass column fill only missing rows (the preserve-existing contract); only velocities are recomputed wholesale. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collaborator
Author
|
Ok, I think I/cdawg addressed all of your comments.
|
MatthieuPE
reviewed
Jun 18, 2026
Correcting bug and moving notebooks
…) and fix test_surveys collection error (pytest >=9)
Collaborator
|
PR Ok for merging for me |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Multi-survey (Roman + Rubin) stream injection
Lets a single mock stream carry both Roman and Rubin/LSST photometry, with each band drawing its errors and detection from its own survey's maglim maps. Four phases (see docs/source/roman_multisurvey_plan.md):
Phase 1 — Survey holds two error curves: catalog (reported magerr, drives the S/N cut) and optional sample (true scatter, drives the noise draw).
⚠️ Behaviour changes
Phase 2 — injector de-hardcoded off {r, g}; all column names route through a new columns.py with a uniform _true/_obs/err scheme.
Phase 3 — IsochroneModel is multi-survey: masses drawn once and interpolated into every survey's bands (same physical star, consistent across surveys); Roman bands auto-converted Vega→AB.
Phase 4 — new MultiSurveyInjector: one shared sky placement + true-mag fill, then per-survey _obs/_err and _flag_observed. Adds a demo scene and a runnable Phases 1–4 notebook.
nstars now returns exactly N stars (was an emergent IMF count).
S/N cut applies to all injected bands (was hard-coded ["g"]).
Column names changed to _true/obs/err — not backward compatible with the old mag/magerr. Public inject() API unchanged.
Other
Isochrone configs switched Bressan2012 → Marigo2017; repo-wide black/isort pass.
Testing
pytest green; demo notebook executes with 0 errors. Real Roman/Rubin maglim maps aren't committed, so MultiSurveyInjector was validated against a stub survey (swapping in Survey.load(...) needs no injector changes).