iddata refactoring #40

Open
matthewcornell wants to merge 7 commits into main from mc/idmodels-iddata-refactor

@matthewcornell
Member

implementation of design4.md

Contributor

@lshandross lshandross left a comment

I put all my comments on the code in this review. We can go over them together and elaborate on or resolve them as we see fit, then perhaps ask Claude to address the remaining ones.

Comment thread src/iddata/ancillary/population.py Outdated
Comment thread src/iddata/sources/nhsn.py Outdated
Comment thread src/iddata/ancillary/population.py Outdated
Comment thread src/iddata/ancillary/population.py Outdated
Comment thread src/iddata/ancillary/population.py Outdated
Comment thread src/iddata/__init__.py Outdated
Comment thread tests/iddata/unit/test_load_data.py Outdated
Comment thread tests/iddata/unit/test_load_data.py Outdated
Comment thread tests/iddata/unit/test_sources.py
Comment thread tests/iddata/unit/test_sources.py Outdated
Comment thread src/iddata/constants.py Outdated
matthewcornell and others added 3 commits May 12, 2026 10:30
- Move _load_us_census() from nhsn.py to ancillary/population.py, its natural home
- Fix PopulationData.load() to return season-indexed (location, season, pop, log_pop) instead of most-recent-only; DiseaseDataLoader now merges ancillary by ["location", "season"] when the ancillary has a season column
- Restore original two-error logic in NHSNDataSource.load() for as_of < 2024-11-15 (separate errors for drop_pandemic_seasons=False and disease != FLU); add else branch for _load_from_nhsn; extract shared postprocessing into _postprocess()
- Add load_fips_mappings() helper to utils.py; update all sources to use it
- ILINet: move source-specific scaling to load() (before _aggregate_to_fips), set source before aggregation; replace deprecated groupby.apply(pd.DataFrame) with .agg(); add Region 10 comment
- FluSurvNet: fix _load_base age_labels default to ["Overall"]; inline _load_us_census_for_flusurv(); fix deprecated fillna(method="ffill") → .ffill(); fix deprecated W-sat → W-SAT; replace .apply(pd.DataFrame) with .agg()
- NSSP: fix error message to use "NSSPDataSource"; move source assignment before _fill_missing_states; replace .apply(pd.DataFrame) with .agg() in _fill_missing_states
- Improve base.py DataSource docstring (location can be HSA NCI ID, agg_level examples, season/season_week wording)
- Remove unused S3_BUCKET from constants.py
- Bump version to 0.2.0
- Tests: remove stale transform columns from mock source df; add NSSPDataSource custom-disease test; add NSSP case to test_load_data_sources; simplify FluSurvNet parametrize (no spurious conditional)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
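
For reviewers, the conditional merge described in the PopulationData bullet above might look roughly like the sketch below. This is only an illustration under assumptions: `merge_ancillary` is a hypothetical helper name, the column values are made up, and the actual DiseaseDataLoader code in this PR may be structured differently.

```python
import pandas as pd


def merge_ancillary(df: pd.DataFrame, ancillary: pd.DataFrame) -> pd.DataFrame:
    """Left-join ancillary columns onto df (hypothetical helper, not from the PR)."""
    # Join on season as well only when the ancillary table is season-indexed.
    keys = ["location", "season"] if "season" in ancillary.columns else ["location"]
    return df.merge(ancillary, on=keys, how="left")


# Toy data; all values are invented for illustration.
observations = pd.DataFrame({
    "location": ["25", "25", "36"],
    "season": ["2022/23", "2023/24", "2023/24"],
    "inc": [10.0, 12.0, 30.0],
})
population = pd.DataFrame({
    "location": ["25", "25", "36"],
    "season": ["2022/23", "2023/24", "2023/24"],
    "pop": [7_000_000, 7_050_000, 19_500_000],
})

# Season-indexed ancillary: each row gets the population for its own season.
merged = merge_ancillary(observations, population)

# Ancillary without a season column: falls back to joining on location only.
latest_only = population.drop_duplicates("location", keep="last").drop(columns="season")
merged_latest = merge_ancillary(observations, latest_only)
```

With a season-indexed table each observation picks up the population for its own season; without a season column every row for a location receives the same value.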
Only called once; no other sources use this split pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
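
The pandas replacements named in the first commit message (`.ffill()`, `W-SAT`, and `.agg()` in place of `.apply(pd.DataFrame)`) can be sketched on toy data as follows. The frames and column names here are assumptions for illustration; the real calls in ILINet, FluSurvNet, and NSSP operate on different columns and may use a different aggregation.

```python
import pandas as pd

# Forward-fill a gap with .ffill() rather than fillna(method="ffill").
series = pd.Series([1.0, None, 3.0])
filled = series.ffill()

# Use the upper-case weekly alias: weeks ending on Saturday.
week_ends = pd.date_range("2024-01-06", periods=3, freq="W-SAT")

# Aggregate per group with named aggregation instead of groupby.apply(pd.DataFrame).
df = pd.DataFrame({
    "location": ["01", "01", "02"],
    "inc": [1.0, 2.0, 5.0],  # made-up values
})
totals = df.groupby("location", as_index=False).agg(inc=("inc", "sum"))
```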
@matthewcornell
Member Author

DEC: Bump version to "2.0.0"

2 participants