refactor(dao): parametrize tests to run on FS and DB dao#3161
Conversation
|
|
||
| districts = dao.get_districts() | ||
| assert len(districts) == 2 | ||
| districts = [d for d in dao.get_districts() if d.id != "all areas"] |
There was a problem hiding this comment.
fs dao fixture is initialized with all area, db dao fixture isn't.
There was a problem hiding this comment.
Pull request overview
This PR refactors multiple DAO/command tests so they can run against both filesystem (FS) and database (DB) StudyDao implementations, and aligns FS DAOs toward raising domain exceptions (notably AreaNotFound) to reduce backend-specific branching in tests.
Changes:
- Parameterize several DAO test suites to run on both FS and DB backends (with DB-only sections kept for raw SQL assertions).
- Align FS DAO behaviors/exceptions (renewables, thermals, links, xpansion) to raise more specific exceptions (e.g.
AreaNotFound,CandidateNotFoundError,XpansionFileNotFoundError). - Update expected error messages in variant-study and integration tests to match
AreaNotFoundmessage format.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/variantstudy/model/command/test_update_renewable_cluster.py | Updates expected error message for missing area when updating renewables clusters. |
| tests/variantstudy/model/command/test_create_renewables_cluster.py | Updates expected error message for missing area when creating renewables clusters. |
| tests/study/dao/test_database_xpansion_dao.py | Generalizes xpansion DAO tests to StudyDao and adds backend-conditional assertions where behavior differs. |
| tests/study/dao/test_database_thermal_dao.py | Parameterizes thermal DAO tests across backends and separates DB-only cascade/SQL assertions. |
| tests/study/dao/test_database_scenario_builder_dao.py | Parameterizes scenario builder tests across backends and normalizes FS vs DB output differences. |
| tests/study/dao/test_database_renewable_dao.py | Parameterizes renewable DAO tests across backends and moves DB-only behavior to dedicated tests. |
| tests/study/dao/test_database_link_dao.py | Splits link DAO tests into shared vs DB-only (SQL inspection) sections and runs shared ones on both backends. |
| tests/study/dao/test_database_hydro_dao.py | Moves shared hydro tests to StudyDao and isolates DB-only behaviors (ValueError expectations, cascades, SQL checks). |
| tests/study/dao/test_database_district_dao.py | Parameterizes district DAO tests across FS/DB backends. |
| tests/integration/study_data_blueprint/test_renewable.py | Updates API error description assertions to match AreaNotFound message format. |
| antarest/study/storage/variantstudy/model/command/update_thermal_clusters.py | Switches to catching AreaNotFound when listing thermals per area. |
| antarest/study/storage/variantstudy/model/command/update_renewables_clusters.py | Switches to catching AreaNotFound when listing renewables per area. |
| antarest/study/dao/file/file_study_xpansion_dao.py | Improves FS xpansion exception semantics for missing candidates/config/files. |
| antarest/study/dao/file/file_study_thermal_dao.py | Maps FS tree missing-node errors to AreaNotFound and adds clearer exceptions on delete. |
| antarest/study/dao/file/file_study_renewable_dao.py | Adds AreaNotFound checks and improves delete/get behaviors for FS renewable DAO. |
| antarest/study/dao/file/file_study_link_dao.py | Improves FS link DAO exception types (AreaNotFound) and missing-node handling (ChildNotFoundError). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # def test_get_renewable_matrix_raises_when_missing(dao: StudyDao) -> None: | ||
| # save_area(dao, "Paris") | ||
| # dao.save_renewable("paris", RenewableCluster(id="battery", name="Battery")) | ||
|
|
||
| def test_get_renewable_matrix_raises_when_missing(db_dao: DatabaseStudyDao) -> None: | ||
| dao = db_dao | ||
| save_area(dao, "Paris") | ||
| dao.save_renewable("paris", RenewableCluster(id="battery", name="Battery")) | ||
| # with pytest.raises(RenewableClusterNotFound): | ||
| # dao.get_renewable_series("paris", "gas") | ||
| # with pytest.raises(AreaNotFound): | ||
| # dao.get_renewable_series("nonexistent", "gas") | ||
|
|
||
| with pytest.raises(RenewableClusterNotFound): | ||
| dao.get_renewable_series("paris", "gas") | ||
| with pytest.raises(AreaNotFound): | ||
| dao.get_renewable_series("nonexistent", "gas") | ||
|
|
||
| # def test_save_renewable_matrix_raises_when_missing(dao: StudyDao) -> None: | ||
| # save_area(dao, "Paris") | ||
|
|
||
| def test_save_renewable_matrix_raises_when_missing(db_dao: DatabaseStudyDao) -> None: | ||
| dao = db_dao | ||
| save_area(dao, "Paris") | ||
|
|
||
| with pytest.raises(RenewableClusterNotFound): | ||
| dao.save_renewable_series({"paris": {"gas": "missing-matrix-id"}}) | ||
| with pytest.raises(AreaNotFound): | ||
| dao.save_renewable_series({"nonexistent": {"gas": "missing-matrix-id"}}) | ||
| # with pytest.raises(RenewableClusterNotFound): | ||
| # dao.save_renewable_series({"paris": {"gas": "missing-matrix-id"}}) | ||
| # with pytest.raises(AreaNotFound): | ||
| # dao.save_renewable_series({"nonexistent": {"gas": "missing-matrix-id"}}) |
| names = {c.name for c in paris_clusters} | ||
| assert names == {"Battery", "Solar PaNELS"} | ||
| caps = {c.name: c.nominal_capacity for c in paris_clusters} | ||
| assert caps["Battery"] == 200.0 | ||
| assert caps["Solar PaNELS"] == 500.0 |
There was a problem hiding this comment.
Make test resilient if the FS and DB dao return identical result but with different order.
| caps2 = {c.name: c.nominal_capacity for c in paris_clusters} | ||
| assert caps2["Battery"] == 1000.0 | ||
| assert caps2["Solar PaNELS"] == 500.0 | ||
| assert caps2["Wind"] == 100.0 |
| if file_study.config.version >= STUDY_VERSION_8_3: | ||
| file_study.tree.save( | ||
| properties.adequacy_patch_properties.model_dump(mode="json", by_alias=True), | ||
| properties.adequacy_patch_properties.model_dump(mode="json", by_alias=True, exclude_none=True), |
There was a problem hiding this comment.
Well spotted, but seems to me that a None should not be considered a valid value for studies > 8.3
The "initialization" is supposed to fill it
| trimming = ThematicTrimming() | ||
| initialize_thematic_trimming_against_version(trimming, self.get_impl().get_version()) | ||
| return trimming.model_copy(update=row.thematic_trimming) |
There was a problem hiding this comment.
If we don't do this, version specific fields that weren't initialized during creation would return None instead of default value.
In practice this never happen as we always suply all fields, so this has no functional impact today.
However keeping both DAOs consistent prevents confusing discrepancies if either implementation is extended /used in a new module in the future
There was a problem hiding this comment.
Indeed they will be supplied on creation anyway, I am not sure we should keep this code:
it does not incentivize us to have a clean creation in the first place, and it blurs the logic of this DAO where we want complete data in the first place.
|
|
||
| values = {"study_id": study_id, "layer_id": layer.id, "name": layer.name} | ||
| upsert_one(session, LAYER_TABLE, values) | ||
| self.get_impl().save_layer_areas(layer.id, layer.areas) |
There was a problem hiding this comment.
In FS DAO we don't save areas
@override
def save_layer(self, layer: Layer) -> None:
file_study = self.get_file_study()
file_study.tree.save(layer.name, ["layers", "layers", "layers", layer.id])There was a problem hiding this comment.
Here we need to fix the DAO signature to make things clear.
If this method is not supposed to update areas, it must have as an argument a class which does not hold the list of areas.
We should have a simpler class, like LayerMetadata or something like that.
Else we should refactor the callers so that they indeed use only one method to save both metadata and areas, but that would probably require more work
MartinBelthle
left a comment
There was a problem hiding this comment.
I read all the tests but not the source code. Really nice work
| ) | ||
| UserResourceDataCreation(path=rel_path, resource_type=ResourceType.FILE, blob_id=blob_id) | ||
| ) | ||
| elif item.is_dir(): |
There was a problem hiding this comment.
As discussed, either we find a way to only return files and empty folders instead of rglob (not sure it exists), or we add a and not next(item.listdir())
| Recursively remove keys whose value is an empty dict. | ||
|
|
||
| Normalizes FS vs DB scenario-builder output: the FS backend leaves empty | ||
| area dicts after all clusters are deleted, while the DB backend prunes |
There was a problem hiding this comment.
As discussed, we should have the same behvior for FS and DB meaning we should initialize missing data for the get_scenario_by_type method in the DB DAO
| 2: PlaylistValues(status=False), | ||
| 3: PlaylistValues(status=False), | ||
| 4: PlaylistValues(status=False, weight=0.2), | ||
| 7: PlaylistValues(weight=1.1), |
There was a problem hiding this comment.
Makes me realize that it does not make sense to set a Playlist for year 7 when the study has only 4 years. We should probably validate this but won't be done inside this PR
| stmt = select(COMMENTS_TABLE.c.comments).where(COMMENTS_TABLE.c.study_id == dao.get_study_id()) | ||
| assert db_session.execute(stmt).scalar_one() == comments | ||
| if isinstance(dao, DatabaseStudyDao): | ||
| stmt = select(COMMENTS_TABLE.c.comments).where(COMMENTS_TABLE.c.study_id == dao.get_study_id()) |
There was a problem hiding this comment.
Here you keep the part of the test where we check the DB content btut everywhere else seems you removed it. I think we can remove it from here too
| assert db_session.execute(select(COST_VARIATION_INJECTION_TABLE)).fetchall() == [] | ||
| assert db_session.execute(select(COST_VARIATION_WITHDRAWAL_TABLE)).fetchall() == [] | ||
| if isinstance(dao, DatabaseStudyDao): | ||
| with db_session: |
There was a problem hiding this comment.
Same remark, I'm fine with keeping this but you removed such thigns from other tests so we should do the same thing everywhere
| ) -> StudyDao: | ||
| if backend == "database": | ||
| dao = build_db_dao(db_session, matrix_service, STUDY_VERSION_9_3) | ||
| ConstantsMatrixUsageProvider(dao.generator_matrix_constants, matrix_service) |
There was a problem hiding this comment.
This should already be done inside the build_db_dao
|
|
||
| def test_provider_includes_reserve_need_matrix(db_session: Session, matrix_service: ISimpleMatrixService) -> None: | ||
| dao = build_db_dao(db_session, matrix_service, STUDY_VERSION_10_0) | ||
| def test_provider_includes_reserve_need_matrix(dao_10_0: StudyDao, core_cache: ICache) -> None: |
There was a problem hiding this comment.
You can add a TODO inside this test saying that it'll be removed when we'll fully support the v10.0 as it will be integrated inside the test_garbage_collection one
| assert all_properties == {"paris": default_props, "london": default_props} | ||
|
|
||
| assert len(db_recorder.sql_statements) == 1, str(db_recorder) | ||
| assert len(db_recorder.sql_statements) == 2, str(db_recorder) |
There was a problem hiding this comment.
Specify that the 2nd query is to get the study version
|
|
||
| # Ensures we modified the properties accordingly | ||
| assert dao.get_area_properties(area_id) == new_properties | ||
| expected = new_properties.model_copy() |
There was a problem hiding this comment.
Not sure to understand why you need to copy. Couldn't you remove this line ?
| assert dao.get_timeseries_config() == TimeSeriesConfiguration() | ||
| assert dao.get_thematic_trimming() == expected_thematic_trimming | ||
| actual_general = dao.get_general_config() | ||
| assert actual_general == expected_general_config, explain_model_diff(actual_general, expected_general_config) |
There was a problem hiding this comment.
Minor: I like explain_model_diff but to me it should be used ponctually and not in the tests hard-coded, it makes the test harder to read IMO
|
|
||
| def test_thermal_matrices_lifecycle(db_session: Session, db_dao: DatabaseStudyDao) -> None: | ||
| dao = db_dao | ||
| def test_thermal_matrix_round_trip(dao: StudyDao, matrix_service) -> None: |
| constraints_result: dict[str, Any] = { | ||
| area_id: {storage_id: {} for storage_id in storages} | ||
| for area_id, storages in self.get_impl().get_all_st_storages().items() | ||
| } | ||
| for area_id in self.get_impl().get_all_area_ids(): | ||
| constraints_result.setdefault(area_id, {}) |
There was a problem hiding this comment.
align DB with FS. Entires must be set for each constraint.
| return self.get_impl().get_matrix(_get_load_matrix_path(area_id)) | ||
| return self._get_area_matrix(area_id, _get_load_matrix_path) |
There was a problem hiding this comment.
I just make FS dao raise same error than DB dao
| if renames: | ||
| self._apply_projection_renames(renames) |
There was a problem hiding this comment.
aligh FS with DB. Updates should cascade.
| projection = settings.sensitivity_config.projection if settings.sensitivity_config else [] | ||
| if projection: | ||
| existing_names = {c["name"] for c in self._get_all_xpansion_candidates().values()} | ||
| missing = [name for name in projection if name not in existing_names] | ||
| if missing: | ||
| raise CandidateNotFoundError("One or more candidates in the projection do not exist") |
There was a problem hiding this comment.
DB dao enforce it so FS should too. Plus this would make the DB dao look slower when it actually just does more work.
…ly, also revert formating
| revision = "4121d3b48393" | ||
| down_revision = "rp986cf862cy" |
There was a problem hiding this comment.
when i merged dev my ruff wasn't happy with single quotes
| if study_version >= STUDY_VERSION_8_4: | ||
| # FS gets this via the empty_study template's generaldata.ini; DB has no template. | ||
| optimization_preferences.transmission_capacities = TransmissionCapacities.LOCAL_VALUES |
There was a problem hiding this comment.
I can't put this in initialize_optimization_preferences_against_version because initialize_optimization_preferences_against_version is called each time we parse a file study, so on each read, and it would overwrite the default value (which is True and not None, so I can't just check is None, and I can't indentify an instance where transmission_capacities was set to True vs an instance where transmission_capacities was never updated).
There was a problem hiding this comment.
Yes, as discussed it's the right place :)
It's the equivalent of having different values between version templates, on the FS side.
| table, id_col = _AREA_ITEM_TABLE_MAP[scenario_type] | ||
| stmt = select(table).where(table.c.study_id == study_id) | ||
| result: dict[str, Any] = {} | ||
| result: dict[str, Any] = {area_id: {} for area_id in self.get_impl().get_all_area_ids()} |
There was a problem hiding this comment.
Why are you initializing for 2 if but not for the others ? Seems you should do it for all cases
| else: | ||
| activated = sum(1 for v in playlist.years.values() if v.status) | ||
| default_status = activated > len(playlist.years) // 2 | ||
| expanded = dict(playlist.years) |
There was a problem hiding this comment.
Minor but why are you using dict(...), playlist.years is already a dict
| default_status = True | ||
| else: | ||
| activated = sum(1 for v in playlist.years.values() if v.status) | ||
| default_status = activated > len(playlist.years) // 2 |
There was a problem hiding this comment.
Discussed the matter with Sylvain and we think you should always consider the default_status is True. It will be different from the FS but we'd actually want the FS to be like the DB for this case as it's less misleading.
You don't have to modify the FS Dao in this PR just consider the default_status to be True and we'll be good
| @override | ||
| def get_reserves_global_parameters(self, area_id: str) -> ReservesGlobalParameters: | ||
| file_study = self.get_file_study() | ||
| check_area_exists(file_study.config, area_id) |
There was a problem hiding this comment.
Makes me realize that you could use this in other DAOs you've modified to simplify a bit the reading
| @override | ||
| def get_all_reserves_global_parameters(self) -> ReservesGlobalParametersMapping: | ||
| file_study = self.get_file_study() | ||
| return {area_id: self.get_reserves_global_parameters(area_id) for area_id in file_study.config.areas} |
There was a problem hiding this comment.
Not sure to understand why you've changed this code. Was easier to read
|
|
||
| for area_id, value in data.items(): | ||
| for storage_id, constraints in value.items(): | ||
| if not self.st_storage_exists(area_id, storage_id): |
There was a problem hiding this comment.
This is not performant at all. For the rest of the modification you've done I'm okay with them because they either use the config or they are in the deletion methods which are almost never used. But here I think we should be careful with performances.
We should also use the config here or at least check the file only once per area to avoid several I/O operations on the same file (.config is faster we should use it and even do it once per area to avoid going through the sts list several times)
| # Mirror the DB read path which initializes version-specific defaults | ||
| # (enabled, ...) so consumers reading from `config` see the same | ||
| # fully-populated storage as the DB DAO returns. | ||
| initialize_st_storage(storage, study_data.version) |
There was a problem hiding this comment.
I think we don't need this, but I'll let Sylvain give his opinion
| filters_synthesis = values.pop( | ||
| "filter-synthesis", values.pop("filter_synthesis", values.pop("filters_synthesis", "")) | ||
| ) | ||
| filters_year = values.pop( | ||
| "filter-year-by-year", values.pop("filter_year_by_year", values.pop("filters_year", "")) | ||
| ) |
There was a problem hiding this comment.
Not sure to understand why you changed this code
| for area_id, ui_info in ui_info_map.items(): | ||
| ui_data = AreaUIData.model_validate(AreaUIFileData(**ui_info).to_config()) | ||
| if ui_data.layer_x: | ||
| ui_data.ui["layers"] = " ".join(sorted(ui_data.layer_x, key=int)) |
There was a problem hiding this comment.
This does not seem to be what we're doing in DB, I'm confused and the Area ui code sucks so much.
In DB there's no notion of if layer_x
To me we shouldn't modify this code to make it even harder to understand for a case that never happens.
sylvlecl
left a comment
There was a problem hiding this comment.
A few comments, could't review everything yet
| initialize_area_properties(props, study_version) | ||
| return props |
There was a problem hiding this comment.
In reality, this should never be useful for database implementation: the "initialization" goal is to replace missing data with their default values, but in database impl we should never have a missing data: the default value data should always be inserted from the start.
So I would be more in favor of removing this. If it's needed, it means there's a problem upstream.
|
|
||
| values = {"study_id": study_id, "layer_id": layer.id, "name": layer.name} | ||
| upsert_one(session, LAYER_TABLE, values) | ||
| self.get_impl().save_layer_areas(layer.id, layer.areas) |
There was a problem hiding this comment.
Here we need to fix the DAO signature to make things clear.
If this method is not supposed to update areas, it must have as an argument a class which does not hold the list of areas.
We should have a simpler class, like LayerMetadata or something like that.
Else we should refactor the callers so that they indeed use only one method to save both metadata and areas, but that would probably require more work
| if study_version >= STUDY_VERSION_8_4: | ||
| # FS gets this via the empty_study template's generaldata.ini; DB has no template. | ||
| optimization_preferences.transmission_capacities = TransmissionCapacities.LOCAL_VALUES |
There was a problem hiding this comment.
Yes, as discussed it's the right place :)
It's the equivalent of having different values between version templates, on the FS side.
| trimming = ThematicTrimming() | ||
| initialize_thematic_trimming_against_version(trimming, self.get_impl().get_version()) | ||
| return trimming.model_copy(update=row.thematic_trimming) |
There was a problem hiding this comment.
Indeed they will be supplied on creation anyway, I am not sure we should keep this code:
it does not incentivize us to have a clean creation in the first place, and it blurs the logic of this DAO where we want complete data in the first place.
| for area_id, ui_info in ui_info_map.items(): | ||
| ui_data = AreaUIData.model_validate(AreaUIFileData(**ui_info).to_config()) | ||
| if ui_data.layer_x: | ||
| ui_data.ui["layers"] = " ".join(sorted(ui_data.layer_x, key=int)) |
There was a problem hiding this comment.
Hmmm, I mostly agree. Indeed we ensure stronger consistency in the DB implementation, with foreign keys, cascade deletes and so on, which is a good thing: we can never have data for a layer that does not exist.
But here, if files contain inconsistencies, I don't think we should try to hide them on the fly.
There was a problem hiding this comment.
But let's not lose time on this, it's not important.
If we want to make this right, we'll need to dig a little more: for example it may make sense to have a layer absent from the list, but still have coordinates, because maybe the user wanted to mask it temporarily without losing the coordinates.
To be honest we don't care about that kid of details for now, it has not been well thought in the past, and we'll not do it now
sylvlecl
left a comment
There was a problem hiding this comment.
Mostly very good stuff bu I think there is a few places where the adaptation is not relevant.
In particular we need to have a clear idea of the rsponsibilities of the DAO layer vs higher layer, in particular for missing values and validation:
- DAO layer should validate its inputs against study version
- DB DAO should not bother filling missing values, because we forbid missing values
- FS DAO must fill missing values, because it's part of the format definition
| @override | ||
| def delete_layer(self, layer_id: str) -> None: | ||
| if layer_id == DEFAULT_LAYER_ID: | ||
| raise LayerNotAllowedToBeDeleted() |
| def get_renewable_series(self, area_id: str, renewable_id: str) -> pl.DataFrame: | ||
| return self.get_impl().get_matrix(_get_renewable_series_path(area_id, renewable_id)) | ||
| if area_id not in self.get_file_study().config.areas: | ||
| raise AreaNotFound(area_id) |
There was a problem hiding this comment.
Nice: seems like it's a very common check: maybe we should have a small function "check_area_exists" ?
There was a problem hiding this comment.
Ah indeed there's already a comment about this
| data = file_study.tree.get(_get_reserves_ini_path(area_id)) | ||
| except (ChildNotFoundError, KeyError): | ||
| data = {} | ||
| if GLOBAL_PARAMETERS_SECTION not in data: |
There was a problem hiding this comment.
Don't we want to use the default values here instead ?
There was a problem hiding this comment.
Elaborating on this:
having missing data in files is not the same as having missing data in DB.
- having missing data in files must be interpreted, in general, as "use the default values". This is not our convention, this is the convention of the simulator, and we must stick with it. We don't have control on the files that we receive, so we cannot assume that everything is defined
- having missing data in DB is an application error. We should assume that we don't have missing data in our DB, because WE filled it up in the first place.
So it's natural to have differences in that case:
- for FS, we must fill missing values with default values
- for DB, we should not do it: if there is missing data, it's an application error that we must fix in the code
| data = file_study.tree.get(_get_reserves_ini_path(area_id)) | ||
| except (ChildNotFoundError, KeyError): | ||
| continue | ||
| if GLOBAL_PARAMETERS_SECTION not in data: |
There was a problem hiding this comment.
Same here, I think we want default values instead of removing the area from the map.
| # Mirror the DB read path which initializes version-specific defaults | ||
| # (enabled, ...) so consumers reading from `config` see the same | ||
| # fully-populated storage as the DB DAO returns. | ||
| initialize_st_storage(storage, study_data.version) |
There was a problem hiding this comment.
I think we should not do it:
what we should have is a content-check on the object which is received, which should alreay be a "full" object, otherwise we should raise.
In that particular case, I think the DB side is not right either:
- on write it should perfom the content check
- on read it should not try to fill missing data. Same pattern as other comments
| filters_synthesis = values.pop( | ||
| "filter-synthesis", values.pop("filter_synthesis", values.pop("filters_synthesis", "")) | ||
| ) | ||
| filters_year = values.pop( | ||
| "filter-year-by-year", values.pop("filter_year_by_year", values.pop("filters_year", "")) | ||
| ) |
There was a problem hiding this comment.
Seems to me the only 2 places we use that class are:
- from the INI file (kebab case stuff):
for that we should just refactor_parse_links_filteringto useparse_linkwhich already handles the parsing logic - from the Link class:
there we should refactor theto_configmethod to just pass explicitly arguments to the LinkConfig constructor
Then we can get rid of that obscure validator

No description provided.