Introduce orm for task tag#353
Conversation
|
Warning Review limit reached
Next review available in: 44 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
WalkthroughThis PR replaces SQLAlchemy Changes
Sequence Diagram(s)sequenceDiagram
participant TagTaskEndpoint
participant DatabaseTasks
participant AsyncSession
participant TaskTag
TagTaskEndpoint->>DatabaseTasks: tag(id_, tag_, user_id, session)
DatabaseTasks->>TaskTag: construct TaskTag instance
DatabaseTasks->>AsyncSession: session.add(task_tag)
DatabaseTasks->>AsyncSession: session.flush()
AsyncSession-->>DatabaseTasks: IntegrityError on duplicate
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="src/database/users.py" line_range="138-139" />
<code_context>
return None
async def get_groups(self) -> list[UserGroup]:
- if self._groups is None:
- self._groups = await get_user_groups_for(
- user_id=self.user_id,
- connection=self._database,
- )
+ if self._groups:
+ return self._groups
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Caching of user groups breaks for users without groups.
Because `[]` is falsy, this guard causes users with no groups to query the DB on every `get_groups` call instead of caching the empty list. Using a `None` sentinel preserves the distinction between “not yet loaded” and “loaded but empty`:
```python
async def get_groups(self) -> list[UserGroup]:
if self._groups is not None:
return self._groups
if self._database is None:
msg = "`get_groups` can only be used when `connection` is provided on instantiation."
raise RuntimeError(msg)
self._groups = await get_user_groups_for(
user_id=self.user_id,
connection=self._database,
)
return self._groups
```
This avoids redundant queries for users without groups while keeping the caching behavior intact.
</issue_to_address>
### Comment 2
<location path="src/routers/openml/runs.py" line_range="181-183" />
<code_context>
setup_id=run.setup,
setup_string=ctx.setup.setup_string if ctx.setup else None,
parameter_setting=[
- ParameterSetting(name=p["name"], value=p["value"], component=p["flow_id"])
+ ParameterSetting(name=p.name, value=p.value, component=p.flow_id)
for p in ctx.parameter_rows
],
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Switching from mapping-style access to attribute-style access on ORM rows may be brittle.
Using `p.name` / `p.value` / `p.flow_id` relies on `UntypedRow` consistently exposing these as attributes, which can be fragile if column labels change or collide. Since this feeds public response construction, consider keeping mapping-style access, or ensure the upstream query defines explicit, stable labels so attribute names remain reliable.
```suggestion
parameter_setting=[
ParameterSetting(name=p["name"], value=p["value"], component=p["flow_id"])
for p in ctx.parameter_rows
],
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (91.48%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #353 +/- ##
==========================================
- Coverage 95.17% 94.54% -0.64%
==========================================
Files 74 76 +2
Lines 3710 3775 +65
Branches 244 248 +4
==========================================
+ Hits 3531 3569 +38
- Misses 114 139 +25
- Partials 65 67 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/conftest.py (1)
140-150: 🎯 Functional Correctness | 🔴 Critical | ⚡ Quick winMonkeypatched
routers.dependencies.expdb_connectionis never restored — leaks into later tests.Line 140 reassigns the module attribute
routers.dependencies.expdb_connection = override_expdb, but the teardown at lines 149-150 only resetsapp.dependency_overrides[...], not the module attribute itself. After this fixture's teardown,routers.dependencies.expdb_connectionpermanently points tooverride_expdb, which closes over a stale, already-rolled-backexpdb_testconnection from this test. Any later call toexpdb_session()(which reads the module-levelexpdb_connectionat call time) — in a later test or even in production code paths sharing this process — will use that dead connection.🛠️ Proposed fix: restore the module attribute on teardown
app.dependency_overrides[expdb_connection] = expdb_connection app.dependency_overrides[userdb_connection] = userdb_connection + routers.dependencies.expdb_connection = expdb_connectionNote: refactoring
expdb_session()to takeconnection: Annotated[AsyncConnection, Depends(expdb_connection)](flagged separately insrc/routers/dependencies.py) would remove the need for this monkeypatch entirely.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/conftest.py` around lines 140 - 150, The test fixture leaves the monkeypatched routers.dependencies.expdb_connection in place, so later tests can pick up a stale override. In the fixture that sets routers.dependencies.expdb_connection = override_expdb, add teardown to restore the original module-level expdb_connection after the async client context exits, alongside the existing app.dependency_overrides cleanup. Keep the fix localized to this fixture and use the expdb_connection and override_expdb symbols to ensure the module attribute is reset reliably.
🧹 Nitpick comments (4)
src/routers/openml/datasets.py (2)
335-352: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
_get_dataset_raise_otherwisestill typed asRow[Any]instead ofUntypedRow.The rest of this cohort (e.g.
study.py's_get_study_raise_otherwise) migrated equivalent helpers to the newUntypedRowalias, but this helper's return type annotation is left asRow[Any].♻️ Suggested fix
+from database.schema.base import UntypedRow + async def _get_dataset_raise_otherwise( dataset_id: Identifier, user: User | None, expdb: AsyncConnection, -) -> Row[Any]: +) -> UntypedRow:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/routers/openml/datasets.py` around lines 335 - 352, Update `_get_dataset_raise_otherwise` to match the cohort’s new typing by replacing the `Row[Any]` return annotation with `UntypedRow`. Keep the function logic unchanged, and ensure any related imports in `datasets.py` are adjusted so the helper aligns with `_get_study_raise_otherwise` and other migrated helpers.
356-362: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueStale
assert ... is not Noneleft over from optional-dependency era.
expdb/user_db/expdb_dbare now requiredDepends(...)parameters (no= Nonedefault), so these asserts can never fail and are dead code.list_datasetsabove already had its analogous assert removed when the parameter became required; these two spots were missed.🧹 Suggested cleanup
async def get_dataset_features( dataset_id: Identifier, expdb: Annotated[AsyncConnection, Depends(expdb_connection)], user: Annotated[User | None, Depends(fetch_user)] = None, ) -> list[Feature]: - assert expdb is not None # noqa: S101 await _get_dataset_raise_otherwise(dataset_id, user, expdb)async def get_dataset( dataset_id: Identifier, user_db: Annotated[AsyncConnection, Depends(userdb_connection)], expdb_db: Annotated[AsyncConnection, Depends(expdb_connection)], user: Annotated[User | None, Depends(fetch_user)] = None, ) -> DatasetMetadata: - assert user_db is not None # noqa: S101 - assert expdb_db is not None # noqa: S101 dataset = await _get_dataset_raise_otherwise(dataset_id, user, expdb_db)Also applies to: 453-461
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/routers/openml/datasets.py` around lines 356 - 362, Remove the stale non-null assertion in get_dataset_features, since expdb is now a required Depends parameter and the assert can never fail. Apply the same cleanup to the analogous required-dependency check in the other affected router function mentioned in the review (the one using user_db/expdb_db), following the pattern already used in list_datasets.src/routers/openml/study.py (1)
66-72: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueStale
assert expdb is not Noneleft over from optional-dependency era.
expdbis now a requiredDepends(expdb_connection)parameter in all three endpoints (no= Nonedefault), so these asserts are unreachable dead code. Same pattern flagged insrc/routers/openml/datasets.py.🧹 Suggested cleanup
async def attach_to_study( ... expdb: Annotated[AsyncConnection, Depends(expdb_connection)], ) -> AttachDetachResponse: - assert expdb is not None # noqa: S101 if user is None:async def create_study( ... expdb: Annotated[AsyncConnection, Depends(expdb_connection)], ) -> dict[Literal["study_id"], int]: - assert expdb is not None # noqa: S101 if study.main_entity_type == StudyType.RUN and study.tasks:async def get_study( alias_or_id: Identifier | str, expdb: Annotated[AsyncConnection, Depends(expdb_connection)], user: Annotated[User | None, Depends(fetch_user)] = None, ) -> Study: - assert expdb is not None # noqa: S101 study = await _get_study_raise_otherwise(alias_or_id, user, expdb)Also applies to: 121-126, 158-163
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/routers/openml/study.py` around lines 66 - 72, Remove the stale `assert expdb is not None` checks from `attach_to_study` and the other OpenML endpoints using the same `expdb_connection` dependency pattern, since `expdb` is now always required and the asserts are dead code. Keep the `Depends(expdb_connection)` parameter on each handler and clean up the redundant null-checks in the affected `study.py` and matching `datasets.py` endpoint functions.src/database/datasets.py (1)
113-121: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valueGood defensive fix for
e.orig is None.Prevents an
AttributeErrorwhen unpackinge.orig.argsiforigis unexpectedlyNone. Note the samecode, msg = e.orig.argsunpacking pattern is likely present inflows.tag,runs.tag, andsetups.tag(per graph references) — consider applying the same guard there for consistency if not already covered.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/database/datasets.py` around lines 113 - 121, Add the same defensive null check used in datasets.py before unpacking `e.orig.args` in the other IntegrityError handlers mentioned by the graph references, especially in `flows.tag`, `runs.tag`, and `setups.tag`. Locate the matching exception blocks that do `code, msg = e.orig.args`, guard against `e.orig is None` there as well, and keep the existing ForeignKeyConstraintError/DuplicatePrimaryKeyError mapping unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/database/users.py`:
- Around line 139-141: The caching check in the user groups lookup is using a
truthy test, which causes an already-cached empty list to be treated as missing.
Update the groups accessor in the user model to use an explicit None check on
the cached field (for example, in the method that returns self._groups) so an
empty list is returned from cache instead of re-fetching or raising again.
---
Outside diff comments:
In `@tests/conftest.py`:
- Around line 140-150: The test fixture leaves the monkeypatched
routers.dependencies.expdb_connection in place, so later tests can pick up a
stale override. In the fixture that sets routers.dependencies.expdb_connection =
override_expdb, add teardown to restore the original module-level
expdb_connection after the async client context exits, alongside the existing
app.dependency_overrides cleanup. Keep the fix localized to this fixture and use
the expdb_connection and override_expdb symbols to ensure the module attribute
is reset reliably.
---
Nitpick comments:
In `@src/database/datasets.py`:
- Around line 113-121: Add the same defensive null check used in datasets.py
before unpacking `e.orig.args` in the other IntegrityError handlers mentioned by
the graph references, especially in `flows.tag`, `runs.tag`, and `setups.tag`.
Locate the matching exception blocks that do `code, msg = e.orig.args`, guard
against `e.orig is None` there as well, and keep the existing
ForeignKeyConstraintError/DuplicatePrimaryKeyError mapping unchanged.
In `@src/routers/openml/datasets.py`:
- Around line 335-352: Update `_get_dataset_raise_otherwise` to match the
cohort’s new typing by replacing the `Row[Any]` return annotation with
`UntypedRow`. Keep the function logic unchanged, and ensure any related imports
in `datasets.py` are adjusted so the helper aligns with
`_get_study_raise_otherwise` and other migrated helpers.
- Around line 356-362: Remove the stale non-null assertion in
get_dataset_features, since expdb is now a required Depends parameter and the
assert can never fail. Apply the same cleanup to the analogous
required-dependency check in the other affected router function mentioned in the
review (the one using user_db/expdb_db), following the pattern already used in
list_datasets.
In `@src/routers/openml/study.py`:
- Around line 66-72: Remove the stale `assert expdb is not None` checks from
`attach_to_study` and the other OpenML endpoints using the same
`expdb_connection` dependency pattern, since `expdb` is now always required and
the asserts are dead code. Keep the `Depends(expdb_connection)` parameter on
each handler and clean up the redundant null-checks in the affected `study.py`
and matching `datasets.py` endpoint functions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0780e173-dcec-4820-935c-f5780119637f
📒 Files selected for processing (23)
.pre-commit-config.yamlsrc/core/formatting.pysrc/database/datasets.pysrc/database/evaluations.pysrc/database/flows.pysrc/database/runs.pysrc/database/schema/__init__.pysrc/database/schema/base.pysrc/database/schema/tags.pysrc/database/setups.pysrc/database/studies.pysrc/database/tasks.pysrc/database/users.pysrc/main.pysrc/routers/dependencies.pysrc/routers/openml/datasets.pysrc/routers/openml/runs.pysrc/routers/openml/study.pysrc/routers/openml/tasks.pytests/conftest.pytests/routers/openml/runs_get_test.pytests/routers/openml/task_tag_test.pytests/users.py
|
Code coverage decrease explained by:
|
Description
Start using ORM layer for simple things, first up: tags.