Skip to content

feat: overhaul archive search logic across client#648

Merged
drake-nominal merged 1 commit intomainfrom
deidukas/revamp-archive-in-search
Mar 18, 2026
Merged

feat: overhaul archive search logic across client#648
drake-nominal merged 1 commit intomainfrom
deidukas/revamp-archive-in-search

Conversation

@drake-nominal
Copy link
Copy Markdown
Collaborator

@drake-nominal drake-nominal commented Mar 9, 2026

Summary

Revamps archive-aware search across the SDK by introducing a shared ArchiveStatusFilter flow, threading it through public search APIs, and adding end-to-end coverage for the new behavior.

What changed

  • Added a public ArchiveStatusFilter enum and exported it from nominal.core.
  • Added archive_status support to search APIs that previously only returned non-archived results:
    • NominalClient.search_datasets
    • NominalClient.search_secrets
    • NominalClient.search_videos
    • NominalClient.search_runs
    • NominalClient.search_checklists
    • NominalClient.search_assets
    • NominalClient.search_events
    • NominalClient.search_workbooks
    • NominalClient.search_workbook_templates
    • Asset.search_workbooks
    • Run.search_workbooks
  • Deprecated legacy archive booleans where applicable using the _NotProvided + @warn_on_deprecated_argument pattern:
    • archived
    • include_archived
  • Centralized old/new argument resolution in resolve_effective_archive_status(...) so the archive-status mapping and conflict handling live in one place.
  • Updated pagination helpers to pass archive-status filters through to backend search requests for assets, runs, events, videos, secrets, checklists, and data reviews.
  • Added temporary query backfills for datasets, workbooks, and workbook templates where backend filtering is not fully supported yet.
  • Updated workbook search pagination to request archived/draft records and rely on query-side filtering until the server-side request flags can be removed.
  • Fixed dataset search plumbing so dataset archive-status searches are correctly forwarded through query construction and pagination.

Test coverage

  • Added e2e coverage for archive_status on top-level public search methods.
  • Added e2e coverage for both archive_status and deprecated include_archived on:
    • NominalClient.search_workbooks
    • Asset.search_workbooks
    • Run.search_workbooks
  • Tightened search test setup to use shared archived/active fixtures and more precise filters so the staging suite stays deterministic.
  • Cleaned up mypy/ruff issues in the touched test files.

Testing

  • pytest tests/e2e/test_search.py --profile staging
  • ruff check tests
  • mypy tests

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Open in Devin Review

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Mar 9, 2026

Greptile Summary

This PR overhauled archive-aware search across the SDK by introducing a shared ArchiveStatusFilter enum, threading it through all public search APIs (search_datasets, search_runs, search_assets, search_events, search_videos, search_secrets, search_checklists, search_workbooks, search_workbook_templates, Asset.search_workbooks, Run.search_workbooks), and deprecating the legacy archived / include_archived booleans via the _NotProvided + @warn_on_deprecated_argument pattern. Resolution of deprecated args is centralised in resolve_effective_archive_status. Workspace RID resolution was also refactored from direct attribute access (self._clients.workspace_rid) to a lazy-cached resolve_default_workspace_rid() method backed by a thread-safe LazyField. End-to-end coverage for all three filter values (NOT_ARCHIVED, ARCHIVED, ANY) and the deprecation warning path was added for every touched search method.

Key changes and findings:

  • ArchiveStatusFilter enum exported from nominal.core with to_api_archived_statuses() helper; server-side filtering used for events, runs, assets, checklists, secrets, videos; query-side backfill used for datasets, workbooks, and workbook templates (TODOs left for future server-side support).
  • LazyField[T] provides thread-safe lazy initialisation of the default workspace; get_or_init is protected by a Lock; lock-free reads in resolve_workspace are safe because _value transitions only from _UNSET → set.
  • create_or_find_asset now searches in WorkspaceSearchType.DEFAULT instead of WorkspaceSearchType.ALL for unpinned clients (no pinned workspace_rid). This silently narrows the search scope: assets in non-default workspaces will be missed, potentially creating duplicates, and clients with no resolvable default workspace will now get a NominalConfigError where they previously succeeded.
  • The positional-arg detection branch in warn_on_deprecated_argument is acknowledged with a TODO — all deprecated params in this PR are keyword-only so this branch is currently dead code.

Confidence Score: 3/5

  • Mostly safe to merge; one logic regression in create_or_find_asset for unpinned clients should be addressed before release.
  • The archive-status threading is well-structured and consistently applied across all affected search methods. The deprecation pattern is clean. The main concern is create_or_find_asset: it previously searched all workspaces for unpinned clients and now searches only the resolved default workspace, which is a silent behavioural regression that could create duplicate assets or raise unexpected errors in production.
  • nominal/core/client.py — specifically the create_or_find_asset method (line 1181)

Important Files Changed

Filename Overview
nominal/core/_utils/query_tools.py Introduces ArchiveStatusFilter enum and resolve_effective_archive_status, centralising all deprecated-arg resolution. Backfill helpers correctly construct OR-clauses for query-side archive filtering. Edge-case conflicts between deprecated args are partially handled (new-vs-deprecated is guarded; deprecated-vs-deprecated falls through silently—already flagged in prior threads).
nominal/core/client.py Broadly correct archive-status threading through all search methods, but create_or_find_asset silently narrowed its search scope from "all workspaces" to "default workspace" for unpinned clients, which can cause duplicate-asset creation or NominalConfigError regressions.
nominal/core/_utils/pagination_tools.py Clean refactor: all paginated search helpers now accept archive_status and thread it into archived_statuses on the request object. Workbook/template helpers correctly defer to query-side filtering with a documenting TODO.
nominal/core/_clientsbunch.py Introduces LazyField-backed resolve_default_workspace_rid and resolve_workspace helpers. Thread-safety is achieved via Lock inside LazyField.get_or_init; the lockless is_initialized() / get() read path in resolve_workspace is safe because _value is irreversibly set.
nominal/_utils/dataclass_tools.py New LazyField[T] class provides thread-safe lazy initialisation with Lock. get_or_init is atomic; get and is_initialized are lock-free but safe since _value only transitions from _UNSET to a non-_UNSET value.
nominal/_utils/deprecation_tools.py Adds _NotProvided sentinel class. The positional-arg detection branch (len(args) > len(param_names) - 1) is noted with a TODO because all deprecated kwargs in this PR are keyword-only and can't actually be passed positionally—this branch is dead code in current usage.
nominal/core/workbook.py Correctly migrates _search_workbooks and _iter_search_workbooks to archive_status; removes the now-unused include_archived/archived params from internal helpers. Workbook pagination always requests archived+draft and relies on the query clause for filtering.
nominal/core/asset.py Correctly deprecates include_archived, adds archive_status, and wires resolve_effective_archive_status into Asset.search_workbooks. Also migrates workspace_rid accesses to resolve_default_workspace_rid().
nominal/core/run.py Mirrors the asset.py pattern for Run.search_workbooks: deprecates include_archived, adds archive_status, and delegates resolution to resolve_effective_archive_status.
tests/e2e/test_search.py Substantial new coverage: ArchiveSearchContext fixture creates paired active/archived resources, and _assert_archive_status_behavior / _assert_include_archived_behavior helpers verify all three filter values and the deprecation warning path.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Public search method\n(e.g. search_runs, search_workbooks)"] --> B{Deprecated args\npresent?}
    B -- "archive_status + legacy arg" --> C["ValueError raised"]
    B -- "include_archived=True" --> D["ArchiveStatusFilter.ANY"]
    B -- "archived=True" --> E["ArchiveStatusFilter.ARCHIVED"]
    B -- "neither / False values" --> F["ArchiveStatusFilter.NOT_ARCHIVED\n(default)"]
    B -- "archive_status provided" --> G["Use archive_status directly"]

    D --> H{Backend\nsupports\narchived_statuses?}
    E --> H
    F --> H
    G --> H

    H -- "Yes (runs, assets, events,\nchecklists, secrets, videos)" --> I["Pass archived_statuses list\nto paginated request"]
    H -- "No (datasets, workbooks,\ntemplates — backfill)" --> J["_backfill_*_archive_query_clause()\nAdds OR/AND query clause\nfor archive status"]

    I --> K["paginate_rpc → server-side filter"]
    J --> L["paginate_rpc → query-side filter\n(show_archived=True for workbooks)"]
Loading

Comments Outside Diff (1)

  1. nominal/core/client.py, line 1181 (link)

    create_or_find_asset search scope regression for unpinned clients

    The old code passed self._clients.workspace_rid directly to search_assets. When the client has no explicitly pinned workspace (workspace_rid=None), that None propagated through search_assets(workspace=None)workspace or WorkspaceSearchType.ALL → a search across all workspaces.

    With the new code, WorkspaceSearchType.DEFAULT is always used, which calls resolve_default_workspace_rid() and scopes the search to a single workspace. For unpinned clients (no configured workspace_rid) this causes two behavioral regressions:

    1. Duplicate assets can be silently created — if an existing asset with the same properties lives in a non-default workspace, it won't be found, and a new duplicate will be created in the default workspace.
    2. Hard failure where there was none — if no default workspace can be resolved, create_or_find_asset now raises NominalConfigError, whereas before it returned results from the global search.

    For pinned clients the behaviour is unchanged (both paths resolve to the same workspace RID). Consider keeping the old fallback for unpinned clients, or documenting that a workspace must be configured for create_or_find_asset to work correctly:

    # Preserve old all-workspaces fallback when no workspace is pinned:
    search_workspace = WorkspaceSearchType.DEFAULT if self._clients.workspace_rid else WorkspaceSearchType.ALL
    assets = self.search_assets(properties=properties, workspace=search_workspace)

Last reviewed commit: 0581daf

greptile-apps[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

greptile-apps[bot]

This comment was marked as resolved.

@drake-nominal drake-nominal force-pushed the deidukas/revamp-archive-in-search branch 2 times, most recently from 5611231 to 0581daf Compare March 9, 2026 22:36
@drake-nominal drake-nominal changed the base branch from main to deidukas/fix-workspace-resolution March 9, 2026 22:42
Base automatically changed from deidukas/fix-workspace-resolution to main March 12, 2026 21:41
@drake-nominal drake-nominal force-pushed the deidukas/revamp-archive-in-search branch from 0581daf to 38ee33b Compare March 16, 2026 13:01
Comment thread tests/e2e/test_search.py
@drake-nominal drake-nominal force-pushed the deidukas/revamp-archive-in-search branch from 38ee33b to 45f6e1f Compare March 17, 2026 00:13
devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown
Collaborator

@seanmreidy seanmreidy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question about type hint additions

Comment thread tests/test_write_stream.py
@drake-nominal drake-nominal force-pushed the deidukas/revamp-archive-in-search branch from 51f154b to d66def8 Compare March 18, 2026 13:31
@drake-nominal drake-nominal force-pushed the deidukas/revamp-archive-in-search branch from d66def8 to 676e992 Compare March 18, 2026 13:58
@drake-nominal
Copy link
Copy Markdown
Collaborator Author

@claude do a review again

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 18, 2026

Claude finished @drake-nominal's task —— View job


Code Review for PR #648

I've performed a comprehensive analysis of this PR that overhauls archive-aware search across the SDK. Here's my detailed review:

Overall Assessment

This is a well-structured and comprehensive refactor that successfully centralizes archive status filtering across the client SDK. The implementation is consistent, thoroughly tested, and maintains backward compatibility through proper deprecation patterns.

🔍 Key Strengths

1. Clean Architecture

  • ArchiveStatusFilter enum: Well-designed with clear semantics (ARCHIVED, NOT_ARCHIVED, ANY)
  • Centralized resolution: resolve_effective_archive_status() handles all deprecated argument mapping in one place
  • Consistent API threading: Archive status support added uniformly across all search methods

2. Backward Compatibility

  • Proper deprecation pattern: Uses @warn_on_deprecated_argument decorators with clear warning messages
  • Gradual migration path: _NotProvided sentinel allows distinguishing between explicit and implicit arguments
  • Preserved semantics: Legacy behavior maintained while guiding users to new API

3. Comprehensive Test Coverage

  • End-to-end coverage: ArchiveSearchContext fixture creates paired active/archived resources
  • Systematic testing: _assert_archive_status_behavior helper validates all three filter states
  • Deprecation testing: Explicit coverage for warning behavior when using deprecated args

⚠️ Issues to Address

1. Missing Conflict Detection (Medium Priority)

nominal/core/_utils/query_tools.py:48-79

The function properly prevents mixing new and deprecated arguments, but does not detect conflicts between deprecated arguments:

# This silently succeeds and returns ARCHIVED, but should raise an error:
resolve_effective_archive_status(archived=True, include_archived=False)

Recommendation: Add conflict detection:

if has_archived and has_include_archived:
    if archived != include_archived:  # Conflicting values
        raise ValueError(
            "Conflicting deprecated arguments: `archived` and `include_archived` "
            "have contradictory values. Use `archive_status` instead."
        )

2. archived=None Behavior Change (Low Priority)

nominal/core/_utils/query_tools.py:51-78

The old API where archived=None meant "no archive filter" now defaults to NOT_ARCHIVED. Since isinstance(None, bool) is False, both archived=None and not providing archived follow the same path.

Impact: Minimal - this is likely the intended behavior for the new API, but worth documenting.

3. Stacked Decorator Stack Level (Low Priority)

nominal/_utils/deprecation_tools.py:22-23

When multiple @warn_on_deprecated_argument decorators are stacked (like on search_workbooks), the inner decorator may report incorrect line numbers in warnings due to hardcoded stacklevel=2.

Fix: The code already attempts to handle this with _increment_inner_warn_wrapper_offsets() and _stacklevel_for_wrapper(), so this may already be resolved.

4. Type Hint Changes in Tests

tests/test_write_stream.py:71

The reviewer asked about intentional type hint additions. Based on the diff, these appear to be intentional improvements for type safety - adding explicit DataBatch annotations and using cast() for type-unsafe operations.

Resolved Issues from Previous Reviews

  • @functools.cache memory leak: No longer present in the code
  • CI workflow flags: --no-cov is properly included at line 64
  • Missing create_or_find_asset: This method doesn't appear to exist in the current codebase

🏗️ Implementation Quality

Query Construction

  • Server-side vs client-side filtering: Proper separation with TODOs for future server-side support
  • Backfill helpers: Clean implementation for datasets/workbooks/templates not yet supporting server-side archive filtering
  • OR clause construction: Correct query building for ANY filter status

Error Handling

  • Clear error messages: Helpful guidance when deprecated args conflict with new API
  • Proper exception types: ValueError for argument conflicts, NominalConfigError for configuration issues

Code Organization

  • Separation of concerns: Query construction, deprecation handling, and pagination logic cleanly separated
  • Consistent patterns: All search methods follow the same deprecation and resolution patterns

📝 Minor Recommendations

  1. Documentation: The archive status behavior in docstrings is clear and comprehensive
  2. Testing: Excellent coverage with both unit and integration tests
  3. Migration path: Clear deprecation warnings guide users to the new API

🎯 Approval Recommendation

This PR is ready for merge with the conflict detection fix applied. The core functionality is solid, the deprecation handling is proper, and the test coverage is comprehensive. The identified issues are minor and don't affect the core functionality.

The refactor successfully centralizes archive filtering logic while maintaining backward compatibility, which is exactly what was needed for this large-scale API change.


@drake-nominal drake-nominal merged commit 8b77944 into main Mar 18, 2026
10 checks passed
@drake-nominal drake-nominal deleted the deidukas/revamp-archive-in-search branch March 18, 2026 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants