fix(new-webui): Add support for querying metadata from multiple datasets (fixes #1024).#1042

Merged
davemarco merged 31 commits into
y-scope:mainfrom
davemarco:datasetfix
Jul 9, 2025

@davemarco davemarco commented Jun 26, 2025


This PR is blocked by #1004 / #1050. We cannot test the UI fix until multiple datasets can be ingested.

Description

PR #868 changed the names of the metadata tables, which broke the UI (issue #1024).

This PR fixes the issue by first querying the list of datasets, then issuing a single runtime-generated query that contains one subquery per dataset and combines the results with UNION ALL. A single large query should be more performant than sending a separate query for each dataset.

This is complicated further by the fact that clp does not have datasets. In that case, the dataset query is not sent and the old single-table query is used.
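A minimal sketch of the query construction described above, not the code in this PR. The `clp_` prefix, the `archives` suffix, the `uncompressed_size` and `size` columns, and all function names are assumptions made for illustration. Passing `null` models clp (no datasets, old single-table query), and an empty list models clp-s before any dataset has been ingested.

```typescript
// Assumed names for illustration only; the real prefix, suffix, and columns
// come from the webui settings and the CLP metadata schema.
const TABLE_PREFIX = "clp_";
const ARCHIVES_SUFFIX = "archives";

// Dataset names are interpolated into SQL, so accept only simple identifiers.
const isSafeIdentifier = (name: string): boolean => /^[A-Za-z0-9_]+$/.test(name);

// `null` stands for clp, which has no datasets and uses the unqualified table.
const getArchivesTableName = (dataset: string | null): string => {
    if (null === dataset) {
        return `${TABLE_PREFIX}${ARCHIVES_SUFFIX}`;
    }
    if (false === isSafeIdentifier(dataset)) {
        throw new Error(`Unsafe dataset name: ${dataset}`);
    }

    return `${TABLE_PREFIX}${dataset}_${ARCHIVES_SUFFIX}`;
};

// Builds one aggregate query over every dataset's archives table.
// Returns `null` when there are no datasets, so the caller can skip the query.
const buildSpaceSavingsQuery = (datasets: string[] | null): string | null => {
    const targets: Array<string | null> = null === datasets ? [null] : datasets;
    if (0 === targets.length) {
        return null;
    }

    // One subquery per table, concatenated row-wise with UNION ALL.
    const subqueries = targets.map(
        (dataset) => `SELECT uncompressed_size, size FROM ${getArchivesTableName(dataset)}`
    );

    return (
        "SELECT SUM(uncompressed_size) AS total_uncompressed_size, " +
        "SUM(size) AS total_compressed_size " +
        `FROM (${subqueries.join(" UNION ALL ")}) AS combined`
    );
};
```

For example, `buildSpaceSavingsQuery(["alpha", "beta"])` produces one statement with two subqueries joined by UNION ALL, while `buildSpaceSavingsQuery(null)` produces the same statement over the single `clp_archives` table. The generated SQL grows linearly with the number of datasets, which is why this approach is only expected to hold up for a limited number of them.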

The solution is a bit crude, but should be fine for a limited number of datasets. I will create an issue about another solution we discussed where we add a dataset column instead of renaming the tables.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Tested clp and clp-s (with 1 dataset). Need to wait for #1004 to test multiple datasets, but it should support them.

Summary by CodeRabbit

  • New Features

    • Added loading indicators to dashboard, statistics, details, and space savings cards for improved user feedback during data fetching.
  • Refactor

    • Switched several dashboard and ingest page components to use React Query for data fetching and caching, replacing manual polling and state management.
    • Centralized query client configuration for consistent caching behaviour.
    • Enhanced support for multi-dataset statistics and space savings calculations.
    • Removed manual polling and refresh intervals in dataset and job components for streamlined data updates.
    • Simplified dataset component by removing stale time configuration in data fetching.
    • Introduced enum for SQL table suffixes and added configuration for SQL database table prefix to improve database table management.
  • Chores

    • Modularized and extended backend querying logic to support both single and multi-dataset scenarios.
    • Updated import paths for type definitions to maintain project structure consistency.
    • Updated client settings to include SQL database table prefix and simplified table name logic in startup scripts.

4 participants