feat: Add registry metadata summary metrics endpoint (#5921) #5939

Open
shuvakant6623 wants to merge 6 commits into feast-dev:master from shuvakant6623:fix-metadata-summary-5921

@shuvakant6623 shuvakant6623 commented Feb 3, 2026

What this PR does / why we need it:

This PR extends the existing Metrics API to provide registry-level metadata summary statistics.
It adds a new /api/v1/metrics/summary endpoint that aggregates resource counts across all projects and exposes high-level registry insights.

This helps users quickly understand the overall state of their registry without querying multiple endpoints.

Which issue(s) this PR fixes:

Fixes #5921

Misc

  • The implementation reuses existing gRPC-based metrics logic.
  • Includes best-effort tracking of the latest update timestamp.
  • Not tested locally yet due to dependency setup constraints; happy to add tests or make adjustments based on feedback.

@shuvakant6623 shuvakant6623 requested a review from a team as a code owner February 3, 2026 16:20
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 8 potential issues.

View issues and 4 additional flags in Devin Review.

Comment on lines 426 to 431
entities = grpc_call(
    grpc_handler.ListDataSources,
    RegistryServer_pb2.ListDataSourcesRequest(
        project=project_name, allow_cache=allow_cache
    ),
)
Contributor

🔴 Wrong gRPC method called - ListDataSources instead of ListEntities

The code at lines 426-431 is labeled as "count entities" but actually calls ListDataSources instead of ListEntities. This means entities are never counted, and the entities variable actually contains data sources.

Code Analysis

The comment says #count entities but the code calls:

entities = grpc_call(
    grpc_handler.ListDataSources,  # Wrong method!
    RegistryServer_pb2.ListDataSourcesRequest(...)
)

Compare with the correct pattern in count_resources_for_project at metrics.py:69-74:

entities = grpc_call(
    grpc_handler.ListEntities,
    RegistryServer_pb2.ListEntitiesRequest(...)
)

Impact

  • totalEntities in the response will always be 0 (since entities.get("entities", []) returns an empty list for a data-sources response)
  • Entities are never actually fetched or counted
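
A minimal sketch of why the count silently comes out as 0, assuming the gRPC reply is handled as a plain dict keyed by resource type. The response contents below are hypothetical and only illustrate the lookup:

```python
# Hypothetical dict form of a ListDataSources reply (names are illustrative).
data_sources_response = {
    "dataSources": [{"name": "driver_stats"}, {"name": "orders"}]
}

# Looking up "entities" in a data-sources reply finds nothing, so the
# default [] is used and no error is raised.
entity_count = len(data_sources_response.get("entities", []))

# The same reply does contain its own key.
data_source_count = len(data_sources_response.get("dataSources", []))

print(entity_count, data_source_count)
```

Because dict.get falls back to the default instead of raising, this kind of mix-up produces a wrong number rather than a visible failure.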

Comment on lines 441 to 442
except Exception:
    features = {"features": []}
Contributor

🔴 Exception handler sets wrong variable - features instead of saved_datasets

When the ListSavedDatasets call fails, the exception handler incorrectly sets features = {"features": []} instead of saved_datasets = {"savedDatasets": []}.

Code Analysis

try: 
    saved_datasets = grpc_call(
        grpc_handler.ListSavedDatasets,
        ...
    )
except Exception:
    features = {"features": []}  # Wrong! Should be saved_datasets

Compare with the correct pattern at metrics.py:88-89:

except Exception:
    saved_datasets = {"savedDatasets": []}

Impact

  • If ListSavedDatasets fails, saved_datasets remains undefined, causing a NameError at line 469
  • The features variable is incorrectly set, which may mask a later NameError for features if ListFeatures is never called
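
One way to make this class of mistake harder to write is to route every listing call through a single helper that owns the fallback. The sketch below is only an illustration: list_or_empty and failing_call are hypothetical names, and the callables stand in for the real grpc_call invocations.

```python
from typing import Any, Callable, Dict


def list_or_empty(call: Callable[[], Dict[str, Any]], key: str) -> Dict[str, Any]:
    """Run a listing call; on any failure return an empty reply under `key`."""
    try:
        return call()
    except Exception:
        # The fallback is built from the same key the caller will read later,
        # so the variable and the key cannot drift apart.
        return {key: []}


def failing_call() -> Dict[str, Any]:
    # Hypothetical stand-in for a ListSavedDatasets call that fails.
    raise RuntimeError("registry unavailable")


saved_datasets = list_or_empty(failing_call, "savedDatasets")
features = list_or_empty(lambda: {"features": [{"name": "f1"}]}, "features")
```

With this shape each result is assigned exactly once, at the call site, whether the call succeeds or fails.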

Comment on lines 446 to 447
feature_views = grpc_call(
    grpc_handler.ListAllFeaturesViews,
Contributor

🔴 Typo in gRPC method name - ListAllFeaturesViews should be ListAllFeatureViews

The method name ListAllFeaturesViews has an extra 's' - it should be ListAllFeatureViews.

Code Analysis

feature_views = grpc_call(
    grpc_handler.ListAllFeaturesViews,  # Typo: extra 's'
    RegistryServer_pb2.ListAllFeatureViewsRequest(...)
)

The correct method name is ListAllFeatureViews as used elsewhere in the codebase (e.g., metrics.py:101, feature_views.py:78).

Impact

This will cause an AttributeError at runtime since grpc_handler.ListAllFeaturesViews does not exist.
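
The failure mode can be reproduced without a registry. The sketch below uses a hypothetical FakeHandler class as a stand-in for grpc_handler; only the correctly spelled method is defined on it.

```python
class FakeHandler:
    """Hypothetical stand-in for grpc_handler, not the real Feast class."""

    def ListAllFeatureViews(self, request):
        return {"featureViews": []}


handler = FakeHandler()

try:
    # The attribute lookup itself fails, before any request is sent.
    handler.ListAllFeaturesViews
    lookup_error = None
except AttributeError as exc:
    lookup_error = str(exc)

print(lookup_error)
```

Note that if the call sits inside a broad `except Exception:` block, the AttributeError would be swallowed and replaced by the empty fallback instead of surfacing.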

feature_services = {"feature_services": []}

# Aggregate counts
total["entities"] += len(entities.get("entities", []))
Contributor

🔴 Typo in variable name - total instead of totals

Line 467 uses total["entities"] but the dictionary is named totals (with an 's').

Code Analysis

The dictionary is initialized as totals at line 409:

totals = {
    "entities": 0,
    ...
}

But line 467 references total (without 's'):

total["entities"] += len(entities.get("entities", []))

Impact

This will cause a NameError: name 'total' is not defined at runtime.

Comment on lines 468 to 472
totals["dataSources"] += len(dataSources.get("dataSources", []))
totals["savedDatasets"] += len(savedDatasets.get("savedDatsets", []))
totals["features"] += len(features.get("features", []))
totals["featureViews"] += len(featureViews.get("featureViews", []))
totals["featureServices"] += len(featureServices.get("featureServices", []))
Contributor

🔴 Multiple undefined variables in aggregate counts section

Lines 468-472 reference camelCase names that are never defined: dataSources, savedDatasets, featureViews and featureServices. The variables actually assigned use snake_case: entities (which contains data sources), saved_datasets, feature_views and feature_services. features is spelled the same in both places but is only assigned inside an exception handler, so it may be undefined as well.

Code Analysis

The variables are defined with snake_case:

entities = grpc_call(grpc_handler.ListDataSources, ...)  # Actually data sources
saved_datasets = grpc_call(grpc_handler.ListSavedDatasets, ...)
feature_views = grpc_call(grpc_handler.ListAllFeaturesViews, ...)
feature_services = grpc_call(grpc_handler.ListFeatureServices, ...)

But the aggregation uses camelCase names, which are undefined:

totals["dataSources"] += len(dataSources.get("dataSources", []))  # dataSources undefined
totals["savedDatasets"] += len(savedDatasets.get("savedDatsets", []))  # savedDatasets undefined, also typo in key
totals["features"] += len(features.get("features", []))  # features may be undefined
totals["featureViews"] += len(featureViews.get("featureViews", []))  # featureViews undefined
totals["featureServices"] += len(featureServices.get("featureServices", []))  # featureServices undefined

Impact

This will cause NameError exceptions at runtime for each undefined variable.
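
A possible way to avoid keeping six variable names and six keys in sync by hand is to drive the aggregation from one mapping. This is a sketch under assumptions: the responses dict below is hypothetical sample data, keyed by the same camelCase field that each reply is expected to carry.

```python
from typing import Any, Dict, List

# Hypothetical per-project replies, keyed by the field each reply carries.
responses: Dict[str, Dict[str, List[Any]]] = {
    "entities": {"entities": [{}, {}]},
    "dataSources": {"dataSources": [{}]},
    "savedDatasets": {"savedDatasets": []},
    "features": {"features": [{}, {}, {}]},
    "featureViews": {"featureViews": [{}]},
    "featureServices": {"featureServices": [{}]},
}

# One loop replaces six hand-written lines, so a misspelled variable or key
# can only occur in a single place (the mapping itself).
totals = {key: 0 for key in responses}
for key, response in responses.items():
    totals[key] += len(response.get(key, []))

print(totals)
```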

# Aggregate counts
total["entities"] += len(entities.get("entities", []))
totals["dataSources"] += len(dataSources.get("dataSources", []))
totals["savedDatasets"] += len(savedDatasets.get("savedDatsets", []))
Contributor

🔴 Typo in dictionary key - savedDatsets instead of savedDatasets

Line 469 has a typo in the dictionary key: savedDatsets is missing an 'a' and should be savedDatasets.

Code Analysis

totals["savedDatasets"] += len(savedDatasets.get("savedDatsets", []))  # Typo: savedDatsets

The correct key based on the gRPC response format is savedDatasets (as seen in metrics.py:120).

Impact

Even if the variable name issue is fixed, this would always return 0 for saved datasets because the key doesn't match the actual response key.

Comment on lines 479 to 481
if ts:
    if last_updates_ts is None or ts > last_updated_ts:
        last_updated_ts
Contributor

🔴 Timestamp tracking logic is broken - inconsistent variable names and missing assignment

The timestamp tracking code has multiple issues: inconsistent variable naming (last__updates_ts vs last_updates_ts vs last_updated_ts) and a missing assignment statement.

Code Analysis

  1. Line 418 initializes last__updates_ts = None (double underscore)
  2. Line 480 references last_updates_ts (single underscore, different name)
  3. Line 480 also references last_updated_ts (yet another variation)
  4. Line 481 is just last_updated_ts - a bare expression that does nothing (should be last_updated_ts = ts)
  5. Line 491 returns last_updated_ts, which is undefined

last__updates_ts = None  # Line 418: initialized with double underscore
...
if ts: 
    if last_updates_ts is None or ts > last_updated_ts:  # Line 480: wrong variable names
        last_updated_ts  # Line 481: does nothing, should be: last_updated_ts = ts
...
"lastUpdatedTImestamp": last_updated_ts,  # Line 491: undefined variable

Impact

  • NameError at runtime when accessing undefined last_updates_ts or last_updated_ts
  • Even if variable names were consistent, the timestamp would never be updated due to the missing assignment
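
For reference, a minimal sketch of the intended "keep the latest timestamp" logic with one consistent name and the missing assignment in place. It assumes timestamps are comparable values such as integer epoch seconds; latest_timestamp is a hypothetical helper, not a function from the PR.

```python
from typing import Iterable, Optional


def latest_timestamp(timestamps: Iterable[Optional[int]]) -> Optional[int]:
    """Return the largest truthy timestamp, or None if there is none."""
    last_updated_ts: Optional[int] = None
    for ts in timestamps:
        # Mirrors the PR's `if ts:` check, which skips None and 0 alike.
        if not ts:
            continue
        if last_updated_ts is None or ts > last_updated_ts:
            last_updated_ts = ts
    return last_updated_ts


newest = latest_timestamp([None, 5, 3, 9, 0])
nothing = latest_timestamp([])
```

Initialising to None and checking for it first keeps the comparison safe on the first iteration.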

total["entities"] += len(entities.get("entities", []))
totals["dataSources"] += len(dataSources.get("dataSources", []))
totals["savedDatasets"] += len(savedDatasets.get("savedDatsets", []))
totals["features"] += len(features.get("features", []))
Contributor

🔴 Missing ListFeatures call - features are never counted

The metrics_summary function never calls ListFeatures to count features, unlike the existing count_resources_for_project function.

Code Analysis

The existing count_resources_for_project function at metrics.py:91-98 calls ListFeatures:

try:
    features = grpc_call(
        grpc_handler.ListFeatures,
        RegistryServer_pb2.ListFeaturesRequest(
            project=project_name, allow_cache=allow_cache
        ),
    )
except Exception:
    features = {"features": []}

The new metrics_summary function has no such call. The features variable is only set in the exception handler for ListSavedDatasets (which is itself a bug).

Impact

  • totalFeatures will be 0 when ListSavedDatasets fails (the misplaced handler sets features to an empty list) and will raise a NameError when ListSavedDatasets succeeds
  • Features are never actually fetched or counted

@shuvakant6623 shuvakant6623 force-pushed the fix-metadata-summary-5921 branch from 0f29942 to 39ffafe on February 3, 2026 17:33
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 potential issue.

View issue and 3 additional flags in Devin Review.

Signed-off-by: Shuvakant Patra <scientefic2612@gmail.com>
@shuvakant6623 shuvakant6623 force-pushed the fix-metadata-summary-5921 branch from 39ffafe to af10b33 on February 3, 2026 17:37
if last_updated_ts is None or ts > last_updated_ts:
    last_updated_ts = ts

return {
Member

This is very similar to the "/metrics/resource_counts" endpoint. I think we can extend the existing endpoint to include totalProjects and lastUpdatedTimestamp.

Author

@shuvakant6623 shuvakant6623 Feb 4, 2026

Thanks for the feedback! I’ll update the existing /metrics/resource_counts endpoint to include totalProjects and lastUpdatedTimestamp and remove the separate summary endpoint.
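
A rough sketch of what the extended payload could look like, assuming the existing endpoint already has per-project counts available as a dict. The with_summary_fields helper and the "total" and "perProject" keys are hypothetical placeholders; only totalProjects and lastUpdatedTimestamp come from the suggestion above.

```python
from typing import Any, Dict, Optional


def with_summary_fields(
    per_project_counts: Dict[str, Dict[str, int]],
    last_updated_ts: Optional[int],
) -> Dict[str, Any]:
    """Combine per-project counts with the two proposed summary fields."""
    total: Dict[str, int] = {}
    for counts in per_project_counts.values():
        for resource, count in counts.items():
            total[resource] = total.get(resource, 0) + count
    return {
        "total": total,  # hypothetical key
        "perProject": per_project_counts,  # hypothetical key
        "totalProjects": len(per_project_counts),
        "lastUpdatedTimestamp": last_updated_ts,
    }


per_project = {
    "project_a": {"entities": 2, "featureViews": 1},
    "project_b": {"entities": 3},
}
result = with_summary_fields(per_project, 1700000000)
```

Adding fields to an existing response keeps a single code path for counting, which is the duplication concern raised in the review.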

Signed-off-by: Shuvakant Patra <scientefic2612@gmail.com>
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View issue and 4 additional flags in Devin Review.

Signed-off-by: Shuvakant Patra <scientefic2612@gmail.com>
@shuvakant6623 shuvakant6623 changed the title from "feat: add registry metadata summary metrics endpoint (#5921)" to "feat: Add registry metadata summary metrics endpoint (#5921)" on Feb 4, 2026
Development

Successfully merging this pull request may close these issues.

Add Metadata to Feature Registry

2 participants