Skip to content

refactor(authz): unify plugin HTTP authz under NemoService#332

Open
maxdubrinsky wants to merge 9 commits into
mainfrom
aircore-751-plugin-http-authz/md
Open

refactor(authz): unify plugin HTTP authz under NemoService#332
maxdubrinsky wants to merge 9 commits into
mainfrom
aircore-751-plugin-http-authz/md

Conversation

@maxdubrinsky

@maxdubrinsky maxdubrinsky commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

Plugin HTTP authorization is currently split between two mechanisms: the nemo.authz / get_authz_contribution() surface and the service routers. This change makes NemoService the single source of plugin auth policy. A plugin attaches authorization rules directly to its route handlers, and the platform derives the policy from the mounted routes when it builds the OPA bundle. There is no separate permission list to keep in sync.

The model is fail-closed. Any route the platform cannot confidently authorize becomes an explicit deny, so a missing or malformed rule cannot fall through to an allow.

How it works

  • A plugin declares permissions as typed objects (each carries its own id and a human-readable description) and puts a @path_rule(...) on each route handler listing the caller kinds and permissions that route requires.
  • At bundle build the platform walks the final mounted route set and derives the permission catalog and per-endpoint policy from those rules. Standard job routes (create, list, get, cancel, logs, results) are stamped automatically by the job-route factory, so a plugin passes its namespace once instead of hand-listing every endpoint.
  • Bundle build then validates the result and fails closed if a service references an undeclared permission, declares one outside its own namespace, emits a malformed id, or leaves any route with no rule. How much gets denied is configurable via auth.on_invalid_plugin (default: deny only the offending routes).

Behavior changes worth checking during review

  • Caller kind is enforced per route. A route can require a human principal, a service principal, or both. Humans can no longer reach service-only routes and services cannot reach human-only routes.
  • Unknown or unruled routes are denied instead of allowed. A route mounted before its authz is in the bundle no longer falls through to the authenticated-user or service bypass.
  • List routes that have no workspace in their path (for example, list workspaces) now require their permission in the system workspace rather than only authentication. This is not a regression on a default deployment: the platform seeds a wildcard Viewer binding in system for all authenticated users, so everyone already holds it. It only changes behavior for deployments that do not have that grant.
  • All seven plugin services and the two customization contributors are re-annotated to the new model. No endpoints change. This is an authz-metadata refactor, not an API change.

Performance

The new rules run on every request and pushed policy evaluation cost up enough to matter for the embedded engine. The per-request endpoint and workspace lookups are now computed once per evaluation instead of repeated at each call site, which brings cost back well under budget.

How to verify

  • Policy tests (238 pass): opa test services/core/auth/src/nmp/core/auth/app/policies services/core/auth/src/nmp/core/auth/app/policy_tests services/core/auth/src/nmp/core/auth/assets/static-authz.yaml. The static-authz.yaml argument is required; without it the inference-gateway cases fail because they read endpoint data from that file.
  • Python: each plugin has a test_authz.py asserting it derives a valid policy with every route ruled and every permission inside its own namespace.
  • Optional, not in CI: e2e/authz_oidc runs the policy against a live platform using real signed JWTs from a test OIDC issuer (pytest e2e/authz_oidc --run-e2e).

Commits

  1. refactor: the model, bundle validation, and the plugin migrations.
  2. perf: the policy-engine evaluation-cost fix.
  3. test: the end-to-end OIDC harness.

Out of scope

Authentication and machine identity are unchanged. Plugin-defined roles, plugin-defined scopes, and IAM/UI surfacing are not part of this change.

Summary by CodeRabbit

  • New Features

    • Route-level authorization can now be declared directly on endpoints, including explicit caller-kind plus permission and scope requirements.
    • Added tighter authorization coverage across platform plugins and generated job/function routes.
  • Bug Fixes & Improvements

    • Stronger fail-closed handling for invalid plugin authorization, including quarantine/fencing for unenumerated routes.
    • Workspace listing now requires system-level permissions.
  • Configuration

    • Added auth.on_invalid_plugin to control fail-mode (deny_route, quarantine, hard_fail).
    • Added auth.platform_admin_exempt_from_service_only for PlatformAdmin exemptions.
  • Documentation & Testing

    • Updated configuration reference and refreshed the OIDC authorization end-to-end test harness/reports.

@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor
Suite Lines Covered Line Rate Branch Rate
Unit Tests 21334/27892 76.5% 61.5%
Integration Tests 12346/26661 46.3% 19.7%

@maxdubrinsky maxdubrinsky force-pushed the aircore-751-plugin-http-authz/md branch 4 times, most recently from 3a373a8 to af6229c Compare June 15, 2026 20:33
Derive plugin HTTP authorization from @path_rule/@api_route route
metadata plus get_authz_definition(); remove the nemo.authz /
get_authz_contribution() split so NemoService is the sole source of
plugin HTTP auth policy. Adds explicit permission declarations
(id + description), per-service permission_namespace, and bundle-time
validation that fails closed on undeclared/mis-namespaced permissions,
malformed ids/scopes, and any plugin route lacking a final path rule.
Includes symmetric caller-kind enforcement, fence / degraded-coverage
containment, the permission-stamped no-{workspace} GET tightening
(F1-5), and a wasm-safe namespace fence (native builtins). Annotates
all 7 NemoService plugins.

Implements AIRCORE-743 (umbrella AIRCORE-751).

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
The embedded-PDP `allow` evaluation re-ran the O(endpoints) endpoint match
(`normalize_endpoint`) and workspace extraction (`extract_workspace_from_path`)
at every call site. OPA's WASM target caches 0-arg rule results for a query but
NOT function calls, so each of the ~7-10 call sites re-scanned all ~200 endpoint
patterns. The plugin-authz rules added more always-evaluated call sites
(endpoint_denied, endpoint_callers x2, the permission-stamped no-workspace GET
rule), pushing the PlatformAdmin-bypass eval from 71M to 112M fuel — over the
100M default embedded_pdp_cpu_limit, so every request 502'd once role bindings
loaded (AIRCORE-797).

Bind each path/method-keyed scan to a 0-arg request-scoped rule computed once per
evaluation (endpoint_scan, workspace_scan, req_permissions, req_callers, req_deny,
req_required_scopes) and route the allow_request/deny_request/scope_check_passed
bodies through them. The scan rules are always-defined (sentinel "" when nothing
matches) so unknown / no-workspace paths hit the cache too — an undefined rule
result is re-evaluated on each reference. The original helper functions are kept
intact: the policy tests call them directly with explicit paths/methods.

Result (200 endpoints, opa 1.8.0, measured via wasmtime fuel): every case drops
to 10-19M (admin bypass 112M -> 18M; permission GET 68M -> 11M), well under
budget and below main. Per-endpoint fuel slope falls ~6x (0.44M -> 0.07M), so the
policy stays under 50M even at 600 endpoints. Verified equivalent: 238/238 opa
tests pass and allow/deny decisions are unchanged across 14 representative cases.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Real-network verification harness under e2e/authz_oidc: a mini OIDC
issuer (RS256 + JWKS over HTTP) mints signed JWTs against a running
platform, exercising a ~45-case allow/deny matrix over three fixture
plugins to prove the new model denies what it should. Not wired into
CI (no integration marker).

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
@maxdubrinsky maxdubrinsky force-pushed the aircore-751-plugin-http-authz/md branch from af6229c to c62b88f Compare June 15, 2026 20:56
@github-actions

Copy link
Copy Markdown
Contributor

@maxdubrinsky maxdubrinsky marked this pull request as ready for review June 15, 2026 21:10
@maxdubrinsky maxdubrinsky requested review from a team as code owners June 15, 2026 21:10
@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4d5c1ebc-9038-403e-a12f-f667ddcf9da6

📥 Commits

Reviewing files that changed from the base of the PR and between 6db16e3 and c28b8e8.

📒 Files selected for processing (6)
  • docs/set-up/config-reference.mdx
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • services/core/auth/scripts/auth-tools.py
💤 Files with no reviewable changes (1)
  • services/core/auth/scripts/auth-tools.py
✅ Files skipped from review due to trivial changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • docs/set-up/config-reference.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py

📝 Walkthrough

Walkthrough

Derives plugin HTTP authorization from route metadata, updates bundle and policy enforcement for caller kinds and fail modes, migrates plugin services to route-level authz wiring, and adds OIDC E2E coverage with report generation.

Changes

Route-derived plugin authorization

Layer / File(s) Summary
Route discovery core and plugin migration baseline
packages/nemo_platform_plugin/src/nemo_platform_plugin/authz.py, .../authz_discovery.py, .../authz_format.py, .../service.py, .../discovery.py, .../customization_contributor.py, packages/nemo_platform_plugin/tests/test_path_rule.py, .../test_authz.py, .../test_authz_failmode.py, packages/nmp_common/tests/auth/test_authz_format.py, plugins/*/tests/*, plugins/*/src/*/authz.py
Adds route-rule authz primitives, derived plugin discovery results, caller validation, endpoint deny/callers fields, extra_permissions(), and derivation/fail-closed tests.
Route stamping and contributor wiring
packages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.py, .../jobs/api_factory.py, .../jobs/routes.py, packages/nmp_customization_common/src/nmp/customization_common/contributor/base.py, packages/nemo_platform_plugin/tests/test_factory_authz.py, .../test_functions_routes.py
Function and job route factories accept authz inputs, generated routes attach path_rule metadata, customization contributors stamp health and job routes, and tests cover emitted permissions and invalid parameter combinations.
Plugin and customization migration
plugins/example-plugin/src/nemo_example_plugin/*, plugins/nemo-agents/src/nemo_agents_plugin/*, plugins/nemo-anonymizer/src/nemo_anonymizer_plugin/service.py, plugins/nemo-auditor/src/nemo_auditor/*, plugins/nemo-customizer/src/nemo_customizer/router.py, plugins/nemo-data-designer/src/nemo_data_designer_plugin/service.py, plugins/nemo-evaluator/src/nemo_evaluator/service.py, plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/*, plugins/nemo-unsloth/*, plugins/*/tests/*
Plugins replace get_authz_contribution() with route decorators or stamped generated routes, add typed permission vocabularies where needed, split gateway proxy scopes by method in agents, and add derivation tests for migrated services.
Bundle merge and policy enforcement
services/core/auth/src/nmp/core/auth/config.py, services/core/auth/src/nmp/core/auth/app/bundle.py, services/core/auth/src/nmp/core/auth/app/policies/*.rego, services/core/auth/src/nmp/core/auth/app/policy_tests/*.rego, services/core/auth/scripts/auth-tools.py, services/core/auth/tests/*, docs/set-up/config-reference.mdx
Adds invalid-plugin fail modes and PlatformAdmin exemption config, merges derived plugin authz into bundle generation, introduces deny markers and denied plugin fences in policy evaluation, enforces caller kinds and system-scoped no-workspace permission checks, and updates policy, bundle, WASM, and config docs/tests.
OIDC authz end-to-end harness
e2e/authz_oidc/*
Adds a local OIDC issuer, platform fixtures, fixture plugins for ruled, unruled, and broken services, a request/identity/status matrix across default and knob phases, audit report generation, and README instructions for running the harness.

Possibly related PRs

Suggested labels

feat

Suggested reviewers

  • mckornfield
  • ironcommit
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.65% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title clearly and concisely describes the main refactor: unifying plugin HTTP authorization under NemoService as the single source of policy.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch aircore-751-plugin-http-authz/md

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
services/core/auth/scripts/auth-tools.py (1)

1353-1374: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Report changed endpoint methods, not only new paths.

Line 1353 compares only path keys. Adding or changing POST /existing/path is written to static-authz.yaml but reported as “No new endpoint paths,” hiding an authz delta.

Suggested fix
     auth_config = load_yaml(auth_path)
-    before_endpoints = set(auth_config.get("authz", {}).get("endpoints", {}).keys())
+    def _endpoint_method_specs(config: Dict) -> Dict[tuple[str, str], Dict]:
+        return {
+            (path, method.lower()): spec
+            for path, methods in config.get("authz", {}).get("endpoints", {}).items()
+            if isinstance(methods, dict)
+            for method, spec in methods.items()
+            if method.lower() in HTTP_METHODS
+        }
+
+    before_specs = _endpoint_method_specs(auth_config)
@@
     merged = merge_plugin_authz_contributions(auth_config)
-    after_endpoints = set(merged.get("authz", {}).get("endpoints", {}).keys())
-    added_paths = sorted(after_endpoints - before_endpoints)
+    after_specs = _endpoint_method_specs(merged)
+    changed_methods = sorted(key for key, spec in after_specs.items() if before_specs.get(key) != spec)
@@
     console.print("[bold]Merging plugin authz contributions...[/bold]")
-    for path in added_paths:
-        methods = sorted(merged["authz"]["endpoints"][path].keys())
-        console.print(f"  [green]+[/green] {path} ({', '.join(methods)})")
+    for path, method in changed_methods:
+        marker = "+" if (path, method) not in before_specs else "~"
+        console.print(f"  [green]{marker}[/green] {method.upper()} {path}")
 
-    if not added_paths:
-        console.print("[dim]No new endpoint paths (contributions may already be present).[/dim]")
+    if not changed_methods:
+        console.print("[dim]No endpoint method changes (contributions may already be present).[/dim]")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/core/auth/scripts/auth-tools.py` around lines 1353 - 1374, The code
only detects new endpoint paths by comparing path keys, missing method changes
on existing paths. To fix this, after computing added_paths from the set
difference of before_endpoints and after_endpoints, also identify modified paths
by iterating through paths that exist in both before and after states and
comparing their merged authz methods/configurations. Update the reporting loop
to iterate through both added_paths and modified_paths, displaying changes for
each. Finally, update the condition at the end that prints "No new endpoint
paths" to check if both added_paths and modified_paths are empty, so that method
changes on existing paths are properly reported and don't incorrectly show as
"No new endpoint paths."
packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py (1)

14-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the service example to include @path_rule.

The public example still registers an unruled route; copied as-is, the new derivation emits an explicit deny.

Proposed fix
     from fastapi import APIRouter
+    from nemo_platform_plugin.authz import CallerKind, path_rule
     from nemo_platform_plugin.service import NemoService, RouterSpec
@@
             `@router.get`("/health")
+            `@path_rule`(callers=[CallerKind.PRINCIPAL], permissions=[])
             async def health() -> dict[str, str]:
                 return {"status": "ok"}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py` around
lines 14 - 25, The health route handler in the MyService example is missing the
`@path_rule` decorator, which will cause an explicit deny when the code is used.
Add the `@path_rule` decorator to the health async function to properly define the
permission rule for this route endpoint.
packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py (1)

48-76: ⚠️ Potential issue | 🟡 Minor

Remove TYPE_CHECKING and use concrete type hints.

Lines 48, 61–73: NemoJob and Callable are runtime-available and should not be string-quoted or gated behind TYPE_CHECKING. Import them directly.

Suggested fix
-from typing import TYPE_CHECKING, Any
+from collections.abc import Callable
+from typing import Any

 from fastapi import APIRouter
 from fastapi.routing import APIRoute
-from nemo_platform_plugin.job import job_collection_path_for
+from nemo_platform_plugin.job import NemoJob, job_collection_path_for
-
-if TYPE_CHECKING:
-    from collections.abc import Callable
-
-    from nemo_platform_plugin.job import NemoJob
-
 def add_job_routes(
-    job_cls: type["NemoJob"],
+    job_cls: type[NemoJob],
     *,
     service_name: str | None = None,
     route_options: list[JobRouteOption] | None = None,
     job_result_routes: list[PlatformJobResultRoute] | None = None,
-    generate_job_name: "Callable[..., str] | None" = None,
+    generate_job_name: Callable[..., str] | None = None,

Apply similar changes to all occurrences at lines 198, 209, 224, 256, and 258.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py` around
lines 48 - 76, Remove the TYPE_CHECKING conditional block and import Callable
and NemoJob directly as runtime imports from their respective modules
(collections.abc and nemo_platform_plugin.job). Then remove the string quotes
from all type hints using these symbols in the add_job_routes function signature
and parameters (lines 19-28 and throughout). Apply the same direct import
pattern and remove string quotes at the other affected locations mentioned in
the comment: lines 198, 209, 224, 256, and 258.

Source: Coding guidelines

🧹 Nitpick comments (6)
e2e/authz_oidc/README.md (1)

1-82: ⚡ Quick win

Add required Prerequisites (top) and Next Steps (end) sections.

Line 1 starts directly with usage/context, but prerequisites are missing before primary content. The page also ends at Line 82 without a Next Steps section with cross-links.

As per coding guidelines, “Always list prerequisites at the top of documentation pages before other content” and “Include 'Next Steps' section at the end with cross-links to related documentation content.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/authz_oidc/README.md` around lines 1 - 82, Add a Prerequisites section
immediately after the main title in e2e/authz_oidc/README.md, before the
"Black-box verification..." paragraph, listing any required tools, dependencies,
or setup needed. Then add a Next Steps section at the end of the file after the
Known limits section, including cross-links to related documentation content
such as the main authz documentation, other e2e test harnesses, or relevant
policy/JWT validation documentation.

Source: Coding guidelines

plugins/example-plugin/tests/test_authz.py (1)

9-39: ⚡ Quick win

Assert derived scopes and warnings.

This test checks permissions/callers, but a read/write scope regression still passes. Also require warnings == [] so authz coverage warnings cannot slip through.

Proposed test hardening
 from nemo_example_plugin.service import ExampleService
+from nemo_platform_plugin.authz import scopes_for
 from nemo_platform_plugin.authz_discovery import _derive_service_contribution
 
 
 def test_example_authz_derivation_has_no_problems() -> None:
-    contrib, problems, _warnings = _derive_service_contribution(ExampleService())
+    contrib, problems, warnings = _derive_service_contribution(ExampleService())
     assert problems == []
+    assert warnings == []
+
+    def assert_binding(path: str, method: str, permission: str, *, write: bool) -> None:
+        binding = contrib.endpoints[path][method]
+        assert binding.permissions == [permission]
+        assert set(binding.scopes) == set(scopes_for("example", write=write))
 
     # Minimal hello endpoint (non-workspace-scoped).
-    assert contrib.endpoints["/apis/example/hello/{name}"]["get"].permissions == ["example.hello.read"]
+    assert_binding("/apis/example/hello/{name}", "get", "example.hello.read", write=False)
 
     # Items CRUD.
     items = "/apis/example/v2/workspaces/{workspace}/items"
-    assert contrib.endpoints[items]["post"].permissions == ["example.items.create"]
-    assert contrib.endpoints[items]["get"].permissions == ["example.items.list"]
-    assert contrib.endpoints[f"{items}/{{name}}"]["get"].permissions == ["example.items.read"]
-    assert contrib.endpoints[f"{items}/{{name}}"]["patch"].permissions == ["example.items.update"]
-    assert contrib.endpoints[f"{items}/{{name}}"]["delete"].permissions == ["example.items.delete"]
+    assert_binding(items, "post", "example.items.create", write=True)
+    assert_binding(items, "get", "example.items.list", write=False)
+    assert_binding(f"{items}/{{name}}", "get", "example.items.read", write=False)
+    assert_binding(f"{items}/{{name}}", "patch", "example.items.update", write=True)
+    assert_binding(f"{items}/{{name}}", "delete", "example.items.delete", write=True)
 
     # Middleware-config CRUD (hyphenated namespace segment).
     mw = "/apis/example/v2/workspaces/{workspace}/middleware-configs"
-    assert contrib.endpoints[mw]["post"].permissions == ["example.middleware-configs.create"]
-    assert contrib.endpoints[f"{mw}/{{name}}"]["delete"].permissions == ["example.middleware-configs.delete"]
+    assert_binding(mw, "post", "example.middleware-configs.create", write=True)
+    assert_binding(mw, "get", "example.middleware-configs.list", write=False)
+    assert_binding(f"{mw}/{{name}}", "get", "example.middleware-configs.read", write=False)
+    assert_binding(f"{mw}/{{name}}", "patch", "example.middleware-configs.update", write=True)
+    assert_binding(f"{mw}/{{name}}", "delete", "example.middleware-configs.delete", write=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/example-plugin/tests/test_authz.py` around lines 9 - 39, The test
function test_example_authz_derivation_has_no_problems currently does not verify
that the warnings returned from _derive_service_contribution are empty, which
allows authz coverage warnings to slip through. Add an assertion to check that
_warnings is an empty list, placing it alongside the existing assertion that
problems is an empty list, to ensure no warnings are generated during the authz
derivation process.
plugins/nemo-evaluator/tests/test_authz.py (1)

27-35: ⚡ Quick win

Pin caller kinds, scopes, and the remaining factory bindings.

This only checks selected permissions. Add assertions for callers/scopes and the get/cancel/logs/results job routes so evaluator regressions are caught locally, not only by central factory tests.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-evaluator/tests/test_authz.py` around lines 27 - 35, The test
currently only validates permissions for a subset of job-collection endpoints
(post, get, delete on the jobs path). Expand the test assertions to check
callers and scopes attributes in addition to permissions for the existing
endpoints, and add new assertions for the remaining job routes (get/{name},
cancel, logs, and results) with their corresponding permissions, callers, and
scopes. This ensures evaluator permission mappings are fully validated locally
rather than relying only on central factory tests to catch regressions.
plugins/nemo-customizer/tests/test_router.py (1)

191-197: ⚡ Quick win

Assert derived contributor endpoint paths.

These checks prove the permissions exist, but not that /jobs endpoints are bound to them under the hub prefix.

Proposed test tightening
     assert problems == []
     assert not any(spec.deny for methods in contribution.endpoints.values() for spec in methods.values())
-    assert "customization.automodel.jobs.create" in contribution.permissions
-    assert "customization.unsloth.jobs.create" in contribution.permissions
+    for backend in ("automodel", "unsloth"):
+        endpoint = f"/apis/customization/v2/workspaces/{{workspace}}/{backend}/jobs"
+        binding = contribution.endpoints[endpoint]["post"]
+        assert binding.permissions == [f"customization.{backend}.jobs.create"]
+        assert binding.callers == ["principal"]
+        assert binding.deny is False
     # The hub's own /healthz is authenticated-but-permissionless (ruled, not denied).
     hub_healthz = contribution.endpoints["/apis/customization/healthz"]["get"]
     assert hub_healthz.permissions == [] and not hub_healthz.deny
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-customizer/tests/test_router.py` around lines 191 - 197, The
test verifies that the permissions exist in contribution.permissions, but does
not verify that the actual job creation endpoints are bound to these permissions
under the hub prefix. Add assertions after the existing permission checks to
verify that the automodel jobs endpoint and unsloth jobs endpoint exist in
contribution.endpoints under their respective hub prefix paths (like
/apis/customization/automodel/jobs and /apis/customization/unsloth/jobs), and
that these endpoints reference the correct permissions. This will ensure the
endpoints are not only created but also properly connected to the required
permissions.
plugins/nemo-auditor/tests/test_authz.py (1)

29-32: ⚡ Quick win

Assert target CRUD endpoint bindings.

Line 32 only checks permission declaration, not that target routes use those permissions. A wrong target route stamp could pass.

Proposed test tightening
     # targets CRUD: spot-check + every referenced permission is declared.
     targets = "/apis/auditor/v2/workspaces/{workspace}/targets"
     assert contrib.endpoints[targets]["post"].permissions == ["auditor.targets.create"]
-    assert {"auditor.targets.read", "auditor.targets.delete"} <= set(contrib.permissions)
+    assert contrib.endpoints[targets]["get"].permissions == ["auditor.targets.list"]
+    assert contrib.endpoints[f"{targets}/{{name}}"]["get"].permissions == ["auditor.targets.read"]
+    assert contrib.endpoints[f"{targets}/{{name}}"]["put"].permissions == ["auditor.targets.update"]
+    assert contrib.endpoints[f"{targets}/{{name}}"]["delete"].permissions == ["auditor.targets.delete"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-auditor/tests/test_authz.py` around lines 29 - 32, The test on
line 32 only verifies that permissions are declared in contrib.permissions but
does not verify that the target endpoints (GET for read, DELETE for delete)
actually use those permissions in their bindings. Add assertions similar to line
30 (checking the POST endpoint) to verify that
contrib.endpoints[targets]["get"].permissions includes "auditor.targets.read"
and contrib.endpoints[targets]["delete"].permissions includes
"auditor.targets.delete", ensuring that wrong endpoint permission stamps cannot
pass the test.
services/core/auth/src/nmp/core/auth/app/policies/authz.rego (1)

169-172: ⚡ Quick win

Remove the unused path binding.

Line 171 computes extract_path but never uses it; Regal flags this hot-path rule.

Proposed fix
 	method := extract_method
 	method in ["GET", "HEAD"]
-	path := extract_path
 	endpoint_scan != ""
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/core/auth/src/nmp/core/auth/app/policies/authz.rego` around lines
169 - 172, The binding `path := extract_path` is computed but never referenced
elsewhere in the rule, creating an unnecessary variable assignment that impacts
hot-path performance. Remove the unused `path := extract_path` line while
keeping the other bindings and conditions (method and endpoint_scan checks)
intact.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/authz_oidc/conftest.py`:
- Around line 273-280: The platform fixture and similar fixtures referenced in
the comment use generators but do not ensure proper cleanup if an exception
occurs during the setup phase. Wrap the _provision(p) call in a try-finally
block so that the cleanup code (next(gen, None)) is guaranteed to execute even
if _provision raises an exception before reaching the yield statement. This
pattern should be applied to all fixture generator-based fixtures that have
setup code between the initial next(gen) call and the yield statement.
- Around line 91-96: The subprocess.run() call used for uv pip install is
missing a timeout parameter, which can cause it to block indefinitely and stall
the entire e2e test session. Add a timeout parameter to the subprocess.run()
invocation to specify a maximum duration (in seconds) that the pip installation
process is allowed to run before it is terminated. This will prevent the fixture
plugin installation from hanging and blocking subsequent test execution.

In `@e2e/authz_oidc/idp.py`:
- Around line 116-120: The stop() method shuts down and closes the server socket
but does not wait for the background thread that runs the server to exit, which
can cause race conditions during test teardown. After calling
self._server.shutdown() and self._server.server_close(), add a join() call on
the background thread attribute (likely self._thread or similar) to block until
the thread has fully exited before the stop() method returns. This ensures the
server thread cleanup is complete before teardown proceeds.

In `@e2e/authz_oidc/matrix.py`:
- Around line 54-66: In the Case dataclass, replace the generic type annotations
with concrete types to improve static type checking. Change the expected field
from object to set[int] | Literal["not-403"] since it holds either a set of
integers or the string constant "not-403". Change the body field from dict |
None to dict[str, object] | None to reflect that it is a dictionary with string
keys and mixed value types. This will eliminate the need for type ignores in
downstream code and strengthen type safety.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.py`:
- Around line 253-279: The for loop that registers extra permissions (iterating
through the extra list returned from extra_permissions()) is outside the
try-except block, so if _register_permission() raises an exception for a
non-Permission value, it will abort the entire derivation and discard
already-derived route bindings. Move the _register_permission() call inside the
try-except block or wrap the loop iteration itself with exception handling to
catch exceptions from individual permission registrations. Record any exceptions
from _register_permission() in the errors list (similar to how exceptions from
extra_permissions() are handled) and continue processing remaining permissions,
so that bad extra_permissions() entries do not prevent route bindings from being
included.

In `@packages/nemo_platform_plugin/tests/test_authz.py`:
- Around line 138-140: The test at the route /v2/z uses permissions with the z.*
namespace (z.read and z.internal) but the service is named svc, which creates a
namespace mismatch. Keep this test consistent with the service namespace by
changing both Permission strings in the path_rule decorators from z.read and
z.internal to use the svc namespace instead (for example, svc.read and
svc.internal), ensuring the test validates permission-set differences without
introducing namespace validation issues.
- Around line 69-72: The fake entry point lambda in the monkeypatch is too
permissive because it returns the same service dictionary regardless of which
group parameter is passed to discover_entry_points, meaning the test would still
pass even if the code accidentally queries the wrong group. Modify the lambda to
enforce that only the "nemo.services" group is queried by checking the group
parameter and only returning the fake entry point dictionary when group equals
"nemo.services", otherwise returning an empty dictionary to make the test fail
if the discovery queries any other group.

In `@plugins/nemo-safe-synthesizer/tests/unit/test_authz.py`:
- Around line 12-14: The test function
test_safe_synthesizer_authz_derivation_has_no_problems is missing an optional
dependency guard. Add a pytest importorskip call at the beginning of this test
function to skip it when the safe-synthesizer optional dependency is not
installed. This prevents the test from failing in minimal environments and
ensures it only runs when the required optional paths are available for route
derivation.

In `@plugins/nemo-unsloth/tests/test_contributor.py`:
- Around line 61-69: The test currently collects permission IDs and only asserts
that a specific permission exists, but does not verify that every contributed
route has a path rule. Modify the test to iterate through all APIRoute objects
in specs and assert that each route has at least one path rule (by calling
get_path_rules on its endpoint) and that each path rule contains permissions.
This ensures no routes slip through without proper permission rules, rather than
just checking if one specific permission exists somewhere in the collection.

In `@services/core/auth/src/nmp/core/auth/app/bundle.py`:
- Around line 186-188: The assignment to denied_plugin_prefixes on line 188
overwrites any existing prefixes in the merged configuration, which can reopen
previously fenced namespaces. Instead of directly assigning the sorted
denied_prefixes, retrieve any existing denied_plugin_prefixes value from the
configuration, union it with the new denied_prefixes set, and then assign the
combined sorted result back to ensure all prefixes (both existing and newly
degraded) are preserved.

In `@services/core/auth/src/nmp/core/auth/app/policy_tests/caller_kind_test.rego`:
- Line 1: The package declaration at the top of the file uses the name
`authz_test`, but this does not align with the directory-package naming
convention expected by Regal linting rules. Rename the package statement from
`package authz_test` to match the expected package structure based on the file's
directory hierarchy, which should follow the pattern of converting the directory
path segments into a dotted package name (typically something like
`nmp.core.auth.app.policy_tests` based on standard Rego conventions). This will
resolve the lint error about the package/directory mismatch.

In `@services/core/auth/tests/test_embedded_pdp.py`:
- Around line 823-830: The test case at line 826 expects allowed=True for the
path "/apis/brokenplugin-extra/x", but this endpoint is not registered in the
authz data (static-authz.yaml). According to fail-closed behavior, unknown
routes must be denied. Change the test expectation from True to False for the
"/apis/brokenplugin-extra/x" case to align with the deny-by-default behavior for
unregistered endpoints, or alternatively register this endpoint in the authz
data if it should be permitted.

---

Outside diff comments:
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py`:
- Around line 48-76: Remove the TYPE_CHECKING conditional block and import
Callable and NemoJob directly as runtime imports from their respective modules
(collections.abc and nemo_platform_plugin.job). Then remove the string quotes
from all type hints using these symbols in the add_job_routes function signature
and parameters (lines 19-28 and throughout). Apply the same direct import
pattern and remove string quotes at the other affected locations mentioned in
the comment: lines 198, 209, 224, 256, and 258.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py`:
- Around line 14-25: The health route handler in the MyService example is
missing the `@path_rule` decorator, which will cause an explicit deny when the
code is used. Add the `@path_rule` decorator to the health async function to
properly define the permission rule for this route endpoint.

In `@services/core/auth/scripts/auth-tools.py`:
- Around line 1353-1374: The code only detects new endpoint paths by comparing
path keys, missing method changes on existing paths. To fix this, after
computing added_paths from the set difference of before_endpoints and
after_endpoints, also identify modified paths by iterating through paths that
exist in both before and after states and comparing their merged authz
methods/configurations. Update the reporting loop to iterate through both
added_paths and modified_paths, displaying changes for each. Finally, update the
condition at the end that prints "No new endpoint paths" to check if both
added_paths and modified_paths are empty, so that method changes on existing
paths are properly reported and don't incorrectly show as "No new endpoint
paths."

---

Nitpick comments:
In `@e2e/authz_oidc/README.md`:
- Around line 1-82: Add a Prerequisites section immediately after the main title
in e2e/authz_oidc/README.md, before the "Black-box verification..." paragraph,
listing any required tools, dependencies, or setup needed. Then add a Next Steps
section at the end of the file after the Known limits section, including
cross-links to related documentation content such as the main authz
documentation, other e2e test harnesses, or relevant policy/JWT validation
documentation.

In `@plugins/example-plugin/tests/test_authz.py`:
- Around line 9-39: The test function
test_example_authz_derivation_has_no_problems currently does not verify that the
warnings returned from _derive_service_contribution are empty, which allows
authz coverage warnings to slip through. Add an assertion to check that
_warnings is an empty list, placing it alongside the existing assertion that
problems is an empty list, to ensure no warnings are generated during the authz
derivation process.

In `@plugins/nemo-auditor/tests/test_authz.py`:
- Around line 29-32: The test on line 32 only verifies that permissions are
declared in contrib.permissions but does not verify that the target endpoints
(GET for read, DELETE for delete) actually use those permissions in their
bindings. Add assertions similar to line 30 (checking the POST endpoint) to
verify that contrib.endpoints[targets]["get"].permissions includes
"auditor.targets.read" and contrib.endpoints[targets]["delete"].permissions
includes "auditor.targets.delete", ensuring that wrong endpoint permission
stamps cannot pass the test.

In `@plugins/nemo-customizer/tests/test_router.py`:
- Around line 191-197: The test verifies that the permissions exist in
contribution.permissions, but does not verify that the actual job creation
endpoints are bound to these permissions under the hub prefix. Add assertions
after the existing permission checks to verify that the automodel jobs endpoint
and unsloth jobs endpoint exist in contribution.endpoints under their respective
hub prefix paths (like /apis/customization/automodel/jobs and
/apis/customization/unsloth/jobs), and that these endpoints reference the
correct permissions. This will ensure the endpoints are not only created but
also properly connected to the required permissions.

In `@plugins/nemo-evaluator/tests/test_authz.py`:
- Around line 27-35: The test currently only validates permissions for a subset
of job-collection endpoints (post, get, delete on the jobs path). Expand the
test assertions to check callers and scopes attributes in addition to
permissions for the existing endpoints, and add new assertions for the remaining
job routes (get/{name}, cancel, logs, and results) with their corresponding
permissions, callers, and scopes. This ensures evaluator permission mappings are
fully validated locally rather than relying only on central factory tests to
catch regressions.

In `@services/core/auth/src/nmp/core/auth/app/policies/authz.rego`:
- Around line 169-172: The binding `path := extract_path` is computed but never
referenced elsewhere in the rule, creating an unnecessary variable assignment
that impacts hot-path performance. Remove the unused `path := extract_path` line
while keeping the other bindings and conditions (method and endpoint_scan
checks) intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dc5ab413-96ac-4aa3-a1b5-3efbdb86562f

📥 Commits

Reviewing files that changed from the base of the PR and between a3c9e2f and c62b88f.

📒 Files selected for processing (79)
  • docs/set-up/config-reference.mdx
  • e2e/authz_oidc/README.md
  • e2e/authz_oidc/conftest.py
  • e2e/authz_oidc/fixtures/harness-broken/pyproject.toml
  • e2e/authz_oidc/fixtures/harness-broken/src/harness_broken/service.py
  • e2e/authz_oidc/fixtures/harness-fixture/pyproject.toml
  • e2e/authz_oidc/fixtures/harness-fixture/src/harness_fixture/service.py
  • e2e/authz_oidc/fixtures/harness-unruled/pyproject.toml
  • e2e/authz_oidc/fixtures/harness-unruled/src/harness_unruled/service.py
  • e2e/authz_oidc/harness.py
  • e2e/authz_oidc/idp.py
  • e2e/authz_oidc/matrix.py
  • e2e/authz_oidc/report.py
  • e2e/authz_oidc/test_authz_matrix.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/authz.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_format.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/customization_contributor.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/discovery.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py
  • packages/nemo_platform_plugin/tests/test_authz.py
  • packages/nemo_platform_plugin/tests/test_authz_failmode.py
  • packages/nemo_platform_plugin/tests/test_discovery.py
  • packages/nemo_platform_plugin/tests/test_factory_authz.py
  • packages/nemo_platform_plugin/tests/test_functions_routes.py
  • packages/nemo_platform_plugin/tests/test_path_rule.py
  • packages/nmp_common/tests/auth/test_authz_format.py
  • packages/nmp_customization_common/src/nmp/customization_common/contributor/base.py
  • plugins/example-plugin/src/nemo_example_plugin/_perms.py
  • plugins/example-plugin/src/nemo_example_plugin/middleware_service.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/example-plugin/tests/test_authz.py
  • plugins/nemo-agents/src/nemo_agents_plugin/api/v2/_perms.py
  • plugins/nemo-agents/src/nemo_agents_plugin/api/v2/agents.py
  • plugins/nemo-agents/src/nemo_agents_plugin/api/v2/deployment_logs.py
  • plugins/nemo-agents/src/nemo_agents_plugin/api/v2/deployments.py
  • plugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.py
  • plugins/nemo-agents/src/nemo_agents_plugin/service.py
  • plugins/nemo-agents/tests/test_authz.py
  • plugins/nemo-agents/tests/unit/test_service.py
  • plugins/nemo-anonymizer/src/nemo_anonymizer_plugin/service.py
  • plugins/nemo-anonymizer/tests/unit/test_authz.py
  • plugins/nemo-auditor/src/nemo_auditor/api/v2/_perms.py
  • plugins/nemo-auditor/src/nemo_auditor/api/v2/configs.py
  • plugins/nemo-auditor/src/nemo_auditor/api/v2/targets.py
  • plugins/nemo-auditor/src/nemo_auditor/service.py
  • plugins/nemo-auditor/tests/test_authz.py
  • plugins/nemo-customizer/src/nemo_customizer/router.py
  • plugins/nemo-customizer/tests/test_router.py
  • plugins/nemo-data-designer/src/nemo_data_designer_plugin/service.py
  • plugins/nemo-data-designer/tests/unit/test_authz.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/nemo-evaluator/tests/test_authz.py
  • plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/api/v2/jobs/endpoints.py
  • plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.py
  • plugins/nemo-safe-synthesizer/tests/unit/test_authz.py
  • plugins/nemo-safe-synthesizer/tests/unit/test_service.py
  • plugins/nemo-unsloth/src/nemo_unsloth_plugin/contributor.py
  • plugins/nemo-unsloth/tests/test_contributor.py
  • services/core/auth/scripts/auth-tools.py
  • services/core/auth/src/nmp/core/auth/app/bundle.py
  • services/core/auth/src/nmp/core/auth/app/policies/authz.rego
  • services/core/auth/src/nmp/core/auth/app/policies/common.rego
  • services/core/auth/src/nmp/core/auth/app/policies/extract.rego
  • services/core/auth/src/nmp/core/auth/app/policies/scopes.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/caller_kind_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/deny_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/generic_entities_deny_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/namespace_access_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/namespace_creation_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/platform_admin_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/scopes_test.rego
  • services/core/auth/src/nmp/core/auth/app/policy_tests/unknown_endpoint_test.rego
  • services/core/auth/src/nmp/core/auth/config.py
  • services/core/auth/tests/test_bundle.py
  • services/core/auth/tests/test_embedded_pdp.py
💤 Files with no reviewable changes (1)
  • packages/nemo_platform_plugin/tests/test_discovery.py

Comment thread e2e/authz_oidc/conftest.py
Comment thread e2e/authz_oidc/conftest.py Outdated
Comment thread e2e/authz_oidc/idp.py
Comment thread e2e/authz_oidc/matrix.py
Comment thread plugins/nemo-safe-synthesizer/tests/unit/test_authz.py
Comment thread plugins/nemo-unsloth/tests/test_contributor.py Outdated
Comment thread services/core/auth/src/nmp/core/auth/app/bundle.py Outdated
Comment thread services/core/auth/tests/test_embedded_pdp.py
- bundle: union denied_plugin_prefixes with any pre-existing static fence
  instead of overwriting it, so a runtime-degraded plugin cannot reopen a
  statically fenced namespace.
- e2e harness: add a timeout to fixture plugin install, close the platform
  generator on setup failure, and join the OIDC issuer thread on stop; give
  Case.expected/body concrete types (drops two type: ignores).
- plugin authz tests: assert discovery queries the nemo.services group, keep
  the distinct-permission-set test inside its own namespace, gate the
  safe-synthesizer derivation test on its optional runtime, and assert every
  contributed unsloth route carries a @path_rule.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Comment thread packages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.py Outdated
Replace the per-adapter permission_namespace / api_area / permission_id /
permission_description kwargs on add_function_routes, add_job_routes, and
job_route_factory with a single AuthzScope value object.

AuthzScope is the front door for plugin route authz: it owns the
<namespace>.<action> id format (previously hand-built in two places) and
mints Permission objects via .permission(*segments, description=...). scope
is the coarse OAuth grouping; namespace defaults to it and deepens via
.child(...) for plugins whose permission namespace nests below the scope
(agents, customization contributors). Both job verbs and function invokes
now derive through the same .permission() path.

Collapsing the two inputs into one typed object makes the old
"namespace/api_area must be set together" guard unrepresentable, so it's
removed from both adapters. The function adapter keeps an optional
permission_description hatch (and rejects it without authz, to avoid
silently discarding it). permission_id is dropped (no callers; ids are
always derived).

Migrate all call sites: example, data-designer, anonymizer, evaluator,
safe-synthesizer, agents, and the shared customization BaseContributor.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py (1)

423-428: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return the tarball, not the directory.

Directory artifacts create tar_path, then FileResponse still points at output_path; directory downloads will fail instead of serving the archive.

Fix
         if output_path.is_dir():
             filename = f"{output_path.name}.tar.gz"
             tar_path = output_path.parent / filename
             with tarfile.open(tar_path, "w:gz") as tar:
                 tar.add(output_path, arcname=os.path.basename(output_path))
-            return FileResponse(path=output_path, filename=filename)
+            return FileResponse(path=tar_path, filename=filename)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py`
around lines 423 - 428, In the conditional block where output_path is a
directory, the code creates a tarball at tar_path but then returns FileResponse
pointing to the original output_path instead of the newly created tar_path. Fix
the FileResponse instantiation to use tar_path as the path parameter instead of
output_path so that the created tarball archive is served to the client rather
than attempting to serve the directory.
packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py (1)

48-65: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use concrete runtime type imports here.

Callable and same-package NemoJob are still imported under TYPE_CHECKING and referenced as string annotations. Import them normally and remove the quoted hints.

Proposed fix
-from typing import TYPE_CHECKING, Any
+from collections.abc import Callable
+from typing import Any
 
 from fastapi import APIRouter
 from fastapi.routing import APIRoute
 from nemo_platform_plugin.authz import AuthzScope
-from nemo_platform_plugin.job import job_collection_path_for
+from nemo_platform_plugin.job import NemoJob, job_collection_path_for
@@
-if TYPE_CHECKING:
-    from collections.abc import Callable
-
-    from nemo_platform_plugin.job import NemoJob
-
-
 def add_job_routes(
-    job_cls: type["NemoJob"],
+    job_cls: type[NemoJob],
@@
-    generate_job_name: "Callable[..., str] | None" = None,
+    generate_job_name: Callable[..., str] | None = None,
@@
-def _derive_service_name(job_cls: type["NemoJob"]) -> str:
+def _derive_service_name(job_cls: type[NemoJob]) -> str:
@@
-def _derive_job_type(job_cls: type["NemoJob"]) -> str:
+def _derive_job_type(job_cls: type[NemoJob]) -> str:
@@
-def _adapt_to_spec(job_cls: type["NemoJob"]) -> "Callable[..., Any]":
+def _adapt_to_spec(job_cls: type[NemoJob]) -> Callable[..., Any]:
@@
-    job_cls: type["NemoJob"],
+    job_cls: type[NemoJob],
@@
-) -> "Callable[..., Any]":
+) -> Callable[..., Any]:

As per coding guidelines, “Always prefer concrete type hints over string-based ones in Python code; do not import types under TYPE_CHECKING, instead import types as regular imports when possible” and “Prefer concrete type hints over string-based type hints; do not use TYPE_CHECKING imports for types that are runtime-available in the same package.”

Also applies to: 68-76, 197-208, 223-257

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py` around
lines 48 - 65, Move the imports of Callable from collections.abc and NemoJob
from nemo_platform_plugin.job out of the TYPE_CHECKING conditional block and
into the regular imports at the top of the file. Then remove any string-based
type annotations (quoted type hints) where Callable and NemoJob are used
throughout the file and replace them with direct type references. This same fix
needs to be applied at lines 68-76, 197-208, and 223-257 where these types are
similarly referenced as string annotations.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py`:
- Around line 423-428: In the conditional block where output_path is a
directory, the code creates a tarball at tar_path but then returns FileResponse
pointing to the original output_path instead of the newly created tar_path. Fix
the FileResponse instantiation to use tar_path as the path parameter instead of
output_path so that the created tarball archive is served to the client rather
than attempting to serve the directory.

In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py`:
- Around line 48-65: Move the imports of Callable from collections.abc and
NemoJob from nemo_platform_plugin.job out of the TYPE_CHECKING conditional block
and into the regular imports at the top of the file. Then remove any
string-based type annotations (quoted type hints) where Callable and NemoJob are
used throughout the file and replace them with direct type references. This same
fix needs to be applied at lines 68-76, 197-208, and 223-257 where these types
are similarly referenced as string annotations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 93c2fab8-fc53-4445-ba74-9c689accba72

📥 Commits

Reviewing files that changed from the base of the PR and between bb478a1 and 30189b6.

📒 Files selected for processing (16)
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/authz.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py
  • packages/nemo_platform_plugin/tests/test_factory_authz.py
  • packages/nemo_platform_plugin/tests/test_functions_routes.py
  • packages/nmp_customization_common/src/nmp/customization_common/contributor/base.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/nemo-agents/src/nemo_agents_plugin/service.py
  • plugins/nemo-anonymizer/src/nemo_anonymizer_plugin/service.py
  • plugins/nemo-data-designer/src/nemo_data_designer_plugin/service.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/api/v2/jobs/endpoints.py
  • plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.py
  • plugins/nemo-safe-synthesizer/tests/unit/test_service.py
  • plugins/nemo-unsloth/tests/test_contributor.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • plugins/nemo-unsloth/tests/test_contributor.py
  • plugins/nemo-data-designer/src/nemo_data_designer_plugin/service.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.py
  • packages/nmp_customization_common/src/nmp/customization_common/contributor/base.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/nemo-safe-synthesizer/tests/unit/test_service.py

)
scope_sets = {None if rule.scopes is None else frozenset(rule.scopes) for rule in rules}
if len(scope_sets) > 1:
raise ValueError(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want to declare the scope for the endpoint as a separate decorator, and ensure there's only one, and then we can make this error impossible. Does that make sense?


Returns ``(permissions, scopes, callers)`` for the representative rule.
"""
perm_sets = {frozenset(p.id for p in rule.permissions) for rule in rules}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm expecting us to group by caller here, but what it's actually doing is grouping by id. Is that right?

f"distinct permission sets — use one rule with shared permissions, or a single "
f"rule listing multiple callers."
)
scope_sets = {None if rule.scopes is None else frozenset(rule.scopes) for rule in rules}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, expecting to group by caller

Comment thread packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.py Outdated

@ironcommit ironcommit left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, super happy with how it came to gether. Just a few questions and clarifications.

Comment thread packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.py Outdated

platform_admin_exempt if {
platform_admin_in_system
data.authz.config.platform_admin_exempt_from_service_only == true

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that we are being defensive here, but do we really need to make a distinction between service and non-service routes at all? Shouldn't a PlatformAdmin just have access to everything always regardless if it's principal or service principal? I don't totally understand why we need this distinction at all, and I would err on the side of just keeing things simple.

on_invalid,
"; ".join(result.problems),
)
if on_invalid == "hard_fail":

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this "hard_fail" the default? I'm expecting a fast fail on misconfiguration.

# coverage can't be trusted:
# * no route enumerated at all (load/derivation failure) — the runner may still
# mount this plugin via a separate instantiation; OR
# * quarantine — _quarantine_contribution only rewrites the routes derivation SAW,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what SAW means here. It's also not clear to me what the usecase is here for quarantine. There might be a purpose for it, it's just not clear from the code yet.

# The offending routes are always emitted as explicit denies; this controls the blast
# radius. deny_route: deny only the bad routes. quarantine: deny the whole plugin.
# hard_fail: refuse to build the OPA bundle.
on_invalid_plugin: Literal["deny_route", "quarantine", "hard_fail"] = Field(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue for hard_fail as the default for misconfiguration.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe consider just removing the config field all together, if everything is just fast fail on configuration error, then this simplifies. I don't see a use case for this.

# data.authz.config.platform_admin_exempt_from_service_only.
platform_admin_exempt_from_service_only: bool = Field(
default=False,
description="Allow a human PlatformAdmin on SERVICE_PRINCIPAL-only plugin routes.",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if "human" is correct here. this is really a question of if "principal" can access a route that is defined for "service_principal" only. We don't really know if it's human or not, we only know "principal/service_pricipal". I think principal means an OIDC user, and service principal roughly maps to either an API key, or other non-oidc access. So the question is, should an OIDC user be able to access an endpoint that is restricted to service keys only, and I think the answer should be always no. Happy to consider a use case where the answer might be yes.

…hz cache

_collapse_rules: rename perm_sets/scope_sets to distinct_permission_sets/distinct_scope_sets and note that callers are the only OR'd dimension.

discover_plugin_authz: fetch/store the cache via _cached_plugin_authz()/_cache_plugin_authz() helpers instead of touching the global directly.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Add AuthzScope.scopes(write=)/read_scopes()/write_scopes(), built from the OAuth scope area (not the permission namespace, so .child() keeps the area).

Each plugin declares one shared AuthzScope as SCOPE in its authz module; the route generators and hand-written routes (agents, auditor, evaluator, example) derive scopes from it, dropping the repeated scopes_for("<area>", ...) literals. Wire-preserving: derived endpoint scopes are unchanged.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
…tp-authz/md

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>

# Conflicts:
#	plugins/example-plugin/src/nemo_example_plugin/service.py
#	plugins/nemo-evaluator/src/nemo_evaluator/service.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)

6-6: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

from __future__ import annotations makes every annotation string-based.

Guidelines require concrete type hints, not string-based ones. All hints here reference already-imported names, so dropping this import keeps them eager and concrete.

As per coding guidelines: "Always prefer concrete type hints over string-based ones in Python code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` at line 6, The
`from __future__ import annotations` statement at the beginning of the
metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Around line 109-120: The exception messages in the EntityValidationError and
ValueError except blocks are being logged directly without sanitization,
bypassing the _sanitize_for_log protection against log injection. Sanitize the
exception object `e` by passing it through the _sanitize_for_log function before
including it in the logger.warning calls at lines where you log `{e}` directly.
Apply this sanitization consistently to both the EntityValidationError handler
and the ValueError handler to ensure all user-controlled exception text is
properly sanitized before appearing in logs.

---

Nitpick comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Line 6: The `from __future__ import annotations` statement at the beginning of
the metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4d5c1ebc-9038-403e-a12f-f667ddcf9da6

📥 Commits

Reviewing files that changed from the base of the PR and between 6db16e3 and c28b8e8.

📒 Files selected for processing (6)
  • docs/set-up/config-reference.mdx
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • services/core/auth/scripts/auth-tools.py
💤 Files with no reviewable changes (1)
  • services/core/auth/scripts/auth-tools.py
✅ Files skipped from review due to trivial changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • docs/set-up/config-reference.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🧹 Nitpick comments (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)

6-6: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

from __future__ import annotations makes every annotation string-based.

Guidelines require concrete type hints, not string-based ones. All hints here reference already-imported names, so dropping this import keeps them eager and concrete.

As per coding guidelines: "Always prefer concrete type hints over string-based ones in Python code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` at line 6, The
`from __future__ import annotations` statement at the beginning of the
metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Around line 109-120: The exception messages in the EntityValidationError and
ValueError except blocks are being logged directly without sanitization,
bypassing the _sanitize_for_log protection against log injection. Sanitize the
exception object `e` by passing it through the _sanitize_for_log function before
including it in the logger.warning calls at lines where you log `{e}` directly.
Apply this sanitization consistently to both the EntityValidationError handler
and the ValueError handler to ensure all user-controlled exception text is
properly sanitized before appearing in logs.

---

Nitpick comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Line 6: The `from __future__ import annotations` statement at the beginning of
the metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4d5c1ebc-9038-403e-a12f-f667ddcf9da6

📥 Commits

Reviewing files that changed from the base of the PR and between 6db16e3 and c28b8e8.

📒 Files selected for processing (6)
  • docs/set-up/config-reference.mdx
  • plugins/example-plugin/src/nemo_example_plugin/service.py
  • plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • services/core/auth/scripts/auth-tools.py
💤 Files with no reviewable changes (1)
  • services/core/auth/scripts/auth-tools.py
✅ Files skipped from review due to trivial changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/authz.py
  • docs/set-up/config-reference.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/nemo-evaluator/src/nemo_evaluator/service.py
  • plugins/example-plugin/src/nemo_example_plugin/service.py
🛑 Comments failed to post (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)

109-120: 🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Raw exception messages bypass _sanitize_for_log.

You sanitize workspace/name but log {e} unsanitized (Lines 110, 119). Exception text can carry user-controlled input, reopening the log-injection vector _sanitize_for_log was added to close.

🛡️ Proposed fix
     except EntityValidationError as e:
-        logger.warning(f"Entity store validation error during metric creation: {e}")
+        logger.warning(f"Entity store validation error during metric creation: {_sanitize_for_log(e)}")
         raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(e))
     except ValueError as e:
         if "already exists" in str(e).lower():
             logger.warning(f"Metric already exists: {safe_workspace}/{safe_name}")
             raise HTTPException(
                 status_code=status.HTTP_409_CONFLICT,
                 detail=f"Metric with workspace '{workspace}' and name '{name}' already exists",
             )
-        logger.warning(f"Metric creation validation error: {e}")
+        logger.warning(f"Metric creation validation error: {_sanitize_for_log(e)}")
         raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid metric data")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` around lines 109
- 120, The exception messages in the EntityValidationError and ValueError except
blocks are being logged directly without sanitization, bypassing the
_sanitize_for_log protection against log injection. Sanitize the exception
object `e` by passing it through the _sanitize_for_log function before including
it in the logger.warning calls at lines where you log `{e}` directly. Apply this
sanitization consistently to both the EntityValidationError handler and the
ValueError handler to ensure all user-controlled exception text is properly
sanitized before appearing in logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants