refactor(authz): unify plugin HTTP authz under NemoService#332
refactor(authz): unify plugin HTTP authz under NemoService#332maxdubrinsky wants to merge 9 commits into
Conversation
|
3a373a8 to
af6229c
Compare
Derive plugin HTTP authorization from @path_rule/@api_route route
metadata plus get_authz_definition(); remove the nemo.authz /
get_authz_contribution() split so NemoService is the sole source of
plugin HTTP auth policy. Adds explicit permission declarations
(id + description), per-service permission_namespace, and bundle-time
validation that fails closed on undeclared/mis-namespaced permissions,
malformed ids/scopes, and any plugin route lacking a final path rule.
Includes symmetric caller-kind enforcement, fence / degraded-coverage
containment, the permission-stamped no-{workspace} GET tightening
(F1-5), and a wasm-safe namespace fence (native builtins). Annotates
all 7 NemoService plugins.
Implements AIRCORE-743 (umbrella AIRCORE-751).
Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
The embedded-PDP `allow` evaluation re-ran the O(endpoints) endpoint match (`normalize_endpoint`) and workspace extraction (`extract_workspace_from_path`) at every call site. OPA's WASM target caches 0-arg rule results for a query but NOT function calls, so each of the ~7-10 call sites re-scanned all ~200 endpoint patterns. The plugin-authz rules added more always-evaluated call sites (endpoint_denied, endpoint_callers x2, the permission-stamped no-workspace GET rule), pushing the PlatformAdmin-bypass eval from 71M to 112M fuel — over the 100M default embedded_pdp_cpu_limit, so every request 502'd once role bindings loaded (AIRCORE-797). Bind each path/method-keyed scan to a 0-arg request-scoped rule computed once per evaluation (endpoint_scan, workspace_scan, req_permissions, req_callers, req_deny, req_required_scopes) and route the allow_request/deny_request/scope_check_passed bodies through them. The scan rules are always-defined (sentinel "" when nothing matches) so unknown / no-workspace paths hit the cache too — an undefined rule result is re-evaluated on each reference. The original helper functions are kept intact: the policy tests call them directly with explicit paths/methods. Result (200 endpoints, opa 1.8.0, measured via wasmtime fuel): every case drops to 10-19M (admin bypass 112M -> 18M; permission GET 68M -> 11M), well under budget and below main. Per-endpoint fuel slope falls ~6x (0.44M -> 0.07M), so the policy stays under 50M even at 600 endpoints. Verified equivalent: 238/238 opa tests pass and allow/deny decisions are unchanged across 14 representative cases. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Real-network verification harness under e2e/authz_oidc: a mini OIDC issuer (RS256 + JWKS over HTTP) mints signed JWTs against a running platform, exercising a ~45-case allow/deny matrix over three fixture plugins to prove the new model denies what it should. Not wired into CI (no integration marker). Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
af6229c to
c62b88f
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (6)
💤 Files with no reviewable changes (1)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughDerives plugin HTTP authorization from route metadata, updates bundle and policy enforcement for caller kinds and fail modes, migrates plugin services to route-level authz wiring, and adds OIDC E2E coverage with report generation. ChangesRoute-derived plugin authorization
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
services/core/auth/scripts/auth-tools.py (1)
1353-1374:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReport changed endpoint methods, not only new paths.
Line 1353 compares only path keys. Adding or changing
POST /existing/pathis written tostatic-authz.yamlbut reported as “No new endpoint paths,” hiding an authz delta.Suggested fix
auth_config = load_yaml(auth_path) - before_endpoints = set(auth_config.get("authz", {}).get("endpoints", {}).keys()) + def _endpoint_method_specs(config: Dict) -> Dict[tuple[str, str], Dict]: + return { + (path, method.lower()): spec + for path, methods in config.get("authz", {}).get("endpoints", {}).items() + if isinstance(methods, dict) + for method, spec in methods.items() + if method.lower() in HTTP_METHODS + } + + before_specs = _endpoint_method_specs(auth_config) @@ merged = merge_plugin_authz_contributions(auth_config) - after_endpoints = set(merged.get("authz", {}).get("endpoints", {}).keys()) - added_paths = sorted(after_endpoints - before_endpoints) + after_specs = _endpoint_method_specs(merged) + changed_methods = sorted(key for key, spec in after_specs.items() if before_specs.get(key) != spec) @@ console.print("[bold]Merging plugin authz contributions...[/bold]") - for path in added_paths: - methods = sorted(merged["authz"]["endpoints"][path].keys()) - console.print(f" [green]+[/green] {path} ({', '.join(methods)})") + for path, method in changed_methods: + marker = "+" if (path, method) not in before_specs else "~" + console.print(f" [green]{marker}[/green] {method.upper()} {path}") - if not added_paths: - console.print("[dim]No new endpoint paths (contributions may already be present).[/dim]") + if not changed_methods: + console.print("[dim]No endpoint method changes (contributions may already be present).[/dim]")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/core/auth/scripts/auth-tools.py` around lines 1353 - 1374, The code only detects new endpoint paths by comparing path keys, missing method changes on existing paths. To fix this, after computing added_paths from the set difference of before_endpoints and after_endpoints, also identify modified paths by iterating through paths that exist in both before and after states and comparing their merged authz methods/configurations. Update the reporting loop to iterate through both added_paths and modified_paths, displaying changes for each. Finally, update the condition at the end that prints "No new endpoint paths" to check if both added_paths and modified_paths are empty, so that method changes on existing paths are properly reported and don't incorrectly show as "No new endpoint paths."packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py (1)
14-25:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the service example to include
@path_rule.The public example still registers an unruled route; copied as-is, the new derivation emits an explicit deny.
Proposed fix
from fastapi import APIRouter + from nemo_platform_plugin.authz import CallerKind, path_rule from nemo_platform_plugin.service import NemoService, RouterSpec @@ `@router.get`("/health") + `@path_rule`(callers=[CallerKind.PRINCIPAL], permissions=[]) async def health() -> dict[str, str]: return {"status": "ok"}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py` around lines 14 - 25, The health route handler in the MyService example is missing the `@path_rule` decorator, which will cause an explicit deny when the code is used. Add the `@path_rule` decorator to the health async function to properly define the permission rule for this route endpoint.packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py (1)
48-76:⚠️ Potential issue | 🟡 MinorRemove
TYPE_CHECKINGand use concrete type hints.Lines 48, 61–73:
NemoJobandCallableare runtime-available and should not be string-quoted or gated behindTYPE_CHECKING. Import them directly.Suggested fix
-from typing import TYPE_CHECKING, Any +from collections.abc import Callable +from typing import Any from fastapi import APIRouter from fastapi.routing import APIRoute -from nemo_platform_plugin.job import job_collection_path_for +from nemo_platform_plugin.job import NemoJob, job_collection_path_for - -if TYPE_CHECKING: - from collections.abc import Callable - - from nemo_platform_plugin.job import NemoJob - def add_job_routes( - job_cls: type["NemoJob"], + job_cls: type[NemoJob], *, service_name: str | None = None, route_options: list[JobRouteOption] | None = None, job_result_routes: list[PlatformJobResultRoute] | None = None, - generate_job_name: "Callable[..., str] | None" = None, + generate_job_name: Callable[..., str] | None = None,Apply similar changes to all occurrences at lines 198, 209, 224, 256, and 258.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py` around lines 48 - 76, Remove the TYPE_CHECKING conditional block and import Callable and NemoJob directly as runtime imports from their respective modules (collections.abc and nemo_platform_plugin.job). Then remove the string quotes from all type hints using these symbols in the add_job_routes function signature and parameters (lines 19-28 and throughout). Apply the same direct import pattern and remove string quotes at the other affected locations mentioned in the comment: lines 198, 209, 224, 256, and 258.Source: Coding guidelines
🧹 Nitpick comments (6)
e2e/authz_oidc/README.md (1)
1-82: ⚡ Quick winAdd required
Prerequisites(top) andNext Steps(end) sections.Line 1 starts directly with usage/context, but prerequisites are missing before primary content. The page also ends at Line 82 without a
Next Stepssection with cross-links.As per coding guidelines, “Always list prerequisites at the top of documentation pages before other content” and “Include 'Next Steps' section at the end with cross-links to related documentation content.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/authz_oidc/README.md` around lines 1 - 82, Add a Prerequisites section immediately after the main title in e2e/authz_oidc/README.md, before the "Black-box verification..." paragraph, listing any required tools, dependencies, or setup needed. Then add a Next Steps section at the end of the file after the Known limits section, including cross-links to related documentation content such as the main authz documentation, other e2e test harnesses, or relevant policy/JWT validation documentation.Source: Coding guidelines
plugins/example-plugin/tests/test_authz.py (1)
9-39: ⚡ Quick winAssert derived scopes and warnings.
This test checks permissions/callers, but a read/write scope regression still passes. Also require
warnings == []so authz coverage warnings cannot slip through.Proposed test hardening
from nemo_example_plugin.service import ExampleService +from nemo_platform_plugin.authz import scopes_for from nemo_platform_plugin.authz_discovery import _derive_service_contribution def test_example_authz_derivation_has_no_problems() -> None: - contrib, problems, _warnings = _derive_service_contribution(ExampleService()) + contrib, problems, warnings = _derive_service_contribution(ExampleService()) assert problems == [] + assert warnings == [] + + def assert_binding(path: str, method: str, permission: str, *, write: bool) -> None: + binding = contrib.endpoints[path][method] + assert binding.permissions == [permission] + assert set(binding.scopes) == set(scopes_for("example", write=write)) # Minimal hello endpoint (non-workspace-scoped). - assert contrib.endpoints["/apis/example/hello/{name}"]["get"].permissions == ["example.hello.read"] + assert_binding("/apis/example/hello/{name}", "get", "example.hello.read", write=False) # Items CRUD. items = "/apis/example/v2/workspaces/{workspace}/items" - assert contrib.endpoints[items]["post"].permissions == ["example.items.create"] - assert contrib.endpoints[items]["get"].permissions == ["example.items.list"] - assert contrib.endpoints[f"{items}/{{name}}"]["get"].permissions == ["example.items.read"] - assert contrib.endpoints[f"{items}/{{name}}"]["patch"].permissions == ["example.items.update"] - assert contrib.endpoints[f"{items}/{{name}}"]["delete"].permissions == ["example.items.delete"] + assert_binding(items, "post", "example.items.create", write=True) + assert_binding(items, "get", "example.items.list", write=False) + assert_binding(f"{items}/{{name}}", "get", "example.items.read", write=False) + assert_binding(f"{items}/{{name}}", "patch", "example.items.update", write=True) + assert_binding(f"{items}/{{name}}", "delete", "example.items.delete", write=True) # Middleware-config CRUD (hyphenated namespace segment). mw = "/apis/example/v2/workspaces/{workspace}/middleware-configs" - assert contrib.endpoints[mw]["post"].permissions == ["example.middleware-configs.create"] - assert contrib.endpoints[f"{mw}/{{name}}"]["delete"].permissions == ["example.middleware-configs.delete"] + assert_binding(mw, "post", "example.middleware-configs.create", write=True) + assert_binding(mw, "get", "example.middleware-configs.list", write=False) + assert_binding(f"{mw}/{{name}}", "get", "example.middleware-configs.read", write=False) + assert_binding(f"{mw}/{{name}}", "patch", "example.middleware-configs.update", write=True) + assert_binding(f"{mw}/{{name}}", "delete", "example.middleware-configs.delete", write=True)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/example-plugin/tests/test_authz.py` around lines 9 - 39, The test function test_example_authz_derivation_has_no_problems currently does not verify that the warnings returned from _derive_service_contribution are empty, which allows authz coverage warnings to slip through. Add an assertion to check that _warnings is an empty list, placing it alongside the existing assertion that problems is an empty list, to ensure no warnings are generated during the authz derivation process.plugins/nemo-evaluator/tests/test_authz.py (1)
27-35: ⚡ Quick winPin caller kinds, scopes, and the remaining factory bindings.
This only checks selected permissions. Add assertions for callers/scopes and the get/cancel/logs/results job routes so evaluator regressions are caught locally, not only by central factory tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-evaluator/tests/test_authz.py` around lines 27 - 35, The test currently only validates permissions for a subset of job-collection endpoints (post, get, delete on the jobs path). Expand the test assertions to check callers and scopes attributes in addition to permissions for the existing endpoints, and add new assertions for the remaining job routes (get/{name}, cancel, logs, and results) with their corresponding permissions, callers, and scopes. This ensures evaluator permission mappings are fully validated locally rather than relying only on central factory tests to catch regressions.plugins/nemo-customizer/tests/test_router.py (1)
191-197: ⚡ Quick winAssert derived contributor endpoint paths.
These checks prove the permissions exist, but not that
/jobsendpoints are bound to them under the hub prefix.Proposed test tightening
assert problems == [] assert not any(spec.deny for methods in contribution.endpoints.values() for spec in methods.values()) - assert "customization.automodel.jobs.create" in contribution.permissions - assert "customization.unsloth.jobs.create" in contribution.permissions + for backend in ("automodel", "unsloth"): + endpoint = f"/apis/customization/v2/workspaces/{{workspace}}/{backend}/jobs" + binding = contribution.endpoints[endpoint]["post"] + assert binding.permissions == [f"customization.{backend}.jobs.create"] + assert binding.callers == ["principal"] + assert binding.deny is False # The hub's own /healthz is authenticated-but-permissionless (ruled, not denied). hub_healthz = contribution.endpoints["/apis/customization/healthz"]["get"] assert hub_healthz.permissions == [] and not hub_healthz.deny🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-customizer/tests/test_router.py` around lines 191 - 197, The test verifies that the permissions exist in contribution.permissions, but does not verify that the actual job creation endpoints are bound to these permissions under the hub prefix. Add assertions after the existing permission checks to verify that the automodel jobs endpoint and unsloth jobs endpoint exist in contribution.endpoints under their respective hub prefix paths (like /apis/customization/automodel/jobs and /apis/customization/unsloth/jobs), and that these endpoints reference the correct permissions. This will ensure the endpoints are not only created but also properly connected to the required permissions.plugins/nemo-auditor/tests/test_authz.py (1)
29-32: ⚡ Quick winAssert target CRUD endpoint bindings.
Line 32 only checks permission declaration, not that target routes use those permissions. A wrong target route stamp could pass.
Proposed test tightening
# targets CRUD: spot-check + every referenced permission is declared. targets = "/apis/auditor/v2/workspaces/{workspace}/targets" assert contrib.endpoints[targets]["post"].permissions == ["auditor.targets.create"] - assert {"auditor.targets.read", "auditor.targets.delete"} <= set(contrib.permissions) + assert contrib.endpoints[targets]["get"].permissions == ["auditor.targets.list"] + assert contrib.endpoints[f"{targets}/{{name}}"]["get"].permissions == ["auditor.targets.read"] + assert contrib.endpoints[f"{targets}/{{name}}"]["put"].permissions == ["auditor.targets.update"] + assert contrib.endpoints[f"{targets}/{{name}}"]["delete"].permissions == ["auditor.targets.delete"]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-auditor/tests/test_authz.py` around lines 29 - 32, The test on line 32 only verifies that permissions are declared in contrib.permissions but does not verify that the target endpoints (GET for read, DELETE for delete) actually use those permissions in their bindings. Add assertions similar to line 30 (checking the POST endpoint) to verify that contrib.endpoints[targets]["get"].permissions includes "auditor.targets.read" and contrib.endpoints[targets]["delete"].permissions includes "auditor.targets.delete", ensuring that wrong endpoint permission stamps cannot pass the test.services/core/auth/src/nmp/core/auth/app/policies/authz.rego (1)
169-172: ⚡ Quick winRemove the unused
pathbinding.Line 171 computes
extract_pathbut never uses it; Regal flags this hot-path rule.Proposed fix
method := extract_method method in ["GET", "HEAD"] - path := extract_path endpoint_scan != ""🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/core/auth/src/nmp/core/auth/app/policies/authz.rego` around lines 169 - 172, The binding `path := extract_path` is computed but never referenced elsewhere in the rule, creating an unnecessary variable assignment that impacts hot-path performance. Remove the unused `path := extract_path` line while keeping the other bindings and conditions (method and endpoint_scan checks) intact.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@e2e/authz_oidc/conftest.py`:
- Around line 273-280: The platform fixture and similar fixtures referenced in
the comment use generators but do not ensure proper cleanup if an exception
occurs during the setup phase. Wrap the _provision(p) call in a try-finally
block so that the cleanup code (next(gen, None)) is guaranteed to execute even
if _provision raises an exception before reaching the yield statement. This
pattern should be applied to all fixture generator-based fixtures that have
setup code between the initial next(gen) call and the yield statement.
- Around line 91-96: The subprocess.run() call used for uv pip install is
missing a timeout parameter, which can cause it to block indefinitely and stall
the entire e2e test session. Add a timeout parameter to the subprocess.run()
invocation to specify a maximum duration (in seconds) that the pip installation
process is allowed to run before it is terminated. This will prevent the fixture
plugin installation from hanging and blocking subsequent test execution.
In `@e2e/authz_oidc/idp.py`:
- Around line 116-120: The stop() method shuts down and closes the server socket
but does not wait for the background thread that runs the server to exit, which
can cause race conditions during test teardown. After calling
self._server.shutdown() and self._server.server_close(), add a join() call on
the background thread attribute (likely self._thread or similar) to block until
the thread has fully exited before the stop() method returns. This ensures the
server thread cleanup is complete before teardown proceeds.
In `@e2e/authz_oidc/matrix.py`:
- Around line 54-66: In the Case dataclass, replace the generic type annotations
with concrete types to improve static type checking. Change the expected field
from object to set[int] | Literal["not-403"] since it holds either a set of
integers or the string constant "not-403". Change the body field from dict |
None to dict[str, object] | None to reflect that it is a dictionary with string
keys and mixed value types. This will eliminate the need for type ignores in
downstream code and strengthen type safety.
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.py`:
- Around line 253-279: The for loop that registers extra permissions (iterating
through the extra list returned from extra_permissions()) is outside the
try-except block, so if _register_permission() raises an exception for a
non-Permission value, it will abort the entire derivation and discard
already-derived route bindings. Move the _register_permission() call inside the
try-except block or wrap the loop iteration itself with exception handling to
catch exceptions from individual permission registrations. Record any exceptions
from _register_permission() in the errors list (similar to how exceptions from
extra_permissions() are handled) and continue processing remaining permissions,
so that bad extra_permissions() entries do not prevent route bindings from being
included.
In `@packages/nemo_platform_plugin/tests/test_authz.py`:
- Around line 138-140: The test at the route /v2/z uses permissions with the z.*
namespace (z.read and z.internal) but the service is named svc, which creates a
namespace mismatch. Keep this test consistent with the service namespace by
changing both Permission strings in the path_rule decorators from z.read and
z.internal to use the svc namespace instead (for example, svc.read and
svc.internal), ensuring the test validates permission-set differences without
introducing namespace validation issues.
- Around line 69-72: The fake entry point lambda in the monkeypatch is too
permissive because it returns the same service dictionary regardless of which
group parameter is passed to discover_entry_points, meaning the test would still
pass even if the code accidentally queries the wrong group. Modify the lambda to
enforce that only the "nemo.services" group is queried by checking the group
parameter and only returning the fake entry point dictionary when group equals
"nemo.services", otherwise returning an empty dictionary to make the test fail
if the discovery queries any other group.
In `@plugins/nemo-safe-synthesizer/tests/unit/test_authz.py`:
- Around line 12-14: The test function
test_safe_synthesizer_authz_derivation_has_no_problems is missing an optional
dependency guard. Add a pytest importorskip call at the beginning of this test
function to skip it when the safe-synthesizer optional dependency is not
installed. This prevents the test from failing in minimal environments and
ensures it only runs when the required optional paths are available for route
derivation.
In `@plugins/nemo-unsloth/tests/test_contributor.py`:
- Around line 61-69: The test currently collects permission IDs and only asserts
that a specific permission exists, but does not verify that every contributed
route has a path rule. Modify the test to iterate through all APIRoute objects
in specs and assert that each route has at least one path rule (by calling
get_path_rules on its endpoint) and that each path rule contains permissions.
This ensures no routes slip through without proper permission rules, rather than
just checking if one specific permission exists somewhere in the collection.
In `@services/core/auth/src/nmp/core/auth/app/bundle.py`:
- Around line 186-188: The assignment to denied_plugin_prefixes on line 188
overwrites any existing prefixes in the merged configuration, which can reopen
previously fenced namespaces. Instead of directly assigning the sorted
denied_prefixes, retrieve any existing denied_plugin_prefixes value from the
configuration, union it with the new denied_prefixes set, and then assign the
combined sorted result back to ensure all prefixes (both existing and newly
degraded) are preserved.
In `@services/core/auth/src/nmp/core/auth/app/policy_tests/caller_kind_test.rego`:
- Line 1: The package declaration at the top of the file uses the name
`authz_test`, but this does not align with the directory-package naming
convention expected by Regal linting rules. Rename the package statement from
`package authz_test` to match the expected package structure based on the file's
directory hierarchy, which should follow the pattern of converting the directory
path segments into a dotted package name (typically something like
`nmp.core.auth.app.policy_tests` based on standard Rego conventions). This will
resolve the lint error about the package/directory mismatch.
In `@services/core/auth/tests/test_embedded_pdp.py`:
- Around line 823-830: The test case at line 826 expects allowed=True for the
path "/apis/brokenplugin-extra/x", but this endpoint is not registered in the
authz data (static-authz.yaml). According to fail-closed behavior, unknown
routes must be denied. Change the test expectation from True to False for the
"/apis/brokenplugin-extra/x" case to align with the deny-by-default behavior for
unregistered endpoints, or alternatively register this endpoint in the authz
data if it should be permitted.
---
Outside diff comments:
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py`:
- Around line 48-76: Remove the TYPE_CHECKING conditional block and import
Callable and NemoJob directly as runtime imports from their respective modules
(collections.abc and nemo_platform_plugin.job). Then remove the string quotes
from all type hints using these symbols in the add_job_routes function signature
and parameters (lines 19-28 and throughout). Apply the same direct import
pattern and remove string quotes at the other affected locations mentioned in
the comment: lines 198, 209, 224, 256, and 258.
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/service.py`:
- Around line 14-25: The health route handler in the MyService example is
missing the `@path_rule` decorator, which will cause an explicit deny when the
code is used. Add the `@path_rule` decorator to the health async function to
properly define the permission rule for this route endpoint.
In `@services/core/auth/scripts/auth-tools.py`:
- Around line 1353-1374: The code only detects new endpoint paths by comparing
path keys, missing method changes on existing paths. To fix this, after
computing added_paths from the set difference of before_endpoints and
after_endpoints, also identify modified paths by iterating through paths that
exist in both before and after states and comparing their merged authz
methods/configurations. Update the reporting loop to iterate through both
added_paths and modified_paths, displaying changes for each. Finally, update the
condition at the end that prints "No new endpoint paths" to check if both
added_paths and modified_paths are empty, so that method changes on existing
paths are properly reported and don't incorrectly show as "No new endpoint
paths."
---
Nitpick comments:
In `@e2e/authz_oidc/README.md`:
- Around line 1-82: Add a Prerequisites section immediately after the main title
in e2e/authz_oidc/README.md, before the "Black-box verification..." paragraph,
listing any required tools, dependencies, or setup needed. Then add a Next Steps
section at the end of the file after the Known limits section, including
cross-links to related documentation content such as the main authz
documentation, other e2e test harnesses, or relevant policy/JWT validation
documentation.
In `@plugins/example-plugin/tests/test_authz.py`:
- Around line 9-39: The test function
test_example_authz_derivation_has_no_problems currently does not verify that the
warnings returned from _derive_service_contribution are empty, which allows
authz coverage warnings to slip through. Add an assertion to check that
_warnings is an empty list, placing it alongside the existing assertion that
problems is an empty list, to ensure no warnings are generated during the authz
derivation process.
In `@plugins/nemo-auditor/tests/test_authz.py`:
- Around line 29-32: The test on line 32 only verifies that permissions are
declared in contrib.permissions but does not verify that the target endpoints
(GET for read, DELETE for delete) actually use those permissions in their
bindings. Add assertions similar to line 30 (checking the POST endpoint) to
verify that contrib.endpoints[targets]["get"].permissions includes
"auditor.targets.read" and contrib.endpoints[targets]["delete"].permissions
includes "auditor.targets.delete", ensuring that wrong endpoint permission
stamps cannot pass the test.
In `@plugins/nemo-customizer/tests/test_router.py`:
- Around line 191-197: The test verifies that the permissions exist in
contribution.permissions, but does not verify that the actual job creation
endpoints are bound to these permissions under the hub prefix. Add assertions
after the existing permission checks to verify that the automodel jobs endpoint
and unsloth jobs endpoint exist in contribution.endpoints under their respective
hub prefix paths (like /apis/customization/automodel/jobs and
/apis/customization/unsloth/jobs), and that these endpoints reference the
correct permissions. This will ensure the endpoints are not only created but
also properly connected to the required permissions.
In `@plugins/nemo-evaluator/tests/test_authz.py`:
- Around line 27-35: The test currently only validates permissions for a subset
of job-collection endpoints (post, get, delete on the jobs path). Expand the
test assertions to check callers and scopes attributes in addition to
permissions for the existing endpoints, and add new assertions for the remaining
job routes (get/{name}, cancel, logs, and results) with their corresponding
permissions, callers, and scopes. This ensures evaluator permission mappings are
fully validated locally rather than relying only on central factory tests to
catch regressions.
In `@services/core/auth/src/nmp/core/auth/app/policies/authz.rego`:
- Around line 169-172: The binding `path := extract_path` is computed but never
referenced elsewhere in the rule, creating an unnecessary variable assignment
that impacts hot-path performance. Remove the unused `path := extract_path` line
while keeping the other bindings and conditions (method and endpoint_scan
checks) intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: dc5ab413-96ac-4aa3-a1b5-3efbdb86562f
📒 Files selected for processing (79)
docs/set-up/config-reference.mdxe2e/authz_oidc/README.mde2e/authz_oidc/conftest.pye2e/authz_oidc/fixtures/harness-broken/pyproject.tomle2e/authz_oidc/fixtures/harness-broken/src/harness_broken/service.pye2e/authz_oidc/fixtures/harness-fixture/pyproject.tomle2e/authz_oidc/fixtures/harness-fixture/src/harness_fixture/service.pye2e/authz_oidc/fixtures/harness-unruled/pyproject.tomle2e/authz_oidc/fixtures/harness-unruled/src/harness_unruled/service.pye2e/authz_oidc/harness.pye2e/authz_oidc/idp.pye2e/authz_oidc/matrix.pye2e/authz_oidc/report.pye2e/authz_oidc/test_authz_matrix.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/authz.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/authz_discovery.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/authz_format.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/customization_contributor.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/discovery.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/service.pypackages/nemo_platform_plugin/tests/test_authz.pypackages/nemo_platform_plugin/tests/test_authz_failmode.pypackages/nemo_platform_plugin/tests/test_discovery.pypackages/nemo_platform_plugin/tests/test_factory_authz.pypackages/nemo_platform_plugin/tests/test_functions_routes.pypackages/nemo_platform_plugin/tests/test_path_rule.pypackages/nmp_common/tests/auth/test_authz_format.pypackages/nmp_customization_common/src/nmp/customization_common/contributor/base.pyplugins/example-plugin/src/nemo_example_plugin/_perms.pyplugins/example-plugin/src/nemo_example_plugin/middleware_service.pyplugins/example-plugin/src/nemo_example_plugin/service.pyplugins/example-plugin/tests/test_authz.pyplugins/nemo-agents/src/nemo_agents_plugin/api/v2/_perms.pyplugins/nemo-agents/src/nemo_agents_plugin/api/v2/agents.pyplugins/nemo-agents/src/nemo_agents_plugin/api/v2/deployment_logs.pyplugins/nemo-agents/src/nemo_agents_plugin/api/v2/deployments.pyplugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.pyplugins/nemo-agents/src/nemo_agents_plugin/service.pyplugins/nemo-agents/tests/test_authz.pyplugins/nemo-agents/tests/unit/test_service.pyplugins/nemo-anonymizer/src/nemo_anonymizer_plugin/service.pyplugins/nemo-anonymizer/tests/unit/test_authz.pyplugins/nemo-auditor/src/nemo_auditor/api/v2/_perms.pyplugins/nemo-auditor/src/nemo_auditor/api/v2/configs.pyplugins/nemo-auditor/src/nemo_auditor/api/v2/targets.pyplugins/nemo-auditor/src/nemo_auditor/service.pyplugins/nemo-auditor/tests/test_authz.pyplugins/nemo-customizer/src/nemo_customizer/router.pyplugins/nemo-customizer/tests/test_router.pyplugins/nemo-data-designer/src/nemo_data_designer_plugin/service.pyplugins/nemo-data-designer/tests/unit/test_authz.pyplugins/nemo-evaluator/src/nemo_evaluator/service.pyplugins/nemo-evaluator/tests/test_authz.pyplugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/api/v2/jobs/endpoints.pyplugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.pyplugins/nemo-safe-synthesizer/tests/unit/test_authz.pyplugins/nemo-safe-synthesizer/tests/unit/test_service.pyplugins/nemo-unsloth/src/nemo_unsloth_plugin/contributor.pyplugins/nemo-unsloth/tests/test_contributor.pyservices/core/auth/scripts/auth-tools.pyservices/core/auth/src/nmp/core/auth/app/bundle.pyservices/core/auth/src/nmp/core/auth/app/policies/authz.regoservices/core/auth/src/nmp/core/auth/app/policies/common.regoservices/core/auth/src/nmp/core/auth/app/policies/extract.regoservices/core/auth/src/nmp/core/auth/app/policies/scopes.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/caller_kind_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/deny_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/generic_entities_deny_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/namespace_access_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/namespace_creation_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/platform_admin_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/scopes_test.regoservices/core/auth/src/nmp/core/auth/app/policy_tests/unknown_endpoint_test.regoservices/core/auth/src/nmp/core/auth/config.pyservices/core/auth/tests/test_bundle.pyservices/core/auth/tests/test_embedded_pdp.py
💤 Files with no reviewable changes (1)
- packages/nemo_platform_plugin/tests/test_discovery.py
- bundle: union denied_plugin_prefixes with any pre-existing static fence instead of overwriting it, so a runtime-degraded plugin cannot reopen a statically fenced namespace. - e2e harness: add a timeout to fixture plugin install, close the platform generator on setup failure, and join the OIDC issuer thread on stop; give Case.expected/body concrete types (drops two type: ignores). - plugin authz tests: assert discovery queries the nemo.services group, keep the distinct-permission-set test inside its own namespace, gate the safe-synthesizer derivation test on its optional runtime, and assert every contributed unsloth route carries a @path_rule. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Replace the per-adapter permission_namespace / api_area / permission_id / permission_description kwargs on add_function_routes, add_job_routes, and job_route_factory with a single AuthzScope value object. AuthzScope is the front door for plugin route authz: it owns the <namespace>.<action> id format (previously hand-built in two places) and mints Permission objects via .permission(*segments, description=...). scope is the coarse OAuth grouping; namespace defaults to it and deepens via .child(...) for plugins whose permission namespace nests below the scope (agents, customization contributors). Both job verbs and function invokes now derive through the same .permission() path. Collapsing the two inputs into one typed object makes the old "namespace/api_area must be set together" guard unrepresentable, so it's removed from both adapters. The function adapter keeps an optional permission_description hatch (and rejects it without authz, to avoid silently discarding it). permission_id is dropped (no callers; ids are always derived). Migrate all call sites: example, data-designer, anonymizer, evaluator, safe-synthesizer, agents, and the shared customization BaseContributor. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py (1)
423-428:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReturn the tarball, not the directory.
Directory artifacts create
tar_path, thenFileResponsestill points atoutput_path; directory downloads will fail instead of serving the archive.Fix
if output_path.is_dir(): filename = f"{output_path.name}.tar.gz" tar_path = output_path.parent / filename with tarfile.open(tar_path, "w:gz") as tar: tar.add(output_path, arcname=os.path.basename(output_path)) - return FileResponse(path=output_path, filename=filename) + return FileResponse(path=tar_path, filename=filename)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py` around lines 423 - 428, In the conditional block where output_path is a directory, the code creates a tarball at tar_path but then returns FileResponse pointing to the original output_path instead of the newly created tar_path. Fix the FileResponse instantiation to use tar_path as the path parameter instead of output_path so that the created tarball archive is served to the client rather than attempting to serve the directory.packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py (1)
48-65: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winUse concrete runtime type imports here.
Callableand same-packageNemoJobare still imported underTYPE_CHECKINGand referenced as string annotations. Import them normally and remove the quoted hints.Proposed fix
-from typing import TYPE_CHECKING, Any +from collections.abc import Callable +from typing import Any from fastapi import APIRouter from fastapi.routing import APIRoute from nemo_platform_plugin.authz import AuthzScope -from nemo_platform_plugin.job import job_collection_path_for +from nemo_platform_plugin.job import NemoJob, job_collection_path_for @@ -if TYPE_CHECKING: - from collections.abc import Callable - - from nemo_platform_plugin.job import NemoJob - - def add_job_routes( - job_cls: type["NemoJob"], + job_cls: type[NemoJob], @@ - generate_job_name: "Callable[..., str] | None" = None, + generate_job_name: Callable[..., str] | None = None, @@ -def _derive_service_name(job_cls: type["NemoJob"]) -> str: +def _derive_service_name(job_cls: type[NemoJob]) -> str: @@ -def _derive_job_type(job_cls: type["NemoJob"]) -> str: +def _derive_job_type(job_cls: type[NemoJob]) -> str: @@ -def _adapt_to_spec(job_cls: type["NemoJob"]) -> "Callable[..., Any]": +def _adapt_to_spec(job_cls: type[NemoJob]) -> Callable[..., Any]: @@ - job_cls: type["NemoJob"], + job_cls: type[NemoJob], @@ -) -> "Callable[..., Any]": +) -> Callable[..., Any]:As per coding guidelines, “Always prefer concrete type hints over string-based ones in Python code; do not import types under TYPE_CHECKING, instead import types as regular imports when possible” and “Prefer concrete type hints over string-based type hints; do not use TYPE_CHECKING imports for types that are runtime-available in the same package.”
Also applies to: 68-76, 197-208, 223-257
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py` around lines 48 - 65, Move the imports of Callable from collections.abc and NemoJob from nemo_platform_plugin.job out of the TYPE_CHECKING conditional block and into the regular imports at the top of the file. Then remove any string-based type annotations (quoted type hints) where Callable and NemoJob are used throughout the file and replace them with direct type references. This same fix needs to be applied at lines 68-76, 197-208, and 223-257 where these types are similarly referenced as string annotations.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.py`:
- Around line 423-428: In the conditional block where output_path is a
directory, the code creates a tarball at tar_path but then returns FileResponse
pointing to the original output_path instead of the newly created tar_path. Fix
the FileResponse instantiation to use tar_path as the path parameter instead of
output_path so that the created tarball archive is served to the client rather
than attempting to serve the directory.
In `@packages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.py`:
- Around line 48-65: Move the imports of Callable from collections.abc and
NemoJob from nemo_platform_plugin.job out of the TYPE_CHECKING conditional block
and into the regular imports at the top of the file. Then remove any
string-based type annotations (quoted type hints) where Callable and NemoJob are
used throughout the file and replace them with direct type references. This same
fix needs to be applied at lines 68-76, 197-208, and 223-257 where these types
are similarly referenced as string annotations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 93c2fab8-fc53-4445-ba74-9c689accba72
📒 Files selected for processing (16)
packages/nemo_platform_plugin/src/nemo_platform_plugin/authz.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/functions/routes.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/api_factory.pypackages/nemo_platform_plugin/src/nemo_platform_plugin/jobs/routes.pypackages/nemo_platform_plugin/tests/test_factory_authz.pypackages/nemo_platform_plugin/tests/test_functions_routes.pypackages/nmp_customization_common/src/nmp/customization_common/contributor/base.pyplugins/example-plugin/src/nemo_example_plugin/service.pyplugins/nemo-agents/src/nemo_agents_plugin/service.pyplugins/nemo-anonymizer/src/nemo_anonymizer_plugin/service.pyplugins/nemo-data-designer/src/nemo_data_designer_plugin/service.pyplugins/nemo-evaluator/src/nemo_evaluator/service.pyplugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/api/v2/jobs/endpoints.pyplugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.pyplugins/nemo-safe-synthesizer/tests/unit/test_service.pyplugins/nemo-unsloth/tests/test_contributor.py
🚧 Files skipped from review as they are similar to previous changes (7)
- plugins/nemo-unsloth/tests/test_contributor.py
- plugins/nemo-data-designer/src/nemo_data_designer_plugin/service.py
- plugins/example-plugin/src/nemo_example_plugin/service.py
- plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/service.py
- packages/nmp_customization_common/src/nmp/customization_common/contributor/base.py
- plugins/nemo-evaluator/src/nemo_evaluator/service.py
- plugins/nemo-safe-synthesizer/tests/unit/test_service.py
| ) | ||
| scope_sets = {None if rule.scopes is None else frozenset(rule.scopes) for rule in rules} | ||
| if len(scope_sets) > 1: | ||
| raise ValueError( |
There was a problem hiding this comment.
I wonder if we want to declare the scope for the endpoint as a separate decorator, and ensure there's only one, and then we can make this error impossible. Does that make sense?
|
|
||
| Returns ``(permissions, scopes, callers)`` for the representative rule. | ||
| """ | ||
| perm_sets = {frozenset(p.id for p in rule.permissions) for rule in rules} |
There was a problem hiding this comment.
I'm expecting us to group by caller here, but what it's actually doing is grouping by id. Is that right?
| f"distinct permission sets — use one rule with shared permissions, or a single " | ||
| f"rule listing multiple callers." | ||
| ) | ||
| scope_sets = {None if rule.scopes is None else frozenset(rule.scopes) for rule in rules} |
There was a problem hiding this comment.
same here, expecting to group by caller
ironcommit
left a comment
There was a problem hiding this comment.
Amazing work, super happy with how it came to gether. Just a few questions and clarifications.
|
|
||
| platform_admin_exempt if { | ||
| platform_admin_in_system | ||
| data.authz.config.platform_admin_exempt_from_service_only == true |
There was a problem hiding this comment.
I get that we are being defensive here, but do we really need to make a distinction between service and non-service routes at all? Shouldn't a PlatformAdmin just have access to everything always regardless if it's principal or service principal? I don't totally understand why we need this distinction at all, and I would err on the side of just keeing things simple.
| on_invalid, | ||
| "; ".join(result.problems), | ||
| ) | ||
| if on_invalid == "hard_fail": |
There was a problem hiding this comment.
this "hard_fail" the default? I'm expecting a fast fail on misconfiguration.
| # coverage can't be trusted: | ||
| # * no route enumerated at all (load/derivation failure) — the runner may still | ||
| # mount this plugin via a separate instantiation; OR | ||
| # * quarantine — _quarantine_contribution only rewrites the routes derivation SAW, |
There was a problem hiding this comment.
not sure what SAW means here. It's also not clear to me what the usecase is here for quarantine. There might be a purpose for it, it's just not clear from the code yet.
| # The offending routes are always emitted as explicit denies; this controls the blast | ||
| # radius. deny_route: deny only the bad routes. quarantine: deny the whole plugin. | ||
| # hard_fail: refuse to build the OPA bundle. | ||
| on_invalid_plugin: Literal["deny_route", "quarantine", "hard_fail"] = Field( |
There was a problem hiding this comment.
I would argue for hard_fail as the default for misconfiguration.
There was a problem hiding this comment.
Maybe consider just removing the config field all together, if everything is just fast fail on configuration error, then this simplifies. I don't see a use case for this.
| # data.authz.config.platform_admin_exempt_from_service_only. | ||
| platform_admin_exempt_from_service_only: bool = Field( | ||
| default=False, | ||
| description="Allow a human PlatformAdmin on SERVICE_PRINCIPAL-only plugin routes.", |
There was a problem hiding this comment.
I'm not sure if "human" is correct here. this is really a question of if "principal" can access a route that is defined for "service_principal" only. We don't really know if it's human or not, we only know "principal/service_pricipal". I think principal means an OIDC user, and service principal roughly maps to either an API key, or other non-oidc access. So the question is, should an OIDC user be able to access an endpoint that is restricted to service keys only, and I think the answer should be always no. Happy to consider a use case where the answer might be yes.
…hz cache _collapse_rules: rename perm_sets/scope_sets to distinct_permission_sets/distinct_scope_sets and note that callers are the only OR'd dimension. discover_plugin_authz: fetch/store the cache via _cached_plugin_authz()/_cache_plugin_authz() helpers instead of touching the global directly. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
Add AuthzScope.scopes(write=)/read_scopes()/write_scopes(), built from the OAuth scope area (not the permission namespace, so .child() keeps the area).
Each plugin declares one shared AuthzScope as SCOPE in its authz module; the route generators and hand-written routes (agents, auditor, evaluator, example) derive scopes from it, dropping the repeated scopes_for("<area>", ...) literals. Wire-preserving: derived endpoint scopes are unchanged.
Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
…tp-authz/md Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com> # Conflicts: # plugins/example-plugin/src/nemo_example_plugin/service.py # plugins/nemo-evaluator/src/nemo_evaluator/service.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)
6-6: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
from __future__ import annotationsmakes every annotation string-based.Guidelines require concrete type hints, not string-based ones. All hints here reference already-imported names, so dropping this import keeps them eager and concrete.
As per coding guidelines: "Always prefer concrete type hints over string-based ones in Python code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` at line 6, The `from __future__ import annotations` statement at the beginning of the metrics.py file converts all type annotations to strings, which violates the coding guideline requiring concrete type hints. Remove this import statement from the file since all type hints used in this module reference already-imported names, and removing it will preserve eager evaluation of type annotations as concrete types rather than strings.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Around line 109-120: The exception messages in the EntityValidationError and
ValueError except blocks are being logged directly without sanitization,
bypassing the _sanitize_for_log protection against log injection. Sanitize the
exception object `e` by passing it through the _sanitize_for_log function before
including it in the logger.warning calls at lines where you log `{e}` directly.
Apply this sanitization consistently to both the EntityValidationError handler
and the ValueError handler to ensure all user-controlled exception text is
properly sanitized before appearing in logs.
---
Nitpick comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Line 6: The `from __future__ import annotations` statement at the beginning of
the metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4d5c1ebc-9038-403e-a12f-f667ddcf9da6
📒 Files selected for processing (6)
docs/set-up/config-reference.mdxplugins/example-plugin/src/nemo_example_plugin/service.pyplugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.pyplugins/nemo-evaluator/src/nemo_evaluator/authz.pyplugins/nemo-evaluator/src/nemo_evaluator/service.pyservices/core/auth/scripts/auth-tools.py
💤 Files with no reviewable changes (1)
- services/core/auth/scripts/auth-tools.py
✅ Files skipped from review due to trivial changes (2)
- plugins/nemo-evaluator/src/nemo_evaluator/authz.py
- docs/set-up/config-reference.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/nemo-evaluator/src/nemo_evaluator/service.py
- plugins/example-plugin/src/nemo_example_plugin/service.py
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🧹 Nitpick comments (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)
6-6: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
from __future__ import annotationsmakes every annotation string-based.Guidelines require concrete type hints, not string-based ones. All hints here reference already-imported names, so dropping this import keeps them eager and concrete.
As per coding guidelines: "Always prefer concrete type hints over string-based ones in Python code".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` at line 6, The `from __future__ import annotations` statement at the beginning of the metrics.py file converts all type annotations to strings, which violates the coding guideline requiring concrete type hints. Remove this import statement from the file since all type hints used in this module reference already-imported names, and removing it will preserve eager evaluation of type annotations as concrete types rather than strings.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Around line 109-120: The exception messages in the EntityValidationError and
ValueError except blocks are being logged directly without sanitization,
bypassing the _sanitize_for_log protection against log injection. Sanitize the
exception object `e` by passing it through the _sanitize_for_log function before
including it in the logger.warning calls at lines where you log `{e}` directly.
Apply this sanitization consistently to both the EntityValidationError handler
and the ValueError handler to ensure all user-controlled exception text is
properly sanitized before appearing in logs.
---
Nitpick comments:
In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py`:
- Line 6: The `from __future__ import annotations` statement at the beginning of
the metrics.py file converts all type annotations to strings, which violates the
coding guideline requiring concrete type hints. Remove this import statement
from the file since all type hints used in this module reference
already-imported names, and removing it will preserve eager evaluation of type
annotations as concrete types rather than strings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4d5c1ebc-9038-403e-a12f-f667ddcf9da6
📒 Files selected for processing (6)
docs/set-up/config-reference.mdxplugins/example-plugin/src/nemo_example_plugin/service.pyplugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.pyplugins/nemo-evaluator/src/nemo_evaluator/authz.pyplugins/nemo-evaluator/src/nemo_evaluator/service.pyservices/core/auth/scripts/auth-tools.py
💤 Files with no reviewable changes (1)
- services/core/auth/scripts/auth-tools.py
✅ Files skipped from review due to trivial changes (2)
- plugins/nemo-evaluator/src/nemo_evaluator/authz.py
- docs/set-up/config-reference.mdx
🚧 Files skipped from review as they are similar to previous changes (2)
- plugins/nemo-evaluator/src/nemo_evaluator/service.py
- plugins/example-plugin/src/nemo_example_plugin/service.py
🛑 Comments failed to post (1)
plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py (1)
109-120: 🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win
Raw exception messages bypass
_sanitize_for_log.You sanitize
workspace/namebut log{e}unsanitized (Lines 110, 119). Exception text can carry user-controlled input, reopening the log-injection vector_sanitize_for_logwas added to close.🛡️ Proposed fix
except EntityValidationError as e: - logger.warning(f"Entity store validation error during metric creation: {e}") + logger.warning(f"Entity store validation error during metric creation: {_sanitize_for_log(e)}") raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(e)) except ValueError as e: if "already exists" in str(e).lower(): logger.warning(f"Metric already exists: {safe_workspace}/{safe_name}") raise HTTPException( status_code=status.HTTP_409_CONFLICT, detail=f"Metric with workspace '{workspace}' and name '{name}' already exists", ) - logger.warning(f"Metric creation validation error: {e}") + logger.warning(f"Metric creation validation error: {_sanitize_for_log(e)}") raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid metric data")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-evaluator/src/nemo_evaluator/api/v2/metrics.py` around lines 109 - 120, The exception messages in the EntityValidationError and ValueError except blocks are being logged directly without sanitization, bypassing the _sanitize_for_log protection against log injection. Sanitize the exception object `e` by passing it through the _sanitize_for_log function before including it in the logger.warning calls at lines where you log `{e}` directly. Apply this sanitization consistently to both the EntityValidationError handler and the ValueError handler to ensure all user-controlled exception text is properly sanitized before appearing in logs.
Summary
Plugin HTTP authorization is currently split between two mechanisms: the
nemo.authz/get_authz_contribution()surface and the service routers. This change makesNemoServicethe single source of plugin auth policy. A plugin attaches authorization rules directly to its route handlers, and the platform derives the policy from the mounted routes when it builds the OPA bundle. There is no separate permission list to keep in sync.The model is fail-closed. Any route the platform cannot confidently authorize becomes an explicit deny, so a missing or malformed rule cannot fall through to an allow.
How it works
@path_rule(...)on each route handler listing the caller kinds and permissions that route requires.auth.on_invalid_plugin(default: deny only the offending routes).Behavior changes worth checking during review
systemworkspace rather than only authentication. This is not a regression on a default deployment: the platform seeds a wildcard Viewer binding insystemfor all authenticated users, so everyone already holds it. It only changes behavior for deployments that do not have that grant.Performance
The new rules run on every request and pushed policy evaluation cost up enough to matter for the embedded engine. The per-request endpoint and workspace lookups are now computed once per evaluation instead of repeated at each call site, which brings cost back well under budget.
How to verify
opa test services/core/auth/src/nmp/core/auth/app/policies services/core/auth/src/nmp/core/auth/app/policy_tests services/core/auth/src/nmp/core/auth/assets/static-authz.yaml. Thestatic-authz.yamlargument is required; without it the inference-gateway cases fail because they read endpoint data from that file.test_authz.pyasserting it derives a valid policy with every route ruled and every permission inside its own namespace.e2e/authz_oidcruns the policy against a live platform using real signed JWTs from a test OIDC issuer (pytest e2e/authz_oidc --run-e2e).Commits
refactor: the model, bundle validation, and the plugin migrations.perf: the policy-engine evaluation-cost fix.test: the end-to-end OIDC harness.Out of scope
Authentication and machine identity are unchanged. Plugin-defined roles, plugin-defined scopes, and IAM/UI surfacing are not part of this change.
Summary by CodeRabbit
New Features
Bug Fixes & Improvements
Configuration
auth.on_invalid_pluginto control fail-mode (deny_route,quarantine,hard_fail).auth.platform_admin_exempt_from_service_onlyfor PlatformAdmin exemptions.Documentation & Testing