Skip to content

feat: set nodegroup and computezone on agent#200

Merged
ambermingxin merged 6 commits into
mainfrom
feat/add-ng-cz
May 22, 2026
Merged

feat: set nodegroup and computezone on agent#200
ambermingxin merged 6 commits into
mainfrom
feat/add-ng-cz

Conversation

@ambermingxin
Copy link
Copy Markdown
Collaborator

@ambermingxin ambermingxin commented May 21, 2026

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features
    • Enrollment adds optional --nodegroup and --compute-zone flags; values persist and are propagated to inventory upserts, health exports (OTLP resource attributes), and collected health data.
  • Bug Fixes
    • Unenroll removes nodegroup and compute_zone from local metadata.
  • Validation
    • Added 255-char length checks for nodegroup and compute_zone in outbound validation.
  • Tests
    • Added/updated tests for storing, clearing when explicitly empty, metadata propagation, export behavior, and non‑fatal metadata read errors.
  • Documentation
    • Usage docs updated to describe the new flags and metadata behavior.

Review Change Stack

Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds optional node metadata flags (--nodegroup, --compute-zone) to enrollment; builds nullable EnrollMetadata, normalizes and conditionally persists node metadata; propagates values into backend NodeUpsert requests and OTLP resource attributes; adds cleanup, validation, tests, and docs.

Changes

Node Metadata Enrollment and System Integration

Layer / File(s) Summary
Agent State Interface and Storage Layer
internal/agentstate/state.go, internal/agentstate/sqlite.go, internal/agentstate/sqlite_test.go
New metadata key constants (MetadataKeyNodeGroup, MetadataKeyComputeZone) and interface methods (GetNodeGroup/SetNodeGroup/GetComputeZone/SetComputeZone) are defined and implemented in SQLite-backed persistent state with round-trip test coverage.
Enrollment Metadata Capture and Persistence
cmd/fleetint/root.go, cmd/fleetint/enroll.go, internal/enrollment/enrollment.go, cmd/fleetint/enroll_test.go, internal/enrollment/enrollment_test.go
New --nodegroup and --compute-zone CLI flags are wired through enrollCommand into an EnrollMetadata struct (nil when flags omitted), threaded into the updated EnrollWithConfigAndMetadata entry point, normalized (trimmed), and conditionally persisted to agent state. Tests validate flag propagation, normalization, preservation, and clearing semantics.
Backend DTO and Inventory Sink Integration
internal/backendclient/types.go, internal/inventory/sink/backend.go, internal/inventory/sink/backend_test.go
NodeUpsertRequest DTO is extended with NodeGroup and ComputeZone fields. Inventory sink reads optional metadata from agent state and populates the upsert request, marshals the request for logging, and treats metadata-read errors as non-fatal. Tests cover populated, missing, and error-injection scenarios.
Health Data Collection and OTLP Export
internal/exporter/collector/collector.go, internal/exporter/exporter.go, internal/exporter/converter/otlp.go, internal/exporter/converter/otlp_test.go
HealthData collector struct is extended with NodeGroup and ComputeZone fields. Exporter reads optional metadata from storage and populates the health data. OTLP converter conditionally adds node_group and compute_zone resource attributes for non-empty values. Tests verify attribute presence when populated and absence when unset.
Unenroll Cleanup, Validation, and Test Helpers
cmd/fleetint/unenroll.go, cmd/fleetint/unenroll_test.go, internal/attestation/backend_test.go, internal/validation/outbound/validator.go, docs/usage.md, go.mod
Unenroll command removes metadata keys during cleanup. Validator enforces 255-character max-length for NodeGroup and ComputeZone in upsert requests. Attestation test stub and inventory/backfill tests are extended. Documentation describes new flags and metadata persistence behavior (preserve when omitted, overwrite when provided, clear when empty). Indirect Go module bumps included.

Sequence Diagram(s)

sequenceDiagram
  participant User as User
  participant CLI as enrollCommand
  participant EnrollFlow as EnrollWithConfigAndMetadata
  participant StateDB as sqliteState (agent state)
  participant Inventory as Backend Inventory Sink
  participant Exporter as OTLP Exporter
  
  User->>CLI: fleetint enroll --nodegroup=ng1 --compute-zone=us-west
  CLI->>CLI: Build EnrollMetadata from flags
  CLI->>EnrollFlow: EnrollWithConfigAndMetadata(metadata)
  EnrollFlow->>EnrollFlow: Normalize optional fields (trim whitespace)
  EnrollFlow->>StateDB: SetNodeGroup("ng1")
  EnrollFlow->>StateDB: SetComputeZone("us-west")
  StateDB->>StateDB: Persist to gpud_metadata table
  
  Note over Inventory: Later during inventory export
  Inventory->>StateDB: GetNodeGroup()
  Inventory->>StateDB: GetComputeZone()
  StateDB-->>Inventory: ("ng1", true, nil), ("us-west", true, nil)
  Inventory->>Inventory: Populate NodeUpsertRequest fields
  Inventory->>Inventory: Call UpsertNode with nodeGroup/computeZone
  
  Note over Exporter: During metrics export
  Exporter->>StateDB: GetNodeGroup()
  Exporter->>StateDB: GetComputeZone()
  StateDB-->>Exporter: Retrieved metadata values
  Exporter->>Exporter: Populate HealthData fields
  Exporter->>Exporter: Add OTLP resource attributes
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • mukilsh

Poem

🐇 A rabbit hops through metadata plains,
Enrolling nodes with nodegroup chains,
Compute zones stored in SQLite's keep,
Through inventory and metrics they leap,
From CLI flags to observability deep! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately summarizes the main feature: adding support for setting nodegroup and computezone metadata on the agent during enrollment.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/add-ng-cz

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/usage.md`:
- Line 143: The sentence listing stored metadata uses the human-friendly key
`compute-zone` but the persisted state uses `compute_zone`; update the sentence
to use the persisted key name so readers inspecting the metadata see the exact
key (replace `compute-zone` with `compute_zone` in the metadata list that
mentions `nodegroup`, `compute-zone`).

In `@internal/backendclient/types.go`:
- Around line 39-40: The NodeGroup and ComputeZone fields in the struct should
be changed from plain string to *string with `json:"...,omitempty"` to preserve
tri-state semantics (nil = unset, non-nil empty = explicit clear, non-nil
non-empty = value); update the struct declarations for NodeGroup and ComputeZone
to use *string and add omitempty, and modify the sink/constructors that populate
this type to assign pointers only when a value is actually present (create &str
or nil accordingly) so you don't accidentally overwrite backend metadata.

In `@internal/inventory/sink/backend.go`:
- Around line 105-109: The current code marshals and logs the full upsert
request using json.Marshal(req) and log.Logger.Infow("inventory node upsert
request", ...) which exposes sensitive machine metadata; change this to either
log a redacted version of req (create a shallow copy of req, clear or replace
sensitive fields such as machine ID, system UUID, private/private_ip, MAC
addresses, etc., then marshal that) or lower the log level to Debug (use
log.Logger.Debugw) so full payload is not emitted at info; keep the existing
warn-on-marshal-error behavior and ensure nodeUUID remains in the log for
correlation while the request body is redacted or debug-only.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e8082c7c-9b99-4e51-b014-54e4328b5f74

📥 Commits

Reviewing files that changed from the base of the PR and between 9dda598 and c6fce9f.

📒 Files selected for processing (20)
  • cmd/fleetint/enroll.go
  • cmd/fleetint/enroll_test.go
  • cmd/fleetint/root.go
  • cmd/fleetint/unenroll.go
  • cmd/fleetint/unenroll_test.go
  • docs/usage.md
  • internal/agentstate/sqlite.go
  • internal/agentstate/sqlite_test.go
  • internal/agentstate/state.go
  • internal/attestation/backend_test.go
  • internal/backendclient/types.go
  • internal/enrollment/enrollment.go
  • internal/enrollment/enrollment_test.go
  • internal/exporter/collector/collector.go
  • internal/exporter/converter/otlp.go
  • internal/exporter/converter/otlp_test.go
  • internal/exporter/exporter.go
  • internal/inventory/sink/backend.go
  • internal/inventory/sink/backend_test.go
  • internal/validation/outbound/validator.go

Comment thread docs/usage.md Outdated
Comment thread internal/backendclient/types.go
Comment thread internal/inventory/sink/backend.go Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6fce9fdd4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread internal/inventory/sink/backend.go Outdated
Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
Copy link
Copy Markdown
Contributor

@mukilsh mukilsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but can we address this one comment? Then will approve

Comment thread internal/exporter/converter/otlp.go Outdated
Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
Copy link
Copy Markdown
Contributor

@mukilsh mukilsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
Comment thread cmd/fleetint/enroll.go Outdated
Comment thread docs/usage.md Outdated
Signed-off-by: Amber Xue <ambermingxin@nvidia.com>
@ambermingxin ambermingxin merged commit 18dfbe6 into main May 22, 2026
9 checks passed
@ambermingxin ambermingxin deleted the feat/add-ng-cz branch May 22, 2026 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants