From 0450557be248da7c29d426fe1ab265486c28c021 Mon Sep 17 00:00:00 2001
From: Luke Sy <sylukewicent@gmail.com>
Date: Mon, 23 Feb 2026 05:25:29 +1100
Subject: [PATCH 1/5] Add design documents: MANIFESTO, ROSGRAPH proposal, and
 FAQ

Three founding documents for the rosgraph project:
- MANIFESTO.md: project direction (why, what, how)
- ROSGRAPH.md: full technical proposal (schema, architecture, phasing)
- FAQ.md: audience-organized FAQ covering 9 perspectives

Signed-off-by: Luke Sy <sylukewicent@gmail.com>
---
 docs/FAQ.md       |  874 +++++++++++++++++++++++
 docs/MANIFESTO.md |   15 +
 docs/ROSGRAPH.md  | 1724 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 2613 insertions(+)
 create mode 100644 docs/FAQ.md
 create mode 100644 docs/MANIFESTO.md
 create mode 100644 docs/ROSGRAPH.md

diff --git a/docs/FAQ.md b/docs/FAQ.md
new file mode 100644
index 0000000..b01d3c6
--- /dev/null
+++ b/docs/FAQ.md
@@ -0,0 +1,874 @@
+# rosgraph — Frequently Asked Questions
+
+> **Parent:** [ROSGRAPH.md](ROSGRAPH.md) (technical proposal)
+
+Organized by who's asking. Find your perspective, jump to the
+questions that matter to you.
+
+---
+
+## Table of Contents
+
+1. [New ROS Developer](#1-new-ros-developer)
+2. [AI-Assisted Developer](#2-ai-assisted-developer)
+3. [Engineering Lead / System Integrator / DevOps](#3-engineering-lead--system-integrator--devops)
+4. [Safety-Critical Engineer](#4-safety-critical-engineer)
+5. [MoveIt / nav2 / Popular Module User](#5-moveit--nav2--popular-module-user)
+6. [The Skeptic](#6-the-skeptic)
+7. [Package Maintainer / ROS Governance](#7-package-maintainer--ros-governance)
+8. [Educator / University Researcher](#8-educator--university-researcher)
+9. [Embedded / Resource-Constrained Developer](#9-embedded--resource-constrained-developer)
+
+---
+
+## 1. New ROS Developer
+
+### What problem does rosgraph solve?
+
+ROS 2 doesn't verify that your nodes are wired correctly until
+runtime — and often not even then. Type mismatches between publishers
+and subscribers fail silently. QoS incompatibilities drop connections
+with no error. Parameter renames break launch files with no build
+error.
+
+rosgraph catches these at build time. See [PROPOSAL.md §1, "The
+Problem, Concretely"](PROPOSAL.md#the-problem-concretely) for four
+real-world examples.
+
+### How much do I need to learn?
+
+Write one `interface.yaml` per node (~15 lines for a basic pub/sub
+node). Run three commands:
+
+```bash
+rosgraph generate .   # generates code
+rosgraph lint .       # checks for issues
+rosgraph monitor      # watches the running system
+```
+
+The YAML schema has IDE autocompletion via JSON Schema. See the [Quick
+Start](PROPOSAL.md#quick-start-what-it-looks-like) for a complete
+minimal example.
+
+### What do I stop doing when I adopt rosgraph?
+
+- **Stop writing pub/sub boilerplate.** Publisher creation, subscriber
+  setup, parameter declaration — all generated from `interface.yaml`.
+- **Stop manually syncing parameters between code and launch files.**
+  `interface.yaml` is the single source of truth for parameter names,
+  types, defaults, and validation ranges.
+- **Stop debugging silent QoS mismatches.** `rosgraph lint` catches
+  incompatible QoS profiles before you launch.
+- **Stop wondering if your launch files reference the right nodes.**
+  `rosgraph lint` validates node refs, remappings, and parameter
+  overrides.
+
+### Will error messages actually be helpful?
+
+Error quality is a design requirement, not an afterthought. The
+architecture follows Ruff's model ([PROPOSAL.md
+§10.3](PROPOSAL.md#103-static-analysis-architecture)):
+
+- Every diagnostic includes a rule code (`TOP001`), the location in
+  `interface.yaml`, and what's wrong.
+- Safe fixes can be auto-applied. Unsafe fixes are flagged but not
+  auto-applied.
+- SARIF output enables inline annotations in GitHub PRs.
+- `--add-noqa` generates suppression comments for existing issues,
+  so you can adopt gradually without noise.
+
+### Do I need to learn YAML schema syntax?
+
+Not really. If your editor supports the YAML Language Server (most
+do), you get autocompletion, inline validation, and hover docs from
+the JSON Schema ([PROPOSAL.md §6, G2](PROPOSAL.md#6-feature-list)).
+Write a few fields and let the editor fill in the structure.
+
+---
+
+## 2. AI-Assisted Developer
+
+### How does rosgraph work with AI coding tools?
+
+`interface.yaml` is a machine-readable contract — exactly what LLMs
+are good at consuming and generating. The `InterfaceDescriptor` IR
+([PROPOSAL.md §3.3](PROPOSAL.md#33-the-interfacedescriptor-ir)) is a
+JSON blob containing a node's complete API: topics, types, QoS,
+parameters, lifecycle state. An AI agent reads this to understand what
+a node does, generate implementation code, write tests, or suggest
+fixes — without parsing C++ or Python source.
+
+See [PROPOSAL.md §3.13](PROPOSAL.md#313-ai--tooling-integration) for
+the full AI integration design.
+
+### Can I use `rosgraph generate` as an agent tool?
+
+Yes. An AI agent writing a ROS node can:
+1. Generate `interface.yaml` from a natural language description
+2. Run `rosgraph generate .` as a tool call to get type-safe
+   scaffolding
+3. Write only the business logic into the generated skeleton
+4. Run `rosgraph lint .` to verify the graph is correct
+
+This avoids the common failure mode of LLMs hallucinating ROS
+boilerplate (wrong QoS defaults, missing component registration,
+incorrect parameter declaration).
+
+### Will there be an MCP server?
+
+It's architecturally planned ([PROPOSAL.md
+§3.13](PROPOSAL.md#313-ai--tooling-integration)). An MCP server
+would expose:
+- Graph state (which nodes exist, what they publish/subscribe)
+- Lint results (current issues in the workspace)
+- Interface schemas (what a specific node expects)
+- Resolved topic names (after remapping/namespacing)
+
+This lets Claude Code, Cursor, or Copilot answer "what topics does
+the perception pipeline publish?" from structured data, not grep.
+
+### Can an AI generate `interface.yaml` from a description?
+
+Yes — the constrained schema makes this tractable. The schema has
+~10 top-level keys with well-defined types. A prompt such as "I need
+a node that subscribes to a lidar point cloud, filters it, and
+publishes the result" can yield a valid `interface.yaml` that
+`rosgraph generate` scaffolds immediately.
+
+`rosgraph discover` ([PROPOSAL.md
+§3.10](PROPOSAL.md#310-rosgraph-discover--runtime-to-spec-generation))
+can also generate `interface.yaml` from a running node, which an LLM
+can then refine — adding descriptions, suggesting QoS rationale, and
+grouping related interfaces.
+
+### What about IDE / LSP integration?
+
+Phase 1 delivers JSON Schema validation (IDE autocompletion for
+`interface.yaml`). A dedicated LSP server would add:
+- Hover for message type definitions
+- Go-to-definition for `$ref` targets
+- Inline diagnostics from `rosgraph lint`
+- Cross-file rename support
+
+This benefits both human developers and AI agents operating within
+IDE contexts. See [PROPOSAL.md
+§3.13](PROPOSAL.md#313-ai--tooling-integration).
+
+---
+
+## 3. Engineering Lead / System Integrator / DevOps
+
+### Who owns an `interface.yaml`?
+
+The node author defines it. Downstream consumers depend on the
+installed version in `share/<package>/interfaces/`. Changes are
+coordinated via:
+
+- `rosgraph breaking` ([PROPOSAL.md
+  §3.9](PROPOSAL.md#39-rosgraph-breaking--breaking-change-detection))
+  — automated detection of breaking changes in CI, blocking merges
+  that break downstream consumers.
+- Installed interfaces — downstream teams depend on published
+  interfaces without pulling source code.
+- Semantic versioning alignment — breaking = major, dangerous = minor,
+  safe = patch.
+
+See [PROPOSAL.md §3.14](PROPOSAL.md#314-scale--fleet-considerations).
+
+### How does this scale to hundreds of packages?
+
+- **Lint performance target:** 100 packages in under 5 seconds
+  (Design Principle 7). Analysis is single-pass over the graph model
+  with parallel per-package processing and content caching.
+- **Multi-workspace analysis:** Installed `interface.yaml` files in
+  underlays serve as cached facts. Only your workspace is analyzed,
+  not the entire underlay. See [PROPOSAL.md
+  §3.12](PROPOSAL.md#312-multi-workspace-analysis).
+- **Differential analysis:** `--new-only` reports only issues
+  introduced since the base branch. No noise from existing code.
+- **Per-package configuration:** Override lint rules per package via
+  `rosgraph.toml`.
+
+### I compose nodes from multiple vendors. How does rosgraph help?
+
+`system.yaml` (Layer 2 schema, [PROPOSAL.md
+§3.2](PROPOSAL.md#32-schema-layers)) declares the intended system
+composition — which nodes, which namespaces, which parameter overrides,
+which remappings. `rosgraph lint` validates the composed graph:
+
+- **Type mismatches** across package boundaries (Node A publishes
+  `Twist`, Node B subscribes expecting `TwistStamped`)
+- **QoS incompatibilities** between a vendor's publisher and your
+  subscriber
+- **Disconnected subgraphs** — nodes that should be connected but
+  aren't due to a namespace or remapping error
+- **Invalid remappings** — remaps pointing to nonexistent topics
+
+If a vendor doesn't ship `interface.yaml`, use `rosgraph discover`
+([PROPOSAL.md
+§3.10](PROPOSAL.md#310-rosgraph-discover--runtime-to-spec-generation))
+to generate one from a running instance of the vendor's node. The
+discovered spec becomes your integration contract.
+
+### How does rosgraph fit into CI?
+
+rosgraph is CI-first by design (Design Principle 8):
+
+```yaml
+# GitHub Actions example
+- name: Lint graph
+  run: rosgraph lint . --output-format sarif --new-only --base main
+  # SARIF output → GitHub Security tab, PR annotations
+
+- name: Check breaking changes
+  run: rosgraph breaking --base main
+  # Exit code 1 if breaking changes detected
+
+- name: Run contract tests
+  run: rosgraph test
+  # Schema-driven interface conformance tests
+```
+
+Output formats: `text`, `json`, `sarif` (GitHub Security tab),
+`github` (Actions annotations), `junit` (test reports). All
+configurable via `rosgraph.toml` or `--output-format`. See
+[PROPOSAL.md §3.11](PROPOSAL.md#311-configuration).
+
+For brownfield adoption, `--add-noqa` generates inline suppression
+comments for all existing issues, creating a clean baseline. You
+don't get 500 warnings on your first PR.
+
+### What about the colcon build workflow?
+
+`colcon-rosgraph` (Phase 2) is a thin colcon verb plugin that delegates
+to the standalone `rosgraph` binary. It adds `colcon lint`,
+`colcon docs`, `colcon discover`, and `colcon breaking` — iterating
+packages in dependency order with parallel execution. See [PROPOSAL.md
+§3.15](PROPOSAL.md#315-colcon-integration).
+
+Phase 1 works standalone: `rosgraph lint .` in any directory. No
+colcon dependency required.
+
+### What about fleet-level monitoring?
+
+`rosgraph monitor` runs per-robot. For fleet-scale observability:
+
+- The Prometheus `/metrics` exporter (M7) enables standard Grafana
+  dashboards aggregated across the fleet.
+- The `/rosgraph/diff` topic on each robot can be bridged to a
+  central system for aggregated drift analysis.
+- The architecture uses standard observability patterns (Prometheus,
+  structured logs, `/diagnostics`) rather than inventing fleet-specific
+  infrastructure.
+
+Runtime performance targets: reconciliation < 500ms for 200 nodes,
+< 50MB memory, < 5% CPU at steady state. See [PROPOSAL.md
+§3.14](PROPOSAL.md#314-scale--fleet-considerations).
+
+### Can we enforce org-specific conventions?
+
+Yes. `rosgraph.toml` supports per-package rule overrides, custom
+naming patterns, and rule selection. The Spectral-inspired YAML-native
+rule system ([PROPOSAL.md
+§10.3](PROPOSAL.md#103-static-analysis-architecture)) means a
+robotics engineer can write custom rules without knowing Rust or C++.
+
+### Does rosgraph handle launch file complexity?
+
+Three strategies, phased by tractability ([PROPOSAL.md
+§3.5](PROPOSAL.md#35-rosgraph-lint--static-analysis)):
+
+1. **YAML launch files** — fully parseable, Phase 1
+2. **`system.yaml`** — static composition schema, fully analyzable,
+   Phase 1
+3. **Python launch AST** — pattern matching for `Node()`,
+   `LaunchConfiguration()`, etc., Phase 2
+
+Python launch files with complex conditionals, loops, or dynamically
+computed node sets can't be fully statically analyzed. `system.yaml`
+is the escape hatch for systems that need full analyzability.
+
+---
+
+## 4. Safety-Critical Engineer
+
+### Does rosgraph help with certification?
+
+rosgraph is not a safety tool — it's a development and verification
+tool that produces artifacts useful in safety cases. See [PROPOSAL.md
+§11](PROPOSAL.md#11-safety--certification) for the full mapping.
+
+Key artifacts:
+
+| rosgraph artifact | Evidence type |
+|---|---|
+| `interface.yaml` | Software architecture description |
+| `rosgraph lint` SARIF output | Static analysis results |
+| `rosgraph monitor` logs | Runtime verification evidence |
+| `rosgraph test` results | Interface conformance evidence |
+| `rosgraph breaking` output | Change impact analysis |
+
+### Which safety standards does this map to?
+
+IEC 61508 (general functional safety), ISO 26262 (automotive),
+IEC 62304 (medical), DO-178C (aerospace), ISO 13482 (service robots),
+and ISO 21448 / SOTIF. See [PROPOSAL.md
+§11.1](PROPOSAL.md#111-relevant-standards) for how rosgraph maps to
+each.
+
+### What about behavioral properties?
+
+Phases 1-2 cover structural properties: type matches, QoS
+compatibility, graph connectivity. This is a necessary precondition
+for behavioral safety — you can't reason about message timing if the
+messages aren't connected correctly.
+
+Behavioral analysis (Phase 3+) adds temporal and causal properties,
+inspired by HAROS HPL:
+
+```
+globally: /emergency_stop causes /motor_disable within 100ms
+globally: /heartbeat absent_for 500ms causes /safe_stop
+```
+
+See [PROPOSAL.md §11.4](PROPOSAL.md#114-behavioral-properties-future).
+
+### Are monitor alert thresholds configurable?
+
+Yes. The defaults (10s for `NodeMissing`, 30s for `UnexpectedNode`)
+are tuned for general robotics. Safety-critical deployments override
+them via `rosgraph.toml`:
+
+```toml
+[monitor.alerts]
+NodeMissing = { grace_period_ms = 1000, severity = "critical" }
+TopicMissing = { grace_period_ms = 500, severity = "critical" }
+```
+
+See [PROPOSAL.md §11.3](PROPOSAL.md#113-configurable-safety-levels).
+
+### Are there safety-specific lint rules?
+
+Planned for Phases 2-3:
+
+| Rule | Description |
+|---|---|
+| `SAF001` | Critical subscriber has < N publishers (no redundancy) |
+| `SAF002` | Single point of failure in graph topology |
+| `SAF003` | Safety-critical node is not lifecycle-managed |
+| `TF001` | Declared `frame_id` not published by any node |
+| `TF002` | Broken frame chain (no transform path) |
+
+The analyzer architecture supports adding these rules without changes
+to the core. See [PROPOSAL.md §11.5](PROPOSAL.md#115-safety-relevant-lint-rules-future).
+
+### What about determinism and real-time guarantees?
+
+`rosgraph monitor` is an observation tool, not a safety-critical
+component. It runs in its own process, does not interfere with the
+monitored system, and its failure does not affect the system under
+observation. It is not designed to be real-time safe.
+
+For hard real-time requirements, the monitor's output (Prometheus
+metrics, diagnostics topics) can be consumed by a separate real-time
+safety monitor. rosgraph provides the graph model; the real-time
+enforcement layer is a separate concern.
+
+### What about audit trails?
+
+`rosgraph lint` produces SARIF output with timestamps, tool version,
+rule versions, and results. This can be stored as CI artifacts for
+audit purposes. A dedicated audit log format for `rosgraph monitor`
+(continuous verification evidence) is not in Phase 1 but the
+structured output (JSON, SARIF) makes it straightforward to add.
+
+---
+
+## 5. MoveIt / nav2 / Popular Module User
+
+### Does rosgraph work with nav2's plugin system?
+
+Yes, via the mixin system ([PROPOSAL.md
+§3.2](PROPOSAL.md#32-schema-layers)). Plugins that inject interfaces
+into a host node are declared as mixins:
+
+```yaml
+# nodes/follow_path/interface.yaml
+node:
+  name: follow_path
+  package: nav2_controller
+
+parameters:
+  controller_plugin:
+    type: string
+    default_value: "dwb_core::DWBLocalPlanner"
+
+mixins:
+  - ref: dwb_core/dwb_local_planner   # brings in max_vel_x, etc.
+  - ref: nav2_costmap_2d/costmap       # brings in costmap params
+```
+
+The host's effective interface = its own declaration + all mixin
+interfaces merged. This gives `rosgraph lint` and `rosgraph monitor`
+the complete picture.
+
+Mixins are Phase 2 (G15). Phase 1 works for nodes without plugins.
+
+### What happens when I switch plugins (e.g., DWB → MPPI)?
+
+You update the mixin reference in `interface.yaml`. The effective
+interface changes at build time, and `rosgraph generate` produces new
+scaffolding. This is a build-time concern — `rosgraph lint` validates
+the graph with the new plugin's interface.
+
+If the plugin is selected at runtime via parameter, this falls under
+"dynamic interfaces" (Design Principle 12) — rosgraph declares the
+static portion and `rosgraph monitor` flags unexpected interfaces.
+
+### Does rosgraph validate TF frames?
+
+Planned for Phases 2-3. `TF001` checks that declared `frame_id` values
+are published by some node in the graph. `TF002` checks that frame
+chains are connected (no broken transform paths). See [PROPOSAL.md
+§11.5](PROPOSAL.md#115-safety-relevant-lint-rules-future).
+
+TF misconfiguration is a common source of silent bugs in ROS 2
+navigation and manipulation. Validating it is high-value, but it
+requires the graph model to include TF publisher information, which
+depends on `interface.yaml` having a `frame_id` annotation.
+
+### What about `generate_parameter_library` compatibility?
+
+Full compatibility is a non-negotiable design principle ([PROPOSAL.md
+§2, DP9](PROPOSAL.md#2-design-principles)). The `parameters:` section
+of `interface.yaml` IS the `generate_parameter_library` format. A
+standalone gen_param_lib YAML file works as-is when placed in
+`interface.yaml`. rosgraph delegates to gen_param_lib at build time.
+See [PROPOSAL.md §9.2](PROPOSAL.md#92-tool-assessments).
+
+### Can rosgraph lint my existing launch files?
+
+Phase 1 supports YAML launch files (direct parse) and `system.yaml`
+(Layer 2 schema). Phase 2 adds Python launch file AST analysis for
+standard `launch_ros` patterns — `Node()`, `LaunchConfiguration()`,
+`DeclareLaunchArgument()`.
+
+Limitations: Python launch files that use conditionals, loops, or
+dynamically computed node sets cannot be fully statically analyzed.
+`system.yaml` is the escape hatch for systems that need full static
+analyzability. See [PROPOSAL.md
+§3.5](PROPOSAL.md#35-rosgraph-lint--static-analysis).
+
+### Does this work with Gazebo / Isaac Sim?
+
+Simulators expose ROS interfaces that look identical to real hardware.
+`rosgraph discover` can introspect a simulated system and generate
+`interface.yaml`. `rosgraph monitor` can verify that a simulated
+system matches the declared graph. `rosgraph lint` doesn't
+distinguish between real and simulated — it validates the graph model.
+
+### What about message type changes across ROS distros?
+
+`interface.yaml` references message types by name (e.g.,
+`geometry_msgs/msg/Twist`). Message type compatibility across distros
+is a ROS infrastructure concern, not a rosgraph concern. rosgraph
+validates that publishers and subscribers on the same topic agree on
+type — it doesn't validate that the type definition itself is
+compatible across distros.
+
+`rosgraph breaking` can detect when a type reference changes between
+versions of an `interface.yaml`.
+
+---
+
+## 6. The Skeptic
+
+### I write good tests. Why do I need another YAML file?
+
+Tests catch type mismatches and QoS issues at launch time — after you
+wait 30 seconds for the stack to start, watch it fail, read the logs,
+and figure out which of 40 nodes has the wrong type. Then you fix it,
+rebuild, relaunch, and wait again.
+
+`rosgraph lint` catches the same bugs in under 5 seconds, before
+launch, in CI, before anyone else has to debug it. It's the difference
+between "tests catch bugs" and "bugs never reach the test phase."
+
+### What's the overhead?
+
+Per node: one `interface.yaml` file (~15-30 lines). Most of it is
+information you're already specifying in code (topic names, message
+types, QoS settings, parameter names) — `interface.yaml` centralizes
+it.
+
+What you get back:
+- No pub/sub boilerplate (generated)
+- No parameter declaration boilerplate (generated via
+  `generate_parameter_library`)
+- Pre-launch graph validation
+- Runtime graph monitoring
+- Auto-generated API documentation
+
+The net line-count change is typically negative for nodes with
+parameters.
+
+### What if rosgraph can't express what I need?
+
+Escape hatches:
+- **`# rosgraph: noqa: TOP001`** — suppress specific lint rules per
+  line.
+- **Per-package ignores** — exclude entire packages from specific
+  rules via `rosgraph.toml`.
+- **Undeclared interfaces** — if your code creates publishers that
+  aren't in `interface.yaml`, the code still works. `rosgraph monitor`
+  flags them as `UnexpectedTopic` (a warning, not an error).
+- **Composition pattern** — generated code holds a `rclcpp::Node`
+  (has-a), not inherits from it. You always have access to the
+  underlying node for anything the schema can't express.
+
+See [PROPOSAL.md §12](PROPOSAL.md#12-scope--limitations) for the full
+limitations discussion.
+
+### Does code generation add runtime overhead?
+
+The composition pattern (has-a Node) adds one level of indirection
+compared to direct inheritance. This is a single pointer dereference,
+on the order of nanoseconds. The generated pub/sub wrappers are thin
+forwarding calls. No virtual dispatch is added beyond what already
+exists in the ROS client library.
+
+The parameter validation code (from `generate_parameter_library`) runs
+at parameter-set time, not in the hot path.
+
+### What happens when only my package has an `interface.yaml`?
+
+You still get:
+- **Code generation** — less boilerplate in your node
+- **Parameter validation** — runtime type and range checking
+- **Self-documentation** — your node's API is machine-readable
+
+Cross-package value (type mismatch detection, QoS compatibility
+checking, contract testing) grows with adoption. `rosgraph discover`
+lets you generate specs for neighboring packages from a running
+system, bootstrapping the cross-package graph incrementally.
+
+### This proposal has 51 features. Is this realistic?
+
+Phase 1 ([PROPOSAL.md §4](PROPOSAL.md#4-phasing)) is the commitment:
+~12 features covering core schema, basic code generation, and
+highest-value lint and monitor rules. Later phases are contingent on
+adoption.
+
+The tool builds on existing work — cake for code generation,
+`generate_parameter_library` for parameters, `graph-monitor` message
+definitions for runtime. Phase 1 is stabilizing and unifying existing
+pieces, not building from scratch.
+
+### Won't the spec just drift from reality like NoDL?
+
+NoDL died because it was a pure description format — no code
+generation. Maintaining a spec that doesn't produce anything is
+thankless work.
+
+`interface.yaml` generates code. If you change the spec, the generated
+code changes. If you change the code without changing the spec,
+`rosgraph monitor` flags the discrepancy at runtime. The two-way
+binding (codegen + runtime monitoring) is what prevents the drift
+that killed NoDL.
+
+The honest limitation: business logic is hand-written. If a developer
+adds an undeclared publisher inside a callback, `rosgraph lint` won't
+catch it at build time. `rosgraph monitor` catches it at runtime as
+`UnexpectedTopic`. See [PROPOSAL.md
+§12](PROPOSAL.md#12-scope--limitations).
+
+### When should I NOT use rosgraph?
+
+- **Quick prototyping** — single throwaway node, not worth the file.
+- **Single-node packages** — minimal lint value, though codegen may
+  still save boilerplate.
+- **Highly dynamic interfaces** — nodes that create/destroy publishers
+  at runtime based on conditions can't be fully declared.
+
+See [PROPOSAL.md §12, "When Not to Use
+rosgraph"](PROPOSAL.md#when-not-to-use-rosgraph).
+
+---
+
+## 7. Package Maintainer / ROS Governance
+
+### What does rosgraph mean for my package?
+
+If you maintain a ROS 2 package, `interface.yaml` is a machine-readable
+contract for your node's public API — topics, services, actions,
+parameters, QoS. It replaces the informal contract currently scattered
+across READMEs, launch file comments, and source code.
+
+For consumers of your package, this means:
+- **API discoverability.** `rosgraph docs` auto-generates browsable API
+  reference from your `interface.yaml`. No more stale READMEs.
+- **Breaking change visibility.** `rosgraph breaking` classifies
+  interface changes as breaking/dangerous/safe, giving downstream users
+  clear upgrade guidance. See [PROPOSAL.md
+  §3.9](PROPOSAL.md#39-rosgraph-breaking--breaking-change-detection).
+- **Contract testing.** Downstream packages can run `rosgraph test`
+  against your declared interface to verify compatibility. See
+  [PROPOSAL.md §3.7](PROPOSAL.md#37-rosgraph-test--contract-testing).
+
+### Do I have to adopt rosgraph to be compatible with it?
+
+No. Packages without `interface.yaml` are skipped, not errored (Design
+Principle 6). Downstream users can run `rosgraph discover` against your
+running node to generate a spec for their own use. Your package doesn't
+need to ship `interface.yaml` for others to benefit — though shipping
+one is much better, since discovered specs require human review and may
+miss QoS details.
+
+### How does this affect my release process?
+
+`rosgraph breaking` runs in CI comparing the current `interface.yaml`
+against the previous release. Breaking changes block the merge unless
+explicitly acknowledged. This is opt-in per package via `rosgraph.toml`
+and maps to semantic versioning: breaking = major, dangerous = minor,
+safe = patch. See [PROPOSAL.md
+§3.14](PROPOSAL.md#314-scale--fleet-considerations).
+
+### What about packages with plugin systems?
+
+If your package exposes a plugin API (like nav2's controller plugins),
+the mixin system (Phase 2, G15) lets plugin authors declare the
+interfaces they inject into the host node. The host's effective
+interface is the merge of its own declaration plus all mixin fragments.
+See [PROPOSAL.md §3.2](PROPOSAL.md#32-schema-layers).
+
+Until mixins ship in Phase 2, the host node's `interface.yaml` covers
+its own direct interfaces. Plugins that add extra topics/parameters
+are flagged by `rosgraph monitor` as unexpected — visible but not
+validated.
+
+### What's the adoption path toward `ros_core`?
+
+Deliberately incremental ([PROPOSAL.md §4, "Adoption
+Path"](PROPOSAL.md#adoption-path)):
+
+1. **`ros-tooling` organization** — institutional backing, CI
+   infrastructure, release process. graph-monitor already lives here.
+2. **REP for `interface.yaml` schema** — formalizes the declaration
+   format as a community standard, independent of the rosgraph tool.
+3. **docs.ros.org tutorial integration** — if "write your first node"
+   uses `interface.yaml`, every new ROS developer learns it from day
+   one. This is the highest-leverage adoption path.
+4. **`ros_core` proposal** — after demonstrated adoption across
+   multiple distros.
+
+### Why not extend existing tools instead?
+
+Each existing tool covers one capability but none covers the full
+scope. The gap analysis ([PROPOSAL.md
+§9.3](PROPOSAL.md#93-gap-analysis)) shows five major gaps: graph diff,
+graph linting, QoS static analysis, behavioral properties, and CI graph
+validation. No single existing tool can be extended to fill all five.
+
+rosgraph builds on existing work where possible:
+- `generate_parameter_library` for parameters (used as-is)
+- `rosgraph_monitor_msgs` for runtime message definitions (adopted)
+- cake's design decisions for code generation (validated)
+- HAROS's metamodel for the graph model (adapted)
+
+### What's the maintenance burden?
+
+Phase 1 is ~12 features covering core schema, basic codegen, and
+highest-value lint/monitor rules. The design minimizes ongoing
+maintenance:
+
+- **Schema versioning** (G14) — `schema_version` field with migration
+  tooling prevents breaking changes to `interface.yaml` format.
+- **IR-based plugin protocol** — code generation plugins are standalone
+  executables, independently maintained.
+- **Analyzer DAG** — lint rules are isolated, independently testable
+  values (not subclasses). Adding or removing a rule doesn't affect
+  others.
+
+The risk factor: this is a new tool, not an extension of something with
+existing momentum. It requires sustained contributor commitment.
+
+### How does this interact with the ROS 2 type system?
+
+rosgraph references existing `.msg`, `.srv`, and `.action` types — it
+doesn't replace them (Design Principle 9). `interface.yaml` declares
+which types a node uses; `rosidl` still defines the types themselves.
+
+The graph model ([PROPOSAL.md §3.1](PROPOSAL.md#31-the-graph-model))
+includes a `MessageTypeDB` that resolves type references to their
+definitions for compatibility checking. This uses the existing
+`rosidl` output — rosgraph doesn't parse `.msg` files directly.
+
+### What about governance and community standards?
+
+The REP process is the standard mechanism for formalizing ROS community
+standards. A REP for the `interface.yaml` schema would:
+
+- Define the YAML schema specification independent of the rosgraph tool
+- Allow alternative implementations (someone could build a different
+  tool that consumes the same schema)
+- Provide a formal review process for schema changes
+- Signal community endorsement
+
+The REP is Step 2 of the adoption path — after the tool has proven
+itself in `ros-tooling` with real users.
+
+### What's the risk if this doesn't get adopted?
+
+The worst case: rosgraph becomes another single-maintainer tool in the
+ecosystem (like cake and breadcrumb today). The mitigation strategy:
+
+- **`ros-tooling` hosting** — institutional backing reduces bus factor
+- **REP-based schema** — the schema outlives the tool if it becomes a
+  standard
+- **`generate_parameter_library` compatibility** — the parameters
+  portion works with the most mature tool in the space, regardless of
+  rosgraph's fate
+- **Standalone value** — even without ecosystem adoption, a single team
+  gets code generation and parameter validation from day one
+
+---
+
+## 8. Educator / University Researcher
+
+### Can I use rosgraph for teaching ROS 2?
+
+Yes, and this is one of the highest-leverage adoption paths. The Quick
+Start ([PROPOSAL.md §1](PROPOSAL.md#quick-start-what-it-looks-like))
+shows a complete workflow in three commands:
+
+```bash
+rosgraph generate .   # generates node scaffolding
+rosgraph lint .       # checks for issues
+rosgraph monitor      # watches the running system
+```
+
+For teaching, `interface.yaml` forces students to think about their
+node's API before writing implementation code — topics, types, QoS,
+parameters. This is better pedagogy than the current approach of
+copy-pasting publisher boilerplate and tweaking it.
+
+### Does this lower the barrier for students?
+
+Significantly. A student writes ~15 lines of YAML declaring what their
+node does, runs `rosgraph generate`, and gets a working scaffold with
+type-safe publishers, subscribers, and validated parameters. They write
+only the business logic. No boilerplate, no silent type mismatches, no
+mysterious QoS failures.
+
+Error messages are designed to be helpful: each carries a rule code, a
+file location, and a description of what's wrong and how to fix it. See
+[ROSGRAPH.md §10.3](ROSGRAPH.md#103-static-analysis-architecture).
+
+### How does rosgraph relate to HAROS?
+
+HAROS ([ROSGRAPH.md §10.6](ROSGRAPH.md#106-ros-domain-prior-art-haros))
+was the prior art for graph analysis in ROS — built at the University
+of Minho (2016–2021). rosgraph borrows HAROS's metamodel and HPL
+property language concepts, but differs fundamentally:
+
+- **HAROS extracted interfaces from source code.** rosgraph uses
+  explicit declarations (`interface.yaml`). Declarations are simpler,
+  more reliable, and enable code generation.
+- **HAROS was ROS 1 only.** rosgraph is built for ROS 2 concepts:
+  QoS, lifecycle, components, actions, DDS discovery.
+- **HAROS died because extraction broke.** Everything it parsed changed
+  with ROS 2: catkin → ament, rospack → colcon, XML launch → Python
+  launch. Declaration-based tools survive build-system changes.
+
+### Can I use rosgraph for research on ROS system verification?
+
+The graph model ([ROSGRAPH.md §3.1](ROSGRAPH.md#31-the-graph-model))
+is a structured representation of the ROS computation graph: nodes,
+topics, services, actions, parameters, QoS, connections. It's
+exportable as JSON via the `InterfaceDescriptor` IR ([ROSGRAPH.md
+§3.3](ROSGRAPH.md#33-the-interfacedescriptor-ir)).
+
+Research opportunities:
+- **Formal verification.** The graph model is a natural input for model
+  checkers. Behavioral properties (Phase 3+, [ROSGRAPH.md
+  §11.4](ROSGRAPH.md#114-behavioral-properties-future)) enable temporal
+  logic specifications.
+- **Static analysis.** The analyzer DAG architecture ([ROSGRAPH.md
+  §3.5](ROSGRAPH.md#35-rosgraph-lint--static-analysis)) supports custom
+  analysis passes without modifying core code.
+- **Runtime monitoring.** The declared-vs-observed diff ([ROSGRAPH.md
+  §3.6](ROSGRAPH.md#36-rosgraph-monitor--runtime-reconciliation)) is a
+  rich data source for anomaly detection research.
+- **ROS ecosystem studies.** Interface coverage, graph topology
+  patterns, common QoS configurations — all extractable from
+  `interface.yaml` files across the ecosystem.
+
+### What about publishing results that use rosgraph?
+
+The tool is open-source (planned for `ros-tooling` organization). The
+SARIF and JSON output formats produce structured, reproducible results
+suitable for academic publication. The graph model provides a formal
+vocabulary for describing ROS system architectures.
+
+---
+
+## 9. Embedded / Resource-Constrained Developer
+
+### Does rosgraph add runtime overhead to my nodes?
+
+The generated code uses a composition pattern (has-a `Node`, not is-a
+`Node`). This adds one pointer indirection per call, on the order of
+nanoseconds. The generated pub/sub wrappers are thin forwarding calls;
+no virtual dispatch is added beyond what the ROS client library uses.
+
+Parameter validation (via `generate_parameter_library`) runs at
+parameter-set time, not in the hot path. See [ROSGRAPH.md §3.4,
+"Design decisions"](ROSGRAPH.md#34-rosgraph-generate--code-generation).
+
+### Does `rosgraph monitor` run on the robot?
+
+Yes, but it's optional. `rosgraph monitor` is a separate process — it
+doesn't instrument or modify your nodes. If your platform can't spare
+the resources, don't run it. You still get full value from build-time
+tools (`rosgraph generate`, `rosgraph lint`).
+
+Runtime targets for `rosgraph monitor` ([ROSGRAPH.md
+§3.14](ROSGRAPH.md#314-scale--fleet-considerations)):
+- Memory: < 50 MB resident
+- CPU: < 5% of one core at steady state (5 s scrape interval)
+- No additional DDS traffic beyond standard discovery
+
+For very constrained platforms, run `rosgraph monitor` off-board
+(e.g., on a companion computer) observing the same DDS domain.
+
+### Does rosgraph work with micro-ROS?
+
+micro-ROS nodes reach standard DDS through the micro-ROS Agent's
+DDS-XRCE bridge. `rosgraph discover` and `rosgraph monitor` observe
+them through the bridge like any other node. `interface.yaml`
+declarations work for micro-ROS nodes: the schema is language-agnostic.
+
+Code generation for micro-ROS C is not in Phase 1. The IR-based plugin
+architecture ([ROSGRAPH.md
+§3.3](ROSGRAPH.md#33-the-interfacedescriptor-ir)) supports adding a
+micro-ROS code generation plugin without changes to the core tool.
+
+### What about real-time constraints?
+
+`rosgraph monitor` is not real-time safe — it's an observation tool
+running in its own process. It does not interfere with the monitored
+system, and its failure does not affect the system under observation.
+
+For hard real-time requirements, the monitor's Prometheus metrics and
+diagnostics topics can be consumed by a separate real-time safety
+monitor. rosgraph provides the graph model; real-time enforcement is a
+separate concern. See [ROSGRAPH.md
+§11](ROSGRAPH.md#11-safety--certification).
+
+### Does the build toolchain add cross-compilation complexity?
+
+`rosgraph generate` runs at build time on the host, producing standard
+C++ and Python source files. These are compiled by the normal
+cross-compilation toolchain (`colcon build --cmake-args
+-DCMAKE_TOOLCHAIN_FILE=...`). rosgraph itself doesn't need to run on
+the target — it's a host-side tool, like `cmake` or `protoc`.
diff --git a/docs/MANIFESTO.md b/docs/MANIFESTO.md
new file mode 100644
index 0000000..70faf6b
--- /dev/null
+++ b/docs/MANIFESTO.md
@@ -0,0 +1,15 @@
+# ROSGraph — Direction
+
+## Why
+
+Robotics engineers spend too much time on ROS plumbing — writing boilerplate, debugging invisible wiring, and keeping launch files in sync with code — instead of building their application.
+
+## What
+
+A declarative, observable ROS graph. Engineers declare what their system should be; tooling generates the code and verifies the running system matches the spec.
+
+## How
+
+1. **Language** — a formal spec to describe node interfaces and system graphs.
+2. **Tooling** — translate declarations into working code.
+3. **Verification** — compare spec against reality, both at runtime and statically before launch.
diff --git a/docs/ROSGRAPH.md b/docs/ROSGRAPH.md
new file mode 100644
index 0000000..8469ec5
--- /dev/null
+++ b/docs/ROSGRAPH.md
@@ -0,0 +1,1724 @@
+# rosgraph — Technical Proposal
+
+> **Status:** Proposal
+> **Date:** 2026-02-22
+> **Parent:** [MANIFESTO.md](MANIFESTO.md) (direction)
+
+---
+
+## Table of Contents
+
+1. [Executive Summary](#1-executive-summary)
+2. [Design Principles](#2-design-principles)
+3. [Architecture](#3-architecture)
+4. [Phasing](#4-phasing)
+5. [Language Choice](#5-language-choice)
+6. [Feature List](#6-feature-list)
+7. [Lint Rule Codes](#7-lint-rule-codes)
+8. [Monitor Alert Rules](#8-monitor-alert-rules)
+9. [Existing ROS 2 Ecosystem](#9-existing-ros-2-ecosystem)
+10. [Prior Art](#10-prior-art)
+11. [Safety & Certification](#11-safety--certification)
+12. [Scope & Limitations](#12-scope--limitations)
+13. [Resolved Questions](#13-resolved-questions)
+
+---
+
+## 1. Executive Summary
+
+ROS 2 has no production-ready tool for verifying that a running system
+matches its declared architecture, no standard schema for declaring node
+interfaces, and no unified CLI for graph analysis. The ecosystem is
+fragmented across single-purpose tools with overlapping scope and bus
+factors of one.
+
+| Category | Capability | Current tool | Status |
+|---|---|---|---|
+| **Schema** | Node interface declaration | cake / nodl / gen_param_lib | cake early; nodl dead; gen_param_lib params-only |
+| **Codegen** | Static graph from launch files | breadcrumb + clingwrap | Early-stage, solo dev |
+| **Runtime** | Runtime graph monitoring | graph-monitor | Mid-stage, institutional |
+| **Runtime** | Runtime tracing | ros2_tracing | Mature, production |
+| **Runtime** | Latency analysis | CARET | Mature, Tier IV |
+| **Runtime** | Graph visualisation | Foxglove, Dear RosNodeViewer | Mature but live-only |
+| **Runtime** | **Graph diff (expected vs. actual)** | **Nothing** | **Major gap** |
+| **Static** | **Graph linting (pre-launch)** | **Nothing** | **Major gap** |
+| **Static** | **QoS static analysis** | breadcrumb (partial) | Early-stage |
+| **Static** | **CI graph validation** | **Nothing** | **Major gap** |
+| **Docs** | **Node API documentation** | **Nothing** (hand-written only) | **Major gap** |
+| — | **Behavioural properties** | **Nothing** (HPL was ROS 1) | **Major gap** |
+
+### The Problem, Concretely
+
+Today in ROS 2:
+
+- Node A publishes `/cmd_vel` as `Twist`. Node B subscribes to
+  `/cmd_vel` as `String`. You discover this at runtime — or don't,
+  because the subscriber silently receives nothing.
+- A publisher uses `BEST_EFFORT` QoS; a subscriber uses `RELIABLE`.
+  DDS refuses the connection. A warning is logged but easy to miss in
+  a busy console. The subscriber just never gets messages.
+- A node crashes mid-deployment. The rest of the system keeps running.
+  Nobody knows until a customer reports a failure 20 minutes later.
+- You rename a parameter. Three launch files reference the old name.
+  `colcon build` succeeds. The system launches. The parameter silently
+  takes its default value.
+
+These are real, common bugs in production ROS 2 systems. rosgraph
+catches all four: the first, second, and fourth before launch
+(`rosgraph lint`), and the third at runtime (`rosgraph monitor`).
+
+This document proposes **`rosgraph`** — a single tool with subcommands
+covering the four goals of the ROSGraph Working Group:
+
+```
+rosgraph
+├── rosgraph generate    (Goal 2: spec → code)
+├── rosgraph lint        (Goal 4: static graph analysis)
+├── rosgraph monitor     (Goal 3: runtime reconciliation)
+├── rosgraph test        (Goal 3: contract testing)
+├── rosgraph docs        (documentation generation)
+├── rosgraph breaking    (breaking change detection)
+└── rosgraph discover    (runtime → spec, brownfield adoption)
+```
+
+Three key insights drive the design:
+
+1. **The ROS computation graph is not source code — it is a typed,
+   directed graph with QoS-annotated edges.** Analysis tools should
+   operate on a graph model, not on ASTs. Source code parsing is a
+   loader that feeds the model, not the analysis target.
+
+2. **Goals 3–4 are schema conformance problems** ("does reality match
+   the spec?"), not traditional program analysis. Once you have a
+   machine-readable spec (`interface.yaml`), verification falls out
+   naturally — the same pattern as `buf lint`, Pact contract tests,
+   and Kubernetes reconciliation.
+
+3. **A declaration without code generation is a non-starter.** NoDL
+   proved this. The schema must generate code, documentation, and
+   validation to stay in sync with reality. `interface.yaml` is
+   simultaneously the source for code generation, the lint target for
+   static analysis, the contract for runtime verification, and the
+   reference for documentation.
+
+### Quick Start (What It Looks Like)
+
+A minimal `interface.yaml`:
+
+```yaml
+schema_version: "1.0"
+node:
+  name: talker
+  package: demo_pkg
+
+publishers:
+  - topic: ~/chatter
+    type: std_msgs/msg/String
+    qos: { reliability: RELIABLE, depth: 10 }
+
+parameters:
+  publish_rate:
+    type: double
+    default_value: 1.0
+    description: "Publishing rate in Hz"
+    validation:
+      bounds<>: [0.1, 100.0]
+```
+
+What you get:
+
+```bash
+rosgraph generate .   # → C++ header, Python module, parameter validation
+rosgraph lint .       # → "no issues" or "TOP001: type mismatch on /cmd_vel"
+rosgraph monitor      # → live diff: declared graph vs. running system
+```
+
+The generated code gives you a typed context struct with publishers,
+subscribers, and validated parameters — no boilerplate. You write
+business logic; rosgraph generates the wiring.
+
+---
+
+## 2. Design Principles
+
+### Core Philosophy
+
+1. **The graph is the program.** Analysis operates on the typed,
+   QoS-annotated computation graph — not source code ASTs. Source
+   parsing is a loader that feeds the model, not the analysis target.
+
+2. **Declare first, verify always.** `interface.yaml` is the single
+   source of truth. Code generation, static analysis, and runtime
+   monitoring all verify against the declaration.
+
+3. **One schema, many consumers.** The same `interface.yaml` drives
+   code generation, documentation, linting, monitoring, contract
+   testing, and security policy generation.
+
+4. **One tool, not ten.** `rosgraph` with subcommands replaces
+   fragmented single-purpose tools. One CLI, one config, one output
+   format.
+
+### Developer Experience
+
+5. **Zero-config value, progressive disclosure.** Given
+   `interface.yaml` files, the default rules catch real bugs (type
+   mismatches, QoS incompatibilities) with no additional configuration.
+   A minimal ~15-line `interface.yaml` produces a working node;
+   lifecycle, mixins, and parameterized QoS are opt-in.
+
+6. **Brownfield first, gradual adoption.** `rosgraph discover`
+   generates specs from running nodes. `--add-noqa` suppresses existing
+   issues. Packages without `interface.yaml` are skipped, not errored.
+
+7. **Speed is a feature.** An architectural property, not an
+   afterthought. Target: lint a 100-package workspace in under 5
+   seconds.
+
+8. **Backward compatibility is non-negotiable.** Existing
+   `generate_parameter_library` YAML works as-is inside `parameters:`.
+   Existing `.msg`/`.srv`/`.action` files are referenced, not replaced.
+
+### Verification & CI
+
+9. **CI-first.** SARIF output, GitHub annotations, exit codes, and
+   differential analysis are primary design targets.
+
+10. **Validation at every stage.** Author time: JSON Schema. Build
+    time: structural + semantic. Launch time: declared vs. configured.
+    Runtime: declared vs. observed.
+
+11. **Correctness rules are errors; style rules are warnings.** Type
+    mismatches and QoS incompatibilities fail CI. Naming conventions
+    warn.
+
+### Scope
+
+12. **Declared interfaces are the primary target.** The schema
+    describes the *intended* interface — the same boundary drawn by
+    Protobuf, AsyncAPI, Smithy, and OpenAPI. For partially dynamic
+    nodes (e.g., nav2 plugin hosts), worst-case bounds can be declared
+    with `optional: true`; `rosgraph monitor` validates these at
+    runtime and flags truly undeclared interfaces as `UnexpectedTopic`.
+
+13. **Structural first, behavioural later.** Phase 1–2: type matches,
+    QoS compatibility, graph connectivity — the foundation that
+    safety-critical systems (ISO 26262, IEC 61508) require as evidence.
+    Behavioural properties (temporal/causal, e.g. "/e_stop causes
+    /motor_disable within 100ms") are Phase 3+, drawing on prior art
+    from HAROS HPL and runtime verification tools like STL/MTL
+    monitors. The structural graph model is designed to extend to
+    behavioural annotations without schema redesign.
+
+---
+
+## 3. Architecture
+
+One tool. One graph model. Four capabilities.
+
+```
+                         ┌──────────────────────┐
+                         │    Graph Model        │
+                         │  (shared library)     │
+                         │                       │
+                         │  Nodes, Topics,       │
+                         │  Services, Actions,   │
+                         │  Parameters, QoS,     │
+                         │  Connections           │
+                         └───────┬──────┬────────┘
+                                 │      │
+                    ┌────────────┘      └───────────────┐
+                    │                                   │
+          ┌─────────▼──────────┐            ┌──────────▼───────────┐
+          │  Build-time tools  │            │  Runtime tools        │
+          │                    │            │                       │
+          │  rosgraph generate │            │  rosgraph monitor     │
+          │  rosgraph lint     │            │  rosgraph test        │
+          │  rosgraph docs     │            │  rosgraph discover    │
+          │  rosgraph breaking │            │                       │
+          └────────────────────┘            └───────────────────────┘
+```
+
+### 3.1 The Graph Model
+
+A language-agnostic representation of the ROS computation graph. Every
+loader produces it; every analyzer consumes it.
+
+```
+ComputationGraph
+├── nodes: [NodeInterface]
+│   ├── name, namespace, package, executable
+│   ├── publishers:     [{topic, msg_type, qos}]
+│   ├── subscribers:    [{topic, msg_type, qos}]
+│   ├── services:       [{name, srv_type}]
+│   ├── clients:        [{name, srv_type}]
+│   ├── action_servers: [{name, action_type}]
+│   ├── action_clients: [{name, action_type}]
+│   ├── parameters:     [{name, type, default, validators}]
+│   └── lifecycle_state: str | None
+├── topics: [TopicInfo]
+│   ├── name, msg_type
+│   ├── publishers:  [NodeRef]
+│   ├── subscribers: [NodeRef]
+│   └── qos_profiles: [QoSProfile]
+├── services: [ServiceInfo]
+├── actions: [ActionInfo]
+└── connections: [Connection]
+    ├── source: NodeRef
+    ├── target: NodeRef
+    ├── channel: TopicRef | ServiceRef | ActionRef
+    └── qos_compatible: bool
+```
+
+### 3.2 Schema Layers
+
+Three schema levels, each building on the previous:
+
+**Layer 1 — Node Interface Schema** (per-node declaration)
+
+```yaml
+# interface.yaml
+schema_version: "1.0"
+
+node:
+  name: lidar_processor
+  package: perception_pkg
+  lifecycle: managed              # managed | unmanaged (default)
+
+parameters:
+  # Exact generate_parameter_library format (backward-compatible)
+  voxel_size:
+    type: double
+    default_value: 0.05
+    description: "Voxel grid filter leaf size (meters)"
+    validation:
+      bounds<>: [0.01, 1.0]
+    read_only: false
+  robot_frame:
+    type: string
+    default_value: "base_link"
+    read_only: true
+
+publishers:
+  - topic: ~/filtered_points
+    type: sensor_msgs/msg/PointCloud2
+    qos:
+      history: 5
+      reliability: RELIABLE
+      durability: TRANSIENT_LOCAL
+    description: "Filtered and downsampled point cloud"
+
+subscribers:
+  - topic: ~/raw_points
+    type: sensor_msgs/msg/PointCloud2
+    qos:
+      history: 1
+      reliability: BEST_EFFORT
+    description: "Raw point cloud from lidar driver"
+
+services:
+  - name: ~/set_filter_params
+    type: perception_msgs/srv/SetFilterParams
+
+actions:
+  - name: ~/process_scan
+    type: perception_msgs/action/ProcessScan
+
+timers:
+  - name: process_timer
+    period_ms: 100
+    description: "Main processing loop"
+```
+
+**Layer 2 — Composed System Schema** (launch-level declaration)
+
+```yaml
+# system.yaml
+schema_version: "1.0"
+name: perception_pipeline
+
+nodes:
+  - ref: perception_pkg/lidar_processor
+    namespace: /robot1
+    parameters:
+      voxel_size: 0.1
+    remappings:
+      ~/raw_points: /lidar/points
+
+  - ref: perception_pkg/object_detector
+    namespace: /robot1
+
+connections:                    # Explicit wiring (optional, for validation)
+  - from: lidar_processor/~/filtered_points
+    to: object_detector/~/input_cloud
+```
+
+**Mixins — Composable Interface Fragments** (G15, Phase 2)
+
+Plugins that inject interfaces into a host node (e.g., nav2 controller
+plugins adding parameters and topics via the node handle) are declared
+via `mixins:`. Each mixin is itself an `interface.yaml` fragment
+declaring the topics, parameters, and services it adds. The host node's
+effective interface is the merge of its own declaration plus all mixins.
+
+```yaml
+# nodes/follow_path/interface.yaml
+node:
+  name: follow_path
+  package: nav2_controller
+
+parameters:
+  controller_plugin:
+    type: string
+    default_value: "dwb_core::DWBLocalPlanner"
+
+mixins:
+  - ref: dwb_core/dwb_local_planner   # brings in max_vel_x, min_vel_y, etc.
+  - ref: nav2_costmap_2d/costmap       # brings in costmap params + topics
+```
+
+This pattern (borrowed from Smithy's mixin concept) gives `rosgraph
+lint` and `rosgraph monitor` the complete interface picture without
+requiring the host node to redeclare everything its plugins add.
+Mixins depend on the `$ref` / fragment system (G15).
+
+**Layer 3 — Observation Schema** (runtime-observed state)
+
+```yaml
+# observed.yaml (auto-generated from running system)
+node:
+  name: lidar_processor
+  package: perception_pkg
+  pid: 12345
+  state: active                 # lifecycle state if managed
+
+publishers:
+  - topic: /robot1/lidar_processor/filtered_points
+    type: sensor_msgs/msg/PointCloud2
+    qos:
+      reliability: RELIABLE
+      durability: TRANSIENT_LOCAL
+      depth: 5
+    stats:
+      message_count: 14523
+      frequency_hz: 9.98
+      subscribers_matched: 2
+
+# ... subscribers, services, actions, parameters with actual values
+```
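+
+The `~/filtered_points` publisher declared in Layer 1 shows up in
+Layer 3 as `/robot1/lidar_processor/filtered_points`. A minimal sketch
+of that expansion, assuming standard ROS 2 name resolution and ignoring
+remappings (`resolve_topic` is a hypothetical helper, not part of any
+rosgraph API):
+
+```python
+def resolve_topic(name: str, namespace: str, node_name: str) -> str:
+    """Expand a ROS 2 topic name to its fully qualified form."""
+    stripped = namespace.strip("/")
+    ns = "/" + stripped if stripped else ""   # root namespace -> ""
+    if name.startswith("/"):                  # absolute: unchanged
+        return name
+    if name == "~" or name.startswith("~/"):  # private: under the node
+        return f"{ns}/{node_name}{name[1:]}"
+    return f"{ns}/{name}"                     # relative: under namespace
+
+
+print(resolve_topic("~/filtered_points", "/robot1", "lidar_processor"))
+```
+
+A real loader would also have to apply the `remappings:` declared in
+`system.yaml`, which this sketch leaves out.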
+
+### 3.3 The InterfaceDescriptor (IR)
+
+The parsed, validated, fully-resolved representation of a node's
+interface. Serializable as JSON for plugin communication:
+
+```json
+{
+  "schema_version": "1.0",
+  "node": {
+    "name": "lidar_processor",
+    "package": "perception_pkg",
+    "lifecycle": "managed"
+  },
+  "parameters": [
+    {
+      "name": "voxel_size",
+      "type": "double",
+      "default_value": 0.05,
+      "description": "Voxel grid filter leaf size (meters)",
+      "validation": { "bounds": [0.01, 1.0] },
+      "read_only": false
+    }
+  ],
+  "publishers": [
+    {
+      "topic": "~/filtered_points",
+      "resolved_topic": "/robot1/lidar_processor/filtered_points",
+      "message_type": "sensor_msgs/msg/PointCloud2",
+      "qos": { "history": 5, "reliability": "RELIABLE", "durability": "TRANSIENT_LOCAL" },
+      "description": "Filtered and downsampled point cloud"
+    }
+  ]
+}
+```
+
+Plugins receive this IR via stdin (or file path) and produce generated
+files.
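+
+As an illustration of that contract, a plugin could look roughly like
+the sketch below. It assumes the IR arrives as a single JSON document
+with the field names shown above; `render_summary` and `main` are
+hypothetical names, and the plain-text output format is invented for
+the example.
+
+```python
+import io
+import json
+import sys
+
+
+def render_summary(ir: dict) -> str:
+    """Render a plain-text interface summary from an IR dict."""
+    node = ir["node"]
+    lines = [f"{node['package']}/{node['name']}"]
+    for pub in ir.get("publishers", []):
+        lines.append(f"  pub   {pub['topic']} [{pub['message_type']}]")
+    for p in ir.get("parameters", []):
+        lines.append(
+            f"  param {p['name']}: {p['type']} = {p['default_value']}")
+    return "\n".join(lines)
+
+
+def main(stream=sys.stdin) -> None:
+    # Assumption: the host tool pipes one JSON document to the plugin.
+    print(render_summary(json.load(stream)))
+
+
+# Demo: an in-memory stream stands in for stdin.
+demo_ir = {
+    "node": {"name": "lidar_processor", "package": "perception_pkg"},
+    "parameters": [
+        {"name": "voxel_size", "type": "double", "default_value": 0.05}],
+    "publishers": [
+        {"topic": "~/filtered_points",
+         "message_type": "sensor_msgs/msg/PointCloud2"}],
+}
+main(io.StringIO(json.dumps(demo_ir)))
+```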
+
+### 3.4 `rosgraph generate` — Code Generation
+
+Translates `interface.yaml` into working node implementations.
+
+```
+┌─────────────────────────────────────────┐
+│         interface.yaml (per node)        │
+└────────────────┬────────────────────────┘
+                 │
+┌────────────────▼────────────────────────┐
+│          Parser / Validator              │
+│  1. YAML parse                          │
+│  2. JSON Schema validation (structural) │
+│  3. Semantic validation (type refs, QoS)│
+│  4. Produce InterfaceDescriptor (IR)    │
+└────────────────┬────────────────────────┘
+                 │
+      ┌──────────┼──────────────────┐
+      │          │                  │
+┌─────▼─────┐ ┌─▼──────────┐ ┌────▼──────┐
+│ C++ Plugin│ │Python Plugin│ │Docs Plugin│
+│           │ │             │ │           │
+│ - header  │ │ - module    │ │ - API ref │
+│ - reg.cpp │ │ - params    │ │ - graph   │
+│ - params  │ │ - __init__  │ │   fragment│
+└───────────┘ └─────────────┘ └───────────┘
+```
+
+**Build integration:**
+
+```cmake
+cmake_minimum_required(VERSION 3.22)
+project(perception_pkg)
+find_package(rosgraph REQUIRED)
+rosgraph_auto_package()
+```
+
+Under the hood, `rosgraph_auto_package()`:
+1. Scans `nodes/` for subdirectories with `interface.yaml`
+2. Validates each `interface.yaml` (structural + semantic)
+3. Invokes C++ plugin → header, registration, params YAML
+4. Delegates to `generate_parameter_library()` for parameters
+5. Compiles and links
+6. Installs interface YAMLs to `share/<package>/interfaces/`
+
+**Design decisions:**
+- **Composition over inheritance.** Generated code holds a
+  `rclcpp::Node` (has-a) rather than inheriting from it. The context
+  struct is a flat aggregation of generated components plus user state.
+- **`generate_parameter_library` as backend.** Uses the existing,
+  widely-adopted parameter library rather than reimplementing.
+- **Convention-over-configuration.** Directory layout (`nodes/`,
+  `interfaces/`, `launch/`, `config/`) determines behavior.
+
+### 3.5 `rosgraph lint` — Static Analysis
+
+Pre-launch verification of the ROS graph.
+
+```
+┌────────────────────────────────────┐
+│           Loaders                  │
+│  ┌───────────┐ ┌───────────────┐  │
+│  │interface.  │ │ launch files  │  │
+│  │yaml parser │ │ (clingwrap/   │  │
+│  │            │ │  native)      │  │
+│  └─────┬─────┘ └──────┬────────┘  │
+│        └───────┬───────┘           │
+│                ▼                   │
+│  ┌──────────────────────────┐     │
+│  │     Graph Model          │     │
+│  └────────────┬─────────────┘     │
+│               ▼                   │
+│  ┌──────────────────────────┐     │
+│  │   Analyzer DAG           │     │
+│  │   (parallel execution)   │     │
+│  │                          │     │
+│  │  [topic_resolver]        │     │
+│  │       ↓                  │     │
+│  │  [type_mismatch_checker] │     │
+│  │  [qos_compat_checker]    │     │
+│  │  [naming_convention]     │     │
+│  │  [disconnected_subgraph] │     │
+│  │  [unused_node]           │     │
+│  │  [launch_linter]         │     │
+│  │  ...                     │     │
+│  └────────────┬─────────────┘     │
+│               ▼                   │
+│  ┌──────────────────────────┐     │
+│  │  Post-Processing         │     │
+│  │  - suppression filter    │     │
+│  │  - severity assignment   │     │
+│  │  - deduplication         │     │
+│  │  - differential (new     │     │
+│  │    issues only for CI)   │     │
+│  └────────────┬─────────────┘     │
+│               ▼                   │
+│  ┌──────────────────────────┐     │
+│  │  Output Formatters       │     │
+│  │  text, JSON, SARIF,      │     │
+│  │  GitHub, JUnit           │     │
+│  └──────────────────────────┘     │
+└────────────────────────────────────┘
+```
+
+**Analyzer definition pattern (modelled on Go's `go/analysis` framework):**
+
+```python
+# Each analyzer is a value, not a subclass
+topic_resolver = GraphAnalyzer(
+    name="topic_resolver",
+    doc="Resolves topic names to their message types across the graph",
+    requires=[],
+    result_type=TopicTypeMap,
+    run=resolve_topics,
+)
+
+type_mismatch = GraphAnalyzer(
+    name="type_mismatch",
+    doc="Checks that all pub/sub on a topic agree on message type",
+    requires=[topic_resolver],
+    result_type=None,
+    run=check_type_mismatches,
+)
+```
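+
+To make the pattern concrete, here is a small self-contained sketch of
+how such analyzer values could be scheduled. `GraphAnalyzer` is a
+stand-in dataclass, the graph is a plain dict, and the runner is
+sequential; the parallel execution shown in the diagram is out of
+scope. None of this is the proposal's actual API.
+
+```python
+from dataclasses import dataclass
+from typing import Any, Callable
+
+
+@dataclass(frozen=True)
+class GraphAnalyzer:
+    name: str
+    doc: str
+    run: Callable[[dict, dict], Any]  # (graph, results of requirements)
+    requires: tuple = ()
+
+
+def run_analyzers(analyzers, graph):
+    """Run each analyzer once, after everything it requires."""
+    results = {}
+
+    def visit(analyzer, stack=()):
+        if analyzer.name in results:
+            return
+        if analyzer.name in stack:
+            raise ValueError(f"dependency cycle at {analyzer.name}")
+        for dep in analyzer.requires:
+            visit(dep, stack + (analyzer.name,))
+        deps = {d.name: results[d.name] for d in analyzer.requires}
+        results[analyzer.name] = analyzer.run(graph, deps)
+
+    for analyzer in analyzers:
+        visit(analyzer)
+    return results
+
+
+def resolve_topics(graph, deps):
+    """Map each topic to the set of message types seen on it."""
+    types = {}
+    for node in graph["nodes"]:
+        endpoints = node.get("publishers", []) + node.get("subscribers", [])
+        for ep in endpoints:
+            types.setdefault(ep["topic"], set()).add(ep["type"])
+    return types
+
+
+def check_type_mismatches(graph, deps):
+    """Return topics whose endpoints disagree on message type."""
+    resolved = deps["topic_resolver"]
+    return sorted(t for t, seen in resolved.items() if len(seen) > 1)
+
+
+topic_resolver = GraphAnalyzer(
+    "topic_resolver", "Resolve topic types", resolve_topics)
+type_mismatch = GraphAnalyzer(
+    "type_mismatch", "Flag conflicting types",
+    check_type_mismatches, requires=(topic_resolver,))
+
+graph = {"nodes": [
+    {"publishers": [
+        {"topic": "/cmd_vel", "type": "geometry_msgs/msg/Twist"}]},
+    {"subscribers": [
+        {"topic": "/cmd_vel", "type": "std_msgs/msg/String"}]},
+]}
+print(run_analyzers([type_mismatch], graph)["type_mismatch"])
+```
+
+Requesting only `type_mismatch` still runs `topic_resolver` first,
+because the runner walks `requires` before invoking each analyzer.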
+
+See [§7 Lint Rule Codes](#7-lint-rule-codes) for the full rule system.
+
+**Launch file loading strategy:**
+
+Three loader paths, not mutually exclusive, phased by tractability:
+
+| Loader | Launch format | Extraction method | Phase | Limitations |
+|---|---|---|---|---|
+| YAML launch | YAML launch files | Direct parse | 1 | Limited expressiveness |
+| `system.yaml` | Layer 2 schema | Direct parse | 1 | Requires manual authoring |
+| Python launch AST | Standard `launch_ros` | AST pattern matching | 2 | Cannot handle dynamic logic (conditionals, loops) |
+
+- **YAML launch files** are statically parseable — `rosgraph lint` can
+  extract node declarations, remappings, and parameter overrides
+  directly.
+- **Python launch files** are imperative and Turing-complete, but most
+  are declarative-in-spirit. AST-level pattern matching for common
+  patterns (`Node()`, `LaunchConfiguration()`,
+  `DeclareLaunchArgument()`) captures an estimated ~80% of real launch
+  files without execution.
+- **Layer 2 `system.yaml`** (§3.2) sidesteps the problem entirely —
+  a static YAML file declaring the intended system composition. Launch
+  files still run the system, but `system.yaml` is the lint/monitor
+  source of truth for graph analysis.
+
+The lint diagram's "launch files" loader encompasses all three paths.
+
+### 3.6 `rosgraph monitor` — Runtime Reconciliation
+
+Kubernetes-style reconciliation loop comparing declared vs. observed
+graph state.
+
+```
+┌─────────────────────────────────────────────────┐
+│              rosgraph monitor                    │
+│                                                  │
+│  ┌───────────────┐     ┌──────────────────────┐ │
+│  │ Declared State │     │   Observed State     │ │
+│  │ (from YAML /  │     │   (from DDS          │ │
+│  │  interface    │     │    discovery)         │ │
+│  │  files)       │     │                      │ │
+│  └───────┬───────┘     └──────────┬───────────┘ │
+│          │                        │              │
+│          └──────────┬─────────────┘              │
+│                     ▼                            │
+│  ┌──────────────────────────────────┐            │
+│  │     Reconciliation Engine        │            │
+│  │                                  │            │
+│  │  Level-triggered (not edge)      │            │
+│  │  Idempotent                      │            │
+│  │  Requeue with backoff            │            │
+│  └──────────────┬───────────────────┘            │
+│                 ▼                                │
+│  ┌──────────────────────────────────┐            │
+│  │     Diff Computation             │            │
+│  │                                  │            │
+│  │  - Missing/extra nodes           │            │
+│  │  - Missing/extra topics          │            │
+│  │  - QoS mismatches                │            │
+│  │  - Type mismatches               │            │
+│  │  - Parameter drift               │            │
+│  └──────────────┬───────────────────┘            │
+│                 ▼                                │
+│  ┌──────────────────────────────────┐            │
+│  │     Exporters                    │            │
+│  │                                  │            │
+│  │  - ROS topics (graph_diff msg)   │            │
+│  │  - Prometheus /metrics endpoint  │            │
+│  │  - Structured log output         │            │
+│  │  - Alerting (via diagnostics)    │            │
+│  └──────────────────────────────────┘            │
+└─────────────────────────────────────────────────┘
+```
+
+**Reconciliation loop:**
+
+```python
+import time
+
+# Sketch: the helper functions and inputs are placeholders, not a real API.
+while running:
+    declared = load_declared_graph(interface_files, launch_files)
+    observed = scrape_live_graph(dds_discovery)
+
+    diff = compute_graph_diff(declared, observed)
+
+    if diff.has_issues():
+        publish_diff(diff)           # ROS topic: /rosgraph/diff
+        update_metrics(diff)         # Prometheus: rosgraph_missing_nodes, etc.
+        emit_diagnostics(diff)       # /diagnostics for standard tooling
+
+    publish_status(observed)         # ROS topic: /rosgraph/status
+
+    # Adaptive interval: faster when drifting, slower when stable
+    if diff.has_critical():
+        time.sleep(1)                # seconds
+    else:
+        time.sleep(5)                # seconds
+
+See [§8 Monitor Alert Rules](#8-monitor-alert-rules) for the alert
+system.
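+
+The diff step in the loop above can be sketched as a pure function of
+two snapshots. The data shapes here (node name mapped to a topic/type
+dict) and the `GraphDiff` fields are assumptions chosen for brevity, as
+is the choice of which findings count as critical.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class GraphDiff:
+    missing_nodes: set = field(default_factory=set)
+    extra_nodes: set = field(default_factory=set)
+    type_mismatches: dict = field(default_factory=dict)
+
+    def has_issues(self) -> bool:
+        return bool(self.missing_nodes or self.extra_nodes
+                    or self.type_mismatches)
+
+    def has_critical(self) -> bool:
+        # Assumption: missing nodes and type mismatches are critical.
+        return bool(self.missing_nodes or self.type_mismatches)
+
+
+def compute_graph_diff(declared: dict, observed: dict) -> GraphDiff:
+    """Each argument maps node name -> {topic: message type}."""
+    diff = GraphDiff(
+        missing_nodes=set(declared) - set(observed),
+        extra_nodes=set(observed) - set(declared),
+    )
+    for node in set(declared) & set(observed):
+        for topic, msg_type in declared[node].items():
+            seen = observed[node].get(topic)
+            if seen is not None and seen != msg_type:
+                diff.type_mismatches[(node, topic)] = (msg_type, seen)
+    return diff
+
+
+declared = {"/lidar": {"/points": "sensor_msgs/msg/PointCloud2"},
+            "/detector": {}}
+observed = {"/lidar": {"/points": "sensor_msgs/msg/PointCloud2"},
+            "/rogue": {}}
+diff = compute_graph_diff(declared, observed)
+print(sorted(diff.missing_nodes), sorted(diff.extra_nodes))
+```
+
+Because the result depends only on the current snapshots, rerunning it
+yields the same diff, which is the level-triggered, idempotent
+behaviour the reconciliation engine calls for.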
+
+**Relationship to graph-monitor:** `rosgraph monitor` is a new
+implementation, not an extension of the existing graph-monitor package.
+graph-monitor's value is its `rmw_stats_shim` and
+`rosgraph_monitor_msgs` message definitions — these are reusable
+regardless of architecture. However, graph-monitor lacks the
+reconciliation engine (declared vs. observed diff) that is the core of
+`rosgraph monitor`, and retrofitting it would constrain the design.
+
+The integration path: adopt or align with graph-monitor's message
+definitions (`rosgraph_monitor_msgs`), reimplement the graph scraping
+and reconciliation, and offer to upstream the reconciliation capability
+back to graph-monitor if its maintainers are interested.
+
+### 3.7 `rosgraph test` — Contract Testing
+
+Schema-driven verification of running nodes against their declarations.
+
+Three testing modes (modelled on Dredd, Schemathesis, and Pact):
+
+**Interface conformance** (Dredd model): Run a node, then
+systematically verify its actual interface matches its
+`interface.yaml`. Check every declared publisher is active, call every
+declared service, verify every parameter exists with the declared type
+and default.
+
+**Fuzz testing** (Schemathesis model): Auto-generate messages matching
+declared subscriber types, publish them, verify the node produces
+outputs on declared publisher topics with correct types.
+
+**Cross-node contract testing** (Pact model): Node A's
+`interface.yaml` declares it subscribes to `/cmd_vel` (Twist). Node
+B's `interface.yaml` declares it publishes `/cmd_vel` (Twist). The
+contract test verifies they agree on type and QoS compatibility.
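+
+A sketch of the cross-node check, assuming each declaration has been
+reduced to a topic, a message type, and two QoS fields. `Endpoint` and
+`contract_violations` are hypothetical names. The two QoS rules encoded
+here are the standard ROS 2 request/offer rules for reliability and
+durability; a real check would cover further policies such as deadline
+and liveliness.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class Endpoint:
+    """Hypothetical, simplified view of one publisher or subscriber entry."""
+    topic: str
+    msg_type: str
+    reliability: str = "reliable"    # "reliable" | "best_effort"
+    durability: str = "volatile"     # "volatile" | "transient_local"
+
+
+def contract_violations(pub: Endpoint, sub: Endpoint) -> list[str]:
+    """Return human-readable problems; an empty list means the pair agrees."""
+    problems = []
+    if pub.msg_type != sub.msg_type:
+        problems.append(f"type mismatch: {pub.msg_type} vs {sub.msg_type}")
+    # A subscriber may not request a stronger guarantee than the publisher offers.
+    if pub.reliability == "best_effort" and sub.reliability == "reliable":
+        problems.append("reliability: best_effort publisher, reliable subscriber")
+    if pub.durability == "volatile" and sub.durability == "transient_local":
+        problems.append("durability: volatile publisher, transient_local subscriber")
+    return problems
+```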
+
+### 3.8 `rosgraph docs` — Documentation Generation
+
+Auto-generated "Swagger UI for ROS nodes" — browsable API reference
+docs from `interface.yaml`. Covers topics, services, actions,
+parameters, QoS settings, and message type definitions.
+
+Output formats: Markdown (for GitHub Pages / docs.ros.org), HTML
+(standalone), JSON (for embedding in other tools).
+
+### 3.9 `rosgraph breaking` — Breaking Change Detection
+
+Compares two versions of `interface.yaml` and classifies changes:
+
+| Classification | Examples |
+|---|---|
+| **Breaking** | Removed topic, changed message type, removed parameter, incompatible QoS change |
+| **Dangerous** | Changed QoS (may affect connectivity), narrowed parameter range |
+| **Safe** | Added optional parameter, added new publisher, widened parameter range |
+
+Modelled on `buf breaking` and `graphql-inspector`.
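+
+The classification can be sketched for the simplest case, topics only.
+This assumes each interface version has been reduced to a map from
+topic name to message type; `classify_topic_changes` is a hypothetical
+name, and the "dangerous" class is left out because it needs QoS and
+parameter-range comparison.
+
+```python
+def classify_topic_changes(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
+    """Compare topic -> message type maps from two interface versions."""
+    report: dict[str, list[str]] = {"breaking": [], "safe": []}
+    for topic, msg_type in old.items():
+        if topic not in new:
+            report["breaking"].append(f"removed topic {topic}")
+        elif new[topic] != msg_type:
+            report["breaking"].append(f"changed type of {topic}")
+    # Sorted so the report is stable across runs.
+    for topic in sorted(new.keys() - old.keys()):
+        report["safe"].append(f"added topic {topic}")
+    return report
+```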
+
+### 3.10 `rosgraph discover` — Runtime-to-Spec Generation
+
+Introspects a running node via DDS discovery and generates an
+`interface.yaml` from the observed interface. The "slice of cake"
+brownfield adoption path.
+
+```bash
+# Generate interface.yaml from a running node
+rosgraph discover /lidar_processor -o nodes/lidar_processor/interface.yaml
+```
+
+Modelled on Terraform's `import` command.
+
+### 3.11 Configuration
+
+**`rosgraph.toml`** — single configuration file for all subcommands:
+
+```toml
+[lint]
+select = ["TOP", "SRV", "QOS", "GRF"]  # enable these rule families
+ignore = ["TOP003"]                      # except this specific rule
+
+[lint.per-package-ignores]
+"generated_*" = ["ALL"]                  # skip generated packages
+"*_test" = ["GRF002"]                    # allow unused nodes in tests
+
+[generate]
+plugins = ["cpp", "python"]
+out_dir = "generated"
+
+[output]
+format = "text"                          # text | json | sarif | github
+
+[ci]
+new-only = true                          # only new issues (differential)
+base-branch = "main"
+```
+
+### 3.12 Multi-Workspace Analysis
+
+ROS 2 workspaces overlay each other (e.g., `ros_base` underlay + your
+packages + a vendor overlay). When `rosgraph lint` analyzes your
+workspace, it needs interface information from packages in the underlay.
+
+The solution follows the Go analysis framework's per-package fact
+caching pattern: installed `interface.yaml` files in
+`share/<package>/interfaces/` (placed there by `rosgraph_auto_package()`
+at install time) serve as cached analysis artifacts. `rosgraph lint`
+reads these from the underlay without re-analyzing underlay packages,
+only analyzing packages in the current workspace.
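+
+A sketch of the underlay lookup, assuming install prefixes are listed
+in the `AMENT_PREFIX_PATH` environment variable and that installed
+interface files use a `.yaml` extension. `find_installed_interfaces`
+is a hypothetical helper, not an existing API.
+
+```python
+import os
+from pathlib import Path
+
+
+def find_installed_interfaces(prefix_path: str) -> dict[str, list[Path]]:
+    """Map package name -> installed interface files under each install prefix."""
+    found: dict[str, list[Path]] = {}
+    for prefix in filter(None, prefix_path.split(os.pathsep)):
+        for path in sorted(Path(prefix).glob("share/*/interfaces/*.yaml")):
+            package = path.parts[-3]   # share/<package>/interfaces/<file>
+            found.setdefault(package, []).append(path)
+    return found
+```
+
+A caller would pass `os.environ.get("AMENT_PREFIX_PATH", "")` and skip
+any package that also exists in the current workspace.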
+
+This is a Phase 2 concern. Phase 1 assumes a single workspace.
+
+### 3.13 AI & Tooling Integration
+
+`interface.yaml` and the `InterfaceDescriptor` IR (§3.3) are
+machine-readable contracts describing a node's complete API. This
+makes them natural integration points for AI-assisted development
+tools and IDE infrastructure.
+
+**AI as IR consumer.** The JSON-serialized `InterfaceDescriptor`
+contains everything an LLM needs to understand a node's interface:
+topics, types, QoS, parameters, lifecycle state. An AI agent can read
+this to generate implementation code, write tests, suggest fixes, or
+answer questions about the system — without parsing source code.
+
+**MCP server.** A Model Context Protocol server exposing graph state,
+lint results, and interface schemas enables AI coding tools (Claude
+Code, Cursor, Copilot) to query the ROS graph as structured context.
+"What topics does the perception pipeline publish?" answered from
+the graph model, not from grep.
+
+**AI-assisted discovery.** `rosgraph discover` (§3.10) generates
+`interface.yaml` from a running system. The raw output from DDS
+discovery is complete but lacks descriptions, rationale, and grouping.
+An LLM can refine the generated spec — inferring descriptions from
+topic names and message types, suggesting QoS profiles based on
+message patterns, and grouping related interfaces.
+
+**Language Server Protocol (LSP).** An LSP server for `interface.yaml`
+enables IDE features beyond JSON Schema validation: hover for message
+type definitions, go-to-definition for `$ref` targets, inline
+diagnostics from `rosgraph lint`, and cross-file rename support. This
+benefits both human developers and AI agents operating within IDE
+contexts.
+
+**Natural language to spec.** The constrained schema makes
+`interface.yaml` a tractable generation target for LLMs. "I need a
+node that subscribes to a lidar point cloud, filters it, and publishes
+the result" produces a valid `interface.yaml` that `rosgraph generate`
+can immediately scaffold into working code.
+
+These are not Phase 1 deliverables, but the architecture should not
+preclude them. The IR-based plugin protocol (§3.3) and structured
+output formats (JSON, SARIF) are the key enablers — they exist for
+code generation and CI, but AI consumers are a natural extension.
+
+### 3.14 Scale & Fleet Considerations
+
+§3.12 covers multi-workspace analysis. This section addresses
+concerns beyond a single developer's workstation.
+
+**Interface ownership.** In multi-team organizations, `interface.yaml`
+files are shared contracts. The owner is typically the node author
+(they define the interface), but downstream consumers depend on it.
+Changes require coordination. rosgraph supports this via:
+- `rosgraph breaking` (§3.9) — automated detection of breaking
+  changes in CI, blocking merges that break downstream consumers.
+- Installed interfaces in `share/<package>/interfaces/` — downstream
+  teams depend on published interfaces without pulling source code.
+- Semantic versioning alignment — the breaking/dangerous/safe
+  classification maps to semver: breaking = major, dangerous = minor
+  (review required), safe = patch.
+
+**Multi-robot systems.** The `system.yaml` (Layer 2, §3.2) supports
+namespaced node instances (`namespace: /robot1`). For multi-robot
+systems, each robot's graph is a namespaced instance of the same
+`system.yaml`. Fleet-level analysis — "which robots are running
+interface version X?" — is out of scope for Phase 1–2 but the
+architecture supports it: `rosgraph monitor` on each robot publishes
+graph snapshots that a fleet-level aggregator can collect.
+
+**Fleet monitoring.** `rosgraph monitor` (§3.6) runs per-robot. For
+fleet-scale observability, the monitor's Prometheus exporter (M7)
+enables standard fleet dashboards via Grafana. The `/rosgraph/diff`
+topic on each robot can be bridged to a central system for aggregated
+drift analysis. The architecture deliberately uses standard
+observability patterns (Prometheus metrics, structured logs,
+diagnostics topics) rather than inventing fleet-specific
+infrastructure.
+
+**Performance targets.** Build-time targets are stated in DP7 (100
+packages in 5 seconds). Runtime targets for `rosgraph monitor`:
+- Reconciliation cycle: < 500ms for a 200-node system
+- Memory overhead: < 50MB resident for graph state
+- CPU: < 5% of one core at steady-state (5s scrape interval)
+
+These are design targets, not commitments — they guide architectural
+decisions (e.g., choosing Rust for the diff engine).
+
+### 3.15 colcon Integration
+
+`colcon` uses a `VerbExtensionPoint` plugin system — any Python package
+can register new verbs via `setup.cfg` entry points. Existing examples:
+`colcon-clean` adds `colcon clean`, `colcon-cache` adds `colcon cache`.
+
+The architecture is **`rosgraph` as standalone tool, `colcon-rosgraph`
+as thin workspace wrapper**:
+
+```
+colcon-rosgraph (Python, verb plugin)
+  └── delegates to → rosgraph (standalone binary)
+```
+
+This mirrors how `colcon-cmake` shells out to `cmake` — the colcon verb
+handles workspace iteration, package ordering, and parallel execution;
+the core tool handles single-package analysis.
+
+**What maps naturally to colcon verbs:**
+
+| Command | colcon verb | Notes |
+|---|---|---|
+| `rosgraph generate` | — | Already runs via `rosgraph_auto_package()` in `colcon build` |
+| `rosgraph test` | — | Already runs via CTest in `colcon test` |
+| `rosgraph lint` | `colcon lint` | Iterates packages in dependency order, parallel per-package lint |
+| `rosgraph docs` | `colcon docs` | Generates docs per package, aggregates into workspace docs |
+| `rosgraph discover` | `colcon discover` | Generates `interface.yaml` for all running nodes |
+| `rosgraph breaking` | `colcon breaking` | Checks all packages against their previous interface versions |
+
+**What doesn't fit:**
+
+`rosgraph monitor` is a long-running daemon, not a build-and-exit verb.
+It stays as a standalone command (or a `ros2 launch` node).
+
+**Why both CLIs:**
+
+- `colcon lint` for the workspace workflow — lint all packages, respect
+  dependency order, parallel execution, workspace-level reporting.
+- `rosgraph lint path/to/interface.yaml` for single-file use, CI
+  pipelines, and environments without colcon.
+
+**Language independence.** The colcon plugin is always Python (colcon
+requires it), but it delegates to `rosgraph` via subprocess — so the
+core tool's language is unconstrained. Rust, Python, or hybrid all work
+identically. The colcon integration does not factor into the language
+choice (§5).
+
+The colcon plugin is a Phase 2 deliverable — Phase 1 focuses on the
+standalone `rosgraph` tool. The plugin is trivial once the core tool
+exists.
+
+---
+
+## 4. Phasing
+
+### Phase 1 — Foundation
+
+Deliver the core schema, basic code generation, and highest-value
+static + runtime checks.
+
+**Schema & generate:**
+- G1–G10 (existing cake features — stabilize and adopt)
+- G11 (lifecycle nodes — blocks nav2/ros2_control adoption)
+- G14 (schema versioning — needed before v1.0)
+
+**Lint (P0 rules):**
+- L1 (topic type mismatch), L2 (QoS compatibility), L3 (disconnected
+  subgraph)
+- L5 (SARIF output), L6 (differential analysis)
+
+**Monitor (P0 features):**
+- M1 (declared-vs-observed diff), M2 (missing node alerting),
+  M5 (graph snapshots)
+
+### Phase 2 — Adoption Enablers
+
+Lower barriers for existing codebases. Fill out the rule set.
+
+**Schema & generate:**
+- G12 (timers), G13 (nested parameters), G15 (mixins)
+- O1 (`rosgraph docs`), O2 (`rosgraph discover`)
+
+**Lint (P1 rules + infrastructure, plus L4, which §6 lists as P0):**
+- L4 (launch validation), L7 (naming), L8 (unused node),
+  L9 (parameter validation), L10 (circular deps)
+- L11 (inline suppression), L12 (per-package config),
+  L13 (`--add-noqa`), L14 (semantic validation)
+
+**Monitor (P1 features, plus M3 and M4, which §6 lists as P0):**
+- M3 (QoS drift), M4 (runtime type mismatch), M6 (topic stats),
+  M8 (unexpected node), M9 (health diagnostics)
+
+### Phase 3 — Scale the Toolchain
+
+Enable community extension and advanced analysis.
+
+**Schema & generate:**
+- G16 (plugin architecture), G17 (callback groups),
+  G19 (system composition schema)
+- O3 (`rosgraph breaking`), O4 (`rosgraph test`)
+
+**Lint:**
+- L15 (interface coverage)
+
+**Monitor:**
+- M7 (Prometheus endpoint), M10 (adaptive scrape),
+  M11 (lifecycle state)
+
+### Phase 4 — Ecosystem Integration
+
+Future-proofing and niche use cases.
+
+- G18 (middleware bindings)
+- O5 (`rosgraph policy` — SROS 2 security policies)
+- M12 (runtime interface coverage)
+
+### Adoption Path
+
+rosgraph is unlikely to reach `ros_core` initially — that requires
+broad consensus and a high stability bar. A more realistic progression:
+
+1. **`ros-tooling` organization** (where graph-monitor already lives) —
+   institutional backing, CI infrastructure, release process.
+2. **REP (ROS Enhancement Proposal)** for the `interface.yaml` schema —
+   formalizes the declaration format as a community standard.
+3. **docs.ros.org tutorial integration** — if the "write your first
+   node" tutorial uses `interface.yaml`, every new ROS developer learns
+   it from day one. This is the highest-leverage adoption path.
+4. **`ros_core` proposal** — after demonstrated adoption across multiple
+   distros, propose for inclusion in a future distribution.
+
+---
+
+## 5. Language Choice
+
+The implementation language is an open decision for the WG. The
+trade-offs are structural, not preferential.
+
+### Option A: Rust
+
+Follows Ruff's model. Speed as an architectural property.
+
+| Axis | Assessment |
+|---|---|
+| Performance | Best. Single-pass analysis, zero-cost abstractions, no GC pauses. Best placed to meet the "100 packages in 5s" target (DP7). |
+| Contribution barrier | Highest. Most ROS contributors know C++/Python, not Rust. |
+| Ecosystem fit | Moderate. `rclrs` exists but is not tier-1. CLI tools don't need ROS client library integration. |
+| Deployment | Single static binary. No runtime dependencies. |
+| Plugin story | WASM plugins (Extism) or process-based (protoc model). |
+
+### Option B: Python
+
+Follows the ROS 2 ecosystem convention.
+
+| Axis | Assessment |
+|---|---|
+| Performance | Weakest. 10-100x slower than Rust for analysis workloads. May not meet performance targets. |
+| Contribution barrier | Lowest. Every ROS developer knows Python. |
+| Ecosystem fit | Best. cake is Python. `launch_ros` is Python. Direct reuse of existing parsing libraries. |
+| Deployment | Requires Python runtime. `pip install` or ROS package. |
+| Plugin story | Native Python plugins. Trivial to write. |
+
+### Option C: Rust core + Python bindings
+
+Hybrid via PyO3. Performance-critical core (parsing, graph model, diff
+engine, lint rules) in Rust; Python CLI and plugin layer on top.
+
+| Axis | Assessment |
+|---|---|
+| Performance | Near-Rust for analysis; Python overhead for CLI/plugin dispatch only. |
+| Contribution barrier | Moderate. Core contributors need Rust; plugin authors use Python. |
+| Ecosystem fit | Good. Python-facing API integrates with ROS ecosystem. |
+| Deployment | Python package with native extension. Requires build toolchain for distribution. |
+| Plugin story | Python plugins (native) + WASM plugins (for sandboxing). |
+
+### Decision factors
+
+The choice depends on which constraint the WG prioritizes:
+- If **speed** is the binding constraint → Rust or hybrid
+- If **community contribution** is the binding constraint → Python
+- If **both matter** → hybrid, accepting the build complexity
+
+Note: the colcon integration (§3.15) does not constrain this choice.
+The `colcon-rosgraph` plugin is always Python but delegates to the
+`rosgraph` binary via subprocess, so the core tool can be any language.
+
+---
+
+## 6. Feature List
+
+### Schema & Code Generation (`rosgraph generate`)
+
+| # | Feature | Priority | Description |
+|---|---------|----------|-------------|
+| G1 | YAML interface declaration | P0 | Single `interface.yaml` per node declaring all ROS 2 entities |
+| G2 | JSON Schema validation | P0 | Structural validation with IDE autocompletion via YAML Language Server |
+| G3 | C++ code generation | P0 | Typed context, pub/sub/srv/action wrappers, component registration |
+| G4 | Python code generation | P0 | Dataclass context, pub/sub/srv/action wrappers |
+| G5 | Parameter generation | P0 | Delegates to `generate_parameter_library` (backward-compatible) |
+| G6 | QoS declaration | P0 | Required for pub/sub, supports all DDS QoS policies |
+| G7 | Parameterized QoS | P0 | `${param:name}` references in QoS fields |
+| G8 | Dynamic topic names | P0 | `${param:name}` and `${for_each_param:name}` |
+| G9 | Composition pattern | P0 | Has-a `Node`, not is-a `Node` |
+| G10 | Zero-boilerplate build | P0 | `rosgraph_auto_package()` CMake macro |
+| G11 | Lifecycle node support | P0 | `lifecycle: managed` in node spec |
+| G12 | Timer declarations | P1 | `timers:` section with period, callback name |
+| G13 | Nested parameters | P1 | Hierarchical parameter structures (parity with gen_param_lib) |
+| G14 | Schema versioning | P1 | `schema_version` field with migration tooling |
+| G15 | Mixins / shared fragments | P1 | `$ref` to common interface fragments |
+| G16 | Plugin architecture | P2 | IR-based pipeline, standalone plugins per language |
+| G17 | Callback group declarations | P2 | `callback_groups:` with entity assignment |
+| G18 | Middleware bindings | P3 | Protocol-specific config (DDS, Zenoh) |
+| G19 | System composition schema | P2 | Multi-node graph declaration (`system.yaml`, Layer 2) |
+
+### Static Analysis (`rosgraph lint`)
+
+| # | Feature | Priority | Description |
+|---|---------|----------|-------------|
+| L1 | Topic type mismatch detection | P0 | Flag when pub and sub on same topic disagree on message type |
+| L2 | QoS compatibility checking | P0 | Flag incompatible QoS profiles (reliability, durability, deadline) |
+| L3 | Disconnected subgraph detection | P0 | Flag nodes/topics with no connections |
+| L4 | Launch file validation | P0 | Detect undefined node refs, invalid remaps, unresolved substitutions |
+| L5 | SARIF / CI output | P0 | Structured output for GitHub Security tab, PR annotations |
+| L6 | Differential analysis | P0 | `--new-only` reports only issues introduced since base branch |
+| L7 | Naming convention enforcement | P1 | Check names against configurable patterns |
+| L8 | Unused node detection | P1 | Flag nodes declared but not in any launch config |
+| L9 | Parameter validation | P1 | Check values against declared types, ranges, validators |
+| L10 | Circular dependency detection | P1 | Flag service/action chains that could deadlock |
+| L11 | Inline suppression | P1 | `# rosgraph: noqa: TOP001` in launch/YAML files |
+| L12 | Per-package configuration | P1 | Override rules per package via `rosgraph.toml` |
+| L13 | `--add-noqa` for adoption | P1 | Generate suppression comments for all existing issues |
+| L14 | Semantic validation | P1 | Full type resolution, QoS compatibility checks |
+| L15 | Interface coverage reporting | P2 | Which declared topics/services are exercised in tests |
+
+### Runtime Monitoring (`rosgraph monitor`)
+
+| # | Feature | Priority | Description |
+|---|---------|----------|-------------|
+| M1 | Declared-vs-observed graph diff | P0 | Compare declared interfaces against live DDS discovery |
+| M2 | Missing node alerting | P0 | Alert when a declared node is not present |
+| M3 | QoS drift detection | P0 | Alert when observed QoS differs from declared |
+| M4 | Type mismatch detection (runtime) | P0 | Alert when observed types differ from declaration |
+| M5 | Graph snapshot publishing | P0 | Periodic `rosgraph_monitor_msgs/Graph` snapshots |
+| M6 | Topic statistics | P1 | Message rate, latency, queue depth per topic |
+| M7 | Prometheus /metrics endpoint | P1 | Export graph metrics for Grafana dashboards |
+| M8 | Unexpected node detection | P1 | Alert on nodes present but not declared |
+| M9 | Health diagnostics integration | P1 | Publish to `/diagnostics` for standard ROS tooling |
+| M10 | Adaptive scrape interval | P2 | Faster scraping when drift detected, slower when stable |
+| M11 | Lifecycle state monitoring | P2 | Track lifecycle transitions against expectations |
+| M12 | Interface coverage tracking | P2 | Which declared interfaces are exercised at runtime |
+
+### Other Subcommands
+
+| # | Feature | Subcommand | Priority | Description |
+|---|---------|------------|----------|-------------|
+| O1 | Documentation generation | `rosgraph docs` | P1 | Auto-generated API reference from schema |
+| O2 | Runtime-to-spec discovery | `rosgraph discover` | P1 | Introspect running nodes → `interface.yaml` |
+| O3 | Breaking change detection | `rosgraph breaking` | P2 | Detect breaking interface changes across releases |
+| O4 | Contract testing | `rosgraph test` | P2 | Schema-driven verification of running nodes |
+| O5 | Security policy generation | `rosgraph policy` | P3 | Auto-generate SROS 2 policies from schema |
+
+---
+
+## 7. Lint Rule Codes
+
+Rule codes use a hierarchical prefix system (modelled on Ruff). Rules
+can be selected at any granularity: `TOP` (all topic rules),
+`TOP001` (specific rule).
+
+| Prefix | Category | Example rules |
+|--------|----------|---------------|
+| `TOP` | Topic rules | `TOP001` type mismatch, `TOP002` no subscribers, `TOP003` naming convention |
+| `SRV` | Service rules | `SRV001` unmatched client, `SRV002` type mismatch |
+| `ACT` | Action rules | `ACT001` unmatched client, `ACT002` type mismatch |
+| `PRM` | Parameter rules | `PRM001` missing default, `PRM002` type violation, `PRM003` undeclared |
+| `QOS` | QoS rules | `QOS001` reliability mismatch, `QOS002` durability incompatible, `QOS003` deadline violation |
+| `LCH` | Launch rules | `LCH001` undefined node ref, `LCH002` invalid remap, `LCH003` unresolved substitution |
+| `GRF` | Graph-level rules | `GRF001` disconnected subgraph, `GRF002` unused node, `GRF003` circular dependency |
+| `NME` | Naming rules | `NME001` topic naming convention, `NME002` node naming convention |
+| `SAF` | Safety rules | `SAF001` insufficient redundancy, `SAF002` single point of failure, `SAF003` unmanaged safety node |
+| `TF` | TF frame rules | `TF001` undeclared frame_id, `TF002` broken frame chain |
+
+**Rule lifecycle:** preview → stable → deprecated → removed. New rules
+always enter as preview.
+
+**Fix applicability:** Safe (preserves semantics), unsafe (may alter
+behaviour), display-only (suggestion). Per-rule override via config.
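+
+Prefix-based selection can be sketched in a few lines. This assumes
+`select` and `ignore` are lists of code prefixes and that `ignore`
+always wins over `select`; the precedence rule and the name
+`enabled_rules` are assumptions, and the `ALL` selector is not handled.
+
+```python
+def enabled_rules(all_rules: list[str], select: list[str], ignore: list[str]) -> list[str]:
+    """Keep rules matching any `select` prefix unless an `ignore` prefix also matches."""
+    def matches(code: str, prefixes: list[str]) -> bool:
+        return any(code.startswith(prefix) for prefix in prefixes)
+
+    return [r for r in all_rules if matches(r, select) and not matches(r, ignore)]
+```
+
+With this shape, `TOP` selects every topic rule and `TOP001` selects
+exactly one, which is the granularity described above.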
+
+---
+
+## 8. Monitor Alert Rules
+
+| Alert | Condition | Grace period | Severity |
+|---|---|---|---|
+| `NodeMissing` | Declared node not observed | 10s | critical |
+| `UnexpectedNode` | Observed node not declared | 30s | warning |
+| `TopicMissing` | Declared topic not present | 5s | critical |
+| `QoSMismatch` | Declared QoS ≠ observed QoS | 0s | error |
+| `TypeMismatch` | Declared msg type ≠ observed | 0s | critical |
+| `ThroughputDrop` | Rate < expected minimum | 30s | warning |
+
+Grace periods prevent flapping during startup and transient states.
+All thresholds (grace period, severity) are configurable via
+`rosgraph.toml` — see §11.3 for safety-critical overrides.
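+
+Grace-period handling can be sketched as a small tracker that fires an
+alert only once its condition has held continuously for the configured
+time. `AlertRule` and `GraceTracker` are hypothetical names, and the
+caller is assumed to supply a monotonic timestamp in seconds.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class AlertRule:
+    name: str
+    grace_s: float
+    severity: str
+
+
+class GraceTracker:
+    """Fires an alert only after its condition has held for the grace period."""
+
+    def __init__(self) -> None:
+        self._since: dict[str, float] = {}   # alert key -> time first violated
+
+    def update(self, rule: AlertRule, key: str, violated: bool, now: float) -> bool:
+        if not violated:
+            self._since.pop(key, None)       # condition cleared: reset the timer
+            return False
+        first_seen = self._since.setdefault(key, now)
+        return now - first_seen >= rule.grace_s
+```
+
+A 0 s grace period, as for `TypeMismatch`, fires on the first
+violating observation.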
+
+---
+
+## 9. Existing ROS 2 Ecosystem
+
+### 9.1 Maturity Matrix
+
+| Tool | Stars | Contributors | Last active | Maturity | Bus factor |
+|---|---|---|---|---|---|
+| **generate_parameter_library** | 353 | 41 | 2026-02 | Production | Healthy |
+| **ros2_tracing** | 237 | 30 | 2026-02 | Production (QL1) | Healthy |
+| **topic_tools** | 126 | 25 | 2025-08 | Mature | Healthy |
+| **launch_ros** | 78 | 71 | 2026-02 | Core infrastructure | Healthy |
+| **cake** | 36 | 1 | 2026-02 | Early-stage | 1 (risk) |
+| **graph-monitor** | 31 | 3 | 2025-11 | Mid-stage | Low |
+| **nodl** | 10 | 7 | 2022-11 | Dormant | N/A |
+| **clingwrap** | 9 | 1 | 2026-02 | Early-stage | 1 (risk) |
+| **breadcrumb** | 6 | 1 | 2026-02 | Early-stage | 1 (risk) |
+| **HAROS** (ROS 1) | 197 | — | 2021-09 | Abandoned | N/A |
+| **CARET** | 97 | 18 | active | Mature (Tier IV) | Healthy |
+
+### 9.2 Tool Assessments
+
+**cake** — Declarative code generation. `interface.yaml` → C++ and
+Python node scaffolding. Functional pattern (has-a Node, not is-a
+Node). The fundamental bet is correct: making the interface declaration
+the source of truth for code generation is the only way to prevent
+schema-code drift. Core design decisions (YAML-driven,
+composition-based, schema-validated, codegen-first) are sound.
+cake's author is a WG member; rosgraph's Layer 1 schema builds
+directly on cake's format, and G1–G10 represent stabilizing cake's
+capabilities under the rosgraph umbrella — addressing the bus-factor
+risk while preserving the design. Gaps: no lifecycle support, no
+timers, no nested parameters, no formal IR, no plugin architecture,
+no runtime-to-spec generation.
+
+**generate_parameter_library** — The most mature tool in the space.
+Production-proven in MoveIt2 and ros2_control. Rich validation. The
+unification path: the `parameters:` section of `interface.yaml` IS the
+`generate_parameter_library` format (already demonstrated in cake).
+rosgraph delegates to `generate_parameter_library` at build time rather
+than reimplementing parameter generation. The key invariant: a
+standalone gen_param_lib YAML file works as-is when placed in the
+`parameters:` block of `interface.yaml`. Ownership transfer to
+`ros-tooling` would be ideal but is not required — schema compatibility
+is sufficient.
+
+**graph-monitor** — Official ROSGraph WG backing. Publishes structured
+graph messages. The `rmw_stats_shim` approach is architecturally sound.
+Gap: can report *what exists* but not *what's wrong* — no comparison
+against a declared spec.
+
+**breadcrumb + clingwrap** — Proves the concept of static graph
+extraction from launch files. The tight coupling to clingwrap's
+non-standard launch API is the primary concern. Static analysis should
+work with standard `launch_ros` patterns.
+
+**nodl** — Dormant since 2022. Correct problem identification but
+fatal flaw: no code generation. Superseded by cake's YAML approach.
+Key lesson: **a description format without code generation is a
+non-starter.**
+
+**ros2_tracing + CARET** — The most mature dynamic analysis tools.
+QL1 certification, production-proven at Tier IV. Complementary to
+rosgraph: tracing provides instrumentation, CARET provides latency
+analysis, rosgraph provides graph structure analysis.
+
+### 9.3 Gap Analysis
+
+| Category | Capability | Current tool | Status |
+|---|---|---|---|
+| **Schema** | Node interface declaration | cake / nodl / gen_param_lib | cake early; nodl dead; gen_param_lib params-only |
+| **Static** | Static graph from launch files | breadcrumb + clingwrap | Early-stage, solo dev |
+| **Runtime** | Runtime graph monitoring | graph-monitor | Mid-stage, institutional |
+| **Runtime** | Runtime tracing | ros2_tracing | Mature, production |
+| **Runtime** | Latency analysis | CARET | Mature, Tier IV |
+| **Runtime** | Graph visualisation | Foxglove, Dear RosNodeViewer | Mature but live-only |
+| **Runtime** | **Graph diff (expected vs. actual)** | **Nothing** | **Major gap** |
+| **Static** | **Graph linting (pre-launch)** | **Nothing** | **Major gap** |
+| **Static** | **QoS static analysis** | breadcrumb (partial) | Early-stage |
+| **Static** | **CI graph validation** | **Nothing** | **Major gap** |
+| **Docs** | **Node API documentation** | **Nothing** (hand-written only) | **Major gap** |
+| — | **Behavioural properties** | **Nothing** (HPL was ROS 1) | **Major gap** |
+
+---
+
+## 10. Prior Art
+
+Organized by what we borrow, not by framework. Each framework appears
+once at its primary contribution.
+
+### 10.1 Schema Design
+
+#### AsyncAPI
+
+The closest structural match to ROS topics. Version 3 cleanly separates
+channels, messages, operations, and components at the top level.
+
+**What to borrow:**
+- **Structural separation.** `publishers`, `subscribers`, `services`,
+  `actions`, `parameters` as peer top-level sections.
+- **`components` + `$ref` pattern.** Define QoS profiles or common
+  parameter sets once, reference everywhere.
+- **Trait system.** Define a `reliable_sensor` trait with QoS settings,
+  apply to multiple publishers. Traits merge via JSON Merge Patch
+  (RFC 7386).
+- **Protocol bindings.** Core schema stays middleware-agnostic;
+  DDS-specific QoS, Zenoh settings, or shared-memory config in a
+  `bindings:` block.
+- **Parameterized addresses.** Topic name templates
+  (`sensors/{robot_name}/lidar`) map to ROS 2 namespace/remapping and
+  `${param:name}` syntax.
+
+**Gaps:** No services (as typed req/res pair), no actions, no
+parameters, no lifecycle, no timers, no TF frames. Single-application
+scope (which is actually the right scope for a node interface).
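+
+The trait merge can be sketched with the JSON Merge Patch algorithm:
+objects merge recursively, `null` deletes a key, and any other value
+replaces the target. The trait and publisher contents below are
+illustrative, and applying the publisher's own fields last (so local
+settings override the trait) is an ordering assumption.
+
+```python
+def merge_patch(target, patch):
+    """JSON Merge Patch: merge objects recursively, None deletes, others replace."""
+    if not isinstance(patch, dict):
+        return patch
+    result = dict(target) if isinstance(target, dict) else {}
+    for key, value in patch.items():
+        if value is None:
+            result.pop(key, None)
+        else:
+            result[key] = merge_patch(result.get(key), value)
+    return result
+
+
+# Hypothetical `reliable_sensor` trait applied to one publisher entry.
+reliable_sensor = {"qos": {"reliability": "reliable", "depth": 10}}
+publisher = {"topic": "/scan", "qos": {"depth": 5}}
+resolved = merge_patch(reliable_sensor, publisher)
+```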
+
+#### Smithy (AWS)
+
+Protocol-agnostic interface definition language. Shapes decorated with
+traits.
+
+**What to borrow:**
+- **Typed, composable traits** for extensible metadata — the most
+  powerful metadata mechanism surveyed:
+  ```
+  @qos(reliability: "reliable", depth: 10)
+  @lifecycle(managed: true)
+  @parameter_range(min: 0.0, max: 10.0)
+  @frame_id("base_link")
+  ```
+- **Mixins** for shared structure. A `lifecycle_diagnostics` mixin adds
+  a diagnostics publisher and period parameter to any node that
+  includes it.
+- **Resource lifecycle operations** — maps to ROS 2 lifecycle node
+  transitions.
+
+#### CUE
+
+Constraint-based configuration language where types and values are the
+same thing. Not a codegen tool — a validation tool.
+
+**What to borrow:**
+- **Constraints as types.** `voxel_size: float & >=0.01 & <=1.0`. The
+  JSON Schema equivalent (`minimum`, `maximum`, `enum`) is already
+  used by the existing `interface.schema.yaml`.
+- **Incremental constraints.** Base schema + deployment-specific
+  overlays (e.g., production QoS profiles layered onto a base
+  `interface.yaml`).
+- **Configuration validation.** Validate that launch parameter
+  overrides are compatible with a node's declared interface.
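+
+Configuration validation can be sketched as a check of launch-time
+overrides against declared constraints. This assumes declarations have
+been reduced to a Python type plus optional `minimum` and `maximum`
+bounds; `validate_overrides` and the dictionary layout are
+hypothetical.
+
+```python
+def validate_overrides(declared: dict, overrides: dict) -> list[str]:
+    """Check launch-time overrides against declared parameter constraints."""
+    errors = []
+    for name, value in overrides.items():
+        spec = declared.get(name)
+        if spec is None:
+            errors.append(f"{name}: not declared")
+        elif not isinstance(value, spec["type"]):
+            errors.append(f"{name}: expected {spec['type'].__name__}")
+        elif "minimum" in spec and value < spec["minimum"]:
+            errors.append(f"{name}: {value} < minimum {spec['minimum']}")
+        elif "maximum" in spec and value > spec["maximum"]:
+            errors.append(f"{name}: {value} > maximum {spec['maximum']}")
+    return errors
+```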
+
+### 10.2 Pipeline & Code Generation
+
+#### Protocol Buffers / Buf CLI
+
+The single most important architectural lesson: **an intermediate
+representation (IR) between parsing and generation**.
+
+```
+interface.yaml ──> [Parser/Validator] ──> InterfaceDescriptor (IR)
+                                            ├──> [Plugin: C++]    ──> scaffolding
+                                            ├──> [Plugin: Python] ──> scaffolding
+                                            ├──> [Plugin: Docs]   ──> API reference
+                                            └──> [Plugin: Launch] ──> templates
+```
+
+**What to borrow:**
+- **IR-based plugin protocol.** Standalone executables consuming a
+  serialized `InterfaceDescriptor` via stdin/file. Community members
+  write `rosgraph-gen-rust` without touching the core codebase.
+- **Config-driven generation** (`buf.gen.yaml` pattern):
+  ```yaml
+  version: 1
+  plugins:
+    - name: cpp
+      out: generated/cpp
+      options: { lifecycle: managed }
+    - name: python
+      out: generated/python
+  ```
+- **Validation as separate layers.** Structural (does the YAML parse?)
+  → semantic (do referenced types exist?) → breaking (did the interface
+  change incompatibly?). Maps to `rosgraph lint`, `rosgraph validate`,
+  `rosgraph breaking`.
+- **Deterministic, reproducible output.** Same inputs → byte-identical
+  output. CI can verify generated code is up to date.
+
+**What to borrow from Buf CLI specifically:**
+- `buf lint` — configurable schema linting with ~50 rules by category.
+  Config-driven rule selection.
+- `buf breaking` — breaking change detection between schema versions.
+- Integrated toolchain: `buf generate`, `buf lint`, `buf breaking`,
+  `buf format` as subcommands of one tool.
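+
+A plugin under this protocol could be as small as the sketch below,
+which reads a JSON-serialized descriptor from stdin and writes Markdown
+to stdout. The descriptor fields (`name`, `publishers`) and the
+function names are assumptions for illustration; the actual
+`InterfaceDescriptor` layout is defined in §3.3.
+
+```python
+import json
+import sys
+
+
+def render_markdown(descriptor: dict) -> str:
+    """Render a tiny API reference; sorted keys keep the output deterministic."""
+    lines = [f"# {descriptor['name']}", ""]
+    publishers = descriptor.get("publishers", {})
+    for topic in sorted(publishers):
+        lines.append(f"- publishes `{topic}` ({publishers[topic]})")
+    return "\n".join(lines) + "\n"
+
+
+def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
+    # The core tool would pipe the serialized descriptor to the plugin's stdin.
+    stdout.write(render_markdown(json.load(stdin)))
+```
+
+Because the plugin only sees the serialized descriptor, it can live in
+its own repository and be written in any language.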
+
+#### TypeSpec (Microsoft)
+
+**What to borrow:**
+- **Multi-emitter architecture.** One spec, many outputs:
+  ```
+  interface.yaml ──> C++ emitter       ──> node_interface.hpp
+                 ──> Python emitter    ──> interface.py
+                 ──> Docs emitter      ──> node_api_reference.md
+                 ──> Launch emitter    ──> default_launch.py
+                 ──> Graph emitter     ──> rosgraph_monitor_msgs/NodeInterface
+  ```
+- **Emitter-specific validation.** Each emitter adds its own checks
+  (e.g., C++ emitter warns about names that produce invalid C++
+  identifiers).
+
+#### OpenAPI
+
+**What to borrow:**
+- **The "Swagger UI" experience.** Auto-generated interactive
+  documentation from a schema. A "Swagger UI for ROS nodes" where every
+  node has browsable API docs showing topics, services, actions,
+  parameters, QoS, and message type definitions — generated from
+  `interface.yaml`.
+- **JSON Schema integration.** OpenAPI 3.1 aligned fully with JSON
+  Schema. The existing `interface.schema.yaml` (JSON Schema Draft
+  2020-12) is the right foundation.
+
+### 10.3 Static Analysis Architecture
+
+#### Ruff
+
+A Python linter written in Rust. Relevant not for Python linting but as
+the **best-in-class architecture for building a rule-based analysis
+tool**.
+
+**What to borrow:**
+
+| Ruff pattern | rosgraph equivalent |
+|---|---|
+| Rule enum + compile-time registry | `Rule` enum: `TOP001`, `SRV001`, `QOS001`, `GRF001` |
+| Hierarchical prefix codes | `TOP` (topic), `SRV` (service), `ACT` (action), `QOS`, `GRF` (graph) |
+| Single-pass traversal | Build graph model once, run all rules in one walk |
+| Safe/unsafe fix classification | Safe: add missing QoS. Unsafe: rename topic. Display-only: suggest restructure |
+| Preview → stable lifecycle | Same graduation for new rules |
+| Per-file-ignores | Per-package-ignores, per-launch-file-ignores |
+| Inline suppression | `# rosgraph: noqa: TOP001` |
+| SARIF output | GitHub Security tab integration |
+| Monolithic, no plugins initially | All rules built-in. WASM plugins later |
+| Zero-config defaults | Small, high-confidence default rule set |
+| `--add-noqa` for gradual adoption | Essential for existing ROS workspaces |
+
+**Key architectural lesson:** Speed is an architectural property, not an
+optimization: Rust, a hand-written parser, a single pass, parallel
+package processing, content caching, and compile-time codegen.
+
+#### Go Analysis Framework
+
+The gold standard for pluggable static analysis architecture. Used by
+`go vet`, gopls, and golangci-lint.
+
+**What to borrow:**
+
+```
+GraphAnalyzer {
+    name:        str
+    doc:         str
+    requires:    [GraphAnalyzer]      # horizontal deps
+    result_type: Type | None          # typed output for dependent analyzers
+    fact_types:  [Fact]               # cross-package facts
+    run:         (GraphPass) → (result, [Diagnostic])
+}
+
+GraphPass {
+    graph:       ComputationGraph     # the full graph model
+    node:        NodeInterface        # current node under analysis
+    types:       MessageTypeDB        # all known msg/srv/action types
+    qos:         QoSProfileDB         # QoS profiles in the graph
+    result_of:   {GraphAnalyzer: Any} # results from required analyzers
+    report:      (Diagnostic) → void
+    import_fact: (scope, Fact) → bool
+    export_fact: (scope, Fact) → void
+}
+```
+
+Key patterns:
+1. **Analyzers as values, not subclasses** — trivially composable
+2. **Pass as abstraction barrier** — same analyzer in CLI, IDE, CI
+3. **Horizontal dependencies** via `requires`/`result_of` (Go's
+   `Requires`/`ResultOf`) — typed data flow between analyzers
+4. **Vertical facts** for cross-package analysis — cached per-package
+   results enabling separate modular analysis
+5. **Action graph** — 2D grid (analyzer × package), independent actions
+   execute in parallel
+
+#### golangci-lint
+
+**What to borrow:**
+- **Meta-linter pattern.** One CLI, one config, one output format
+  wrapping many analyzers.
+- **Shared parse.** All analyzers share one AST/model parse.
+- **Post-processing pipeline.** `noqa` filter → exclusion rules →
+  severity assignment → deduplication → output formatting.
+- **Differential analysis.** `new-from-merge-base: main` reports only
+  issues in code changed since the base branch. Critical for CI
+  adoption in large codebases.
+
+#### Spectral
+
+**What to borrow:**
+- **YAML-native lint rules** that work directly on `interface.yaml`
+  without language-specific parsing. Custom rulesets in YAML — a
+  robotics engineer can author a rule without knowing Rust or C++.
+  Low barrier to writing new rules.
+
+### 10.4 Runtime Monitoring Architecture
+
+#### OpenTelemetry
+
+Collector pipeline: Receiver → Processor → Exporter. Connectors join
+pipelines and enable signal type conversion.
+
+**What to borrow:**
+- **Pipeline architecture** for `rosgraph monitor`.
+- **Auto-instrumentation.** Two complementary paths:
+  - *Runtime observation* (zero-code): DDS discovery provides the graph
+    without modifying any node.
+  - *Code-generated instrumentation*: rosgraph-generated code embeds
+    topic stats, heartbeats, structured logging.
+- **Three-way comparison.** Declared vs. runtime-observed vs.
+  self-reported state catches issues any two-way comparison misses.
+
+#### Prometheus
+
+**What to borrow:**
+- **Pull model.** Periodic scraping produces consistent point-in-time
+  snapshots. Absence of data is itself a signal (node is down).
+- **Alerting rules** with `for` durations to prevent flapping.
+- **Metric types mapping:**
+
+  | Prometheus type | ROS topic statistics equivalent |
+  |---|---|
+  | Counter | Messages published (total), dropped messages |
+  | Gauge | Active subscribers, queue depth, alive nodes |
+  | Histogram | Inter-arrival times, message sizes, latency distribution |
+
+#### Kubernetes Controllers
+
+**What to borrow:**
+- **Level-triggered reconciliation** (not edge-triggered). React to the
+  *current difference* between desired and actual state, not to
+  individual change events. If an event is missed, the next
+  reconciliation still catches the drift.
+- **Idempotent.** Running reconciliation twice with the same state
+  produces the same diff and alerts.
+- **Requeue with backoff.** After detecting drift, recheck sooner (1s).
+  If drift persists, escalate.
+- **Status reporting.** Maintained separately from the declared spec,
+  enabling external tools to query current state independently.
+
+### 10.5 Contract Testing & Verification
+
+| Framework | What it does | What to borrow for `rosgraph test` |
+|---|---|---|
+| **Schemathesis** | Fuzz a live API against its OpenAPI spec. Auto-generates test cases from schema. | Fuzz a running node against `interface.yaml` — auto-generate messages matching declared types, verify outputs. |
+| **Dredd** | Start a live server, send requests matching the spec, validate responses. The spec IS the test plan. | Run a node, systematically verify its interface matches declaration. Call every service, check every publisher. |
+| **Pact** | Consumer-driven contract testing. Consumer declares expectations; provider verifies. | Cross-node contract verification: Node A subscribes to `/cmd_vel` (Twist), Node B publishes it. Verify they agree on type. |
+| **gRPC health + reflection** | Standardized health checking + runtime introspection of services/methods. | Health reporting interface that rosgraph-generated nodes expose automatically. Runtime introspection vs. declared interface. |
+| **graphql-inspector** | Schema diff (breaking/dangerous/safe). Coverage: which fields are actually queried. | Interface coverage: "which declared topics are exercised in tests?" Schema diff between interface versions. |
+
+### 10.6 ROS Domain Prior Art: HAROS
+
+The High-Assurance ROS framework (University of Minho, 2016–2021). The
+only tool that accomplished Goals 3–4 for ROS, but only for ROS 1.
+
+**Pipeline:** Package discovery → CMake parsing → launch file parsing →
+source code parsing (libclang for C++, limited Python AST) →
+computation graph assembly → plugin-based analysis → JSON export.
+
+**The metamodel.** Formal classes for the ROS graph: `Node`,
+`NodeInstance`, `Topic`, `Service`, `Parameter`, plus typed link classes
+(`PublishLink`, `SubscribeLink`, etc.) carrying source conditions and
+dependency sets. This metamodel is HAROS's most transferable
+contribution.
+
+**HPL (HAROS Property Language).** Behavioral properties for
+message-passing systems:
+```
+globally: no /cmd_vel {linear.x > 1.0}           # speed limit
+globally: /bumper causes /stop_cmd                 # response
+globally: /cmd_vel requires /trajectory within 5s  # precedence
+```
+
+HPL drove three verification paths from a single spec: model checking
+(Electrum/Alloy), runtime monitors (generated), and property-based
+testing (Hypothesis strategies).
+
+**Why it died for ROS 2.** The extraction pipeline assumes catkin,
+`rospack`, XML launch files, and `ros::NodeHandle`; ROS 2 replaced or
+reworked all four. The maintainer closed ROS 2 support as *wontfix*.
+
+**What to borrow:** Metamodel, HPL's scope+pattern+event structure,
+plugin separation (source-level vs. model-level), one spec → multiple
+verification modes.
+
+**What to do differently:** Use declarations (`interface.yaml`) as
+primary source of truth (not source code parsing); support ROS 2
+concepts HAROS never had (QoS, lifecycle, components, actions, DDS
+discovery).
+
+---
+
+## 11. Safety & Certification
+
+rosgraph is not a safety tool — it is a development and verification
+tool that produces artifacts useful in safety cases. This section maps
+rosgraph capabilities to the evidence types required by safety
+standards.
+
+### 11.1 Relevant Standards
+
+| Standard | Domain | How rosgraph helps |
+|---|---|---|
+| **IEC 61508** | General functional safety | Design verification evidence (graph analysis), runtime monitoring |
+| **ISO 26262** | Automotive | Interface specification (`interface.yaml` as design artifact), static verification |
+| **IEC 62304** | Medical device software | Software architecture documentation, traceability |
+| **DO-178C** | Aerospace | Requirements traceability, structural coverage analysis |
+| **ISO 13482** | Service robots | Interface documentation, runtime monitoring |
+| **ISO 21448 (SOTIF)** | Safety of intended functionality | Graph analysis for identifying missing/unexpected interfaces |
+
+### 11.2 Artifact-to-Evidence Mapping
+
+| rosgraph artifact | Evidence type | Useful for |
+|---|---|---|
+| `interface.yaml` | Software architecture description | Design phase documentation |
+| `rosgraph lint` SARIF output | Static analysis results | Verification evidence |
+| `rosgraph monitor` logs | Runtime verification evidence | Validation phase |
+| `rosgraph test` results | Interface conformance evidence | Integration testing |
+| `rosgraph breaking` output | Change impact analysis | Change management |
+| `rosgraph docs` output | API documentation | Design review |
+
+### 11.3 Configurable Safety Levels
+
+Monitor alert grace periods (§8) and severity levels must be
+configurable for safety-critical deployments:
+
+```toml
+[monitor.alerts]
+NodeMissing = { grace_period_ms = 1000, severity = "critical" }   # 1s for surgical robot
+UnexpectedNode = { grace_period_ms = 5000, severity = "error" }
+TopicMissing = { grace_period_ms = 500, severity = "critical" }
+```
+
+The defaults in §8 are tuned for general robotics. Safety-critical
+deployments override them via `rosgraph.toml`.
+
+### 11.4 Behavioral Properties (Future)
+
+Structural analysis (Phase 1–2) proves the graph is correctly wired —
+a necessary precondition for behavioral safety. Behavioral analysis
+(Phase 3+) proves temporal and causal properties:
+
+```
+globally: /emergency_stop causes /motor_disable within 100ms
+globally: no /cmd_vel {linear.x > max_speed}
+globally: /heartbeat absent_for 500ms causes /safe_stop
+```
+
+This capability, inspired by HAROS HPL (§10.6), is where the deeper
+safety value lies. The structural graph model (§3.1) is designed to
+be extensible to behavioral annotations without schema redesign.
+
+### 11.5 Safety-Relevant Lint Rules (Future)
+
+| Rule | Description | Phase |
+|---|---|---|
+| `SAF001` | Critical subscriber has < N publishers (no redundancy) | 2 |
+| `SAF002` | Single point of failure in graph topology | 2 |
+| `SAF003` | Safety-critical node is not lifecycle-managed | 2 |
+| `TF001` | Declared `frame_id` not published by any node in graph | 2 |
+| `TF002` | Frame chain broken (no transform path between declared frames) | 3 |
+
+These rules are not in Phase 1 but the analyzer architecture (§3.5)
+supports adding them without architectural changes.
+
+---
+
+## 12. Scope & Limitations
+
+### When Not to Use rosgraph
+
+rosgraph adds value when the cost of interface bugs exceeds the cost
+of maintaining declarations. This trade-off favors rosgraph in
+multi-node systems, team environments, and production deployments.
+It does not favor rosgraph in every context:
+
+- **Quick prototyping.** If you're experimenting with a single node
+  and will throw it away next week, `interface.yaml` is overhead.
+  Use standard `rclcpp` / `rclpy` directly.
+- **Single-node packages.** A package with one node and no
+  cross-package interfaces gets minimal lint value. The code
+  generation may still be worthwhile for parameter validation.
+- **Highly dynamic interfaces.** Nodes that create publishers and
+  subscribers at runtime based on dynamic conditions (e.g., a
+  plugin host that discovers its interface at startup) are outside
+  scope (DP12). rosgraph can declare the static portion and flag
+  the dynamic portion as unexpected, but it cannot generate code
+  for interfaces it doesn't know about at build time.
+
+### Known Limitations
+
+**Spec-code drift for business logic.** Code generation covers the
+structural skeleton (pub/sub creation, parameter declaration, lifecycle
+transitions). Business logic is hand-written. If a developer adds an
+undeclared publisher inside a callback, `rosgraph lint` won't catch it
+at build time — only `rosgraph monitor` flags it at runtime as
+`UnexpectedTopic`. This is a fundamental limitation of any
+declaration-based approach: the declaration describes the intended
+interface, not the implementation.
+
+**Launch file coverage.** Python launch files are Turing-complete.
+AST pattern matching (§3.5) handles common declarative patterns but
+cannot resolve dynamic logic (conditionals based on environment
+variables, loops generating node sets). `system.yaml` (Layer 2) is
+the escape hatch for systems that need full static analyzability.
+
+**Ecosystem bootstrapping.** rosgraph's cross-package analysis (type
+mismatch detection, contract testing) requires multiple packages to
+have `interface.yaml`. The single-package value proposition is code
+generation and parameter validation. Cross-package value grows with
+adoption. `rosgraph discover` (§3.10) lowers the barrier by generating
+specs from running systems, but the generated specs require human
+review and refinement.
+
+**Scope of this proposal.** This document covers 51 features across
+7 subcommands. Not all will be built. Phase 1 (§4) is the commitment
+— the minimum viable tool that delivers value. Later phases are
+contingent on adoption and contributor capacity.
+
+---
+
+## 13. Resolved Questions
+
+The following questions were raised during the proposal drafting process
+and have been resolved. Answers are integrated into the relevant
+sections of this document.
+
+| # | Question | Resolution | Section |
+|---|----------|------------|---------|
+| 1 | Dynamic interfaces | Out of scope — rosgraph covers declared interfaces only (Design Principle 12). Undeclared runtime interfaces are flagged as `UnexpectedTopic` by monitor. | §2 |
+| 2 | Launch substitution evaluation | Three-path loader strategy: YAML launch (direct parse), `system.yaml` (static), Python launch AST (pattern matching). | §3.5 |
+| 3 | Behavioral properties | Structural first (Phase 1–2), behavioral later (Phase 3+) if adoption warrants it (Design Principle 13). | §2 |
+| 4 | `generate_parameter_library` unification | Keep as standalone, maintain schema compatibility. rosgraph delegates to `generate_parameter_library` at build time. | §9.2 |
+| 5 | Multi-workspace analysis | Per-package fact caching via installed `interface.yaml` files. Phase 2 concern. | §3.12 |
+| 6 | Launch file extraction without clingwrap | Partial AST extraction for standard `launch_ros` patterns, with `system.yaml` as fully-static alternative. | §3.5 |
+| 7 | Relationship to graph-monitor | New implementation. Adopt graph-monitor's message definitions, reimplement scraping + reconciliation. | §3.6 |
+| 8 | Mixin pattern | `mixins:` section referencing interface fragments. Host's effective interface = own declaration + all mixins merged. | §3.2 |
+| 9 | Adoption path | `ros-tooling` org → REP for schema → docs.ros.org tutorials → `ros_core` (long-term). | §4 |
+| 10 | Declaration scope | Structural (node interfaces) only for Phase 1–2. Behavioral scope deferred to Phase 3+. | §2 |

From fa0478043d63101a44f929905e498cc46cf47e6f Mon Sep 17 00:00:00 2001
From: Luke Sy <sylukewicent@gmail.com>
Date: Mon, 23 Feb 2026 22:23:44 +1100
Subject: [PATCH 2/5] Trim FAQ, add system.yaml convergence note
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- FAQ: reduce from 880 to ~360 lines, add General section with
  cross-cutting questions, keep 2-3 essential per audience
- FAQ: fix all PROPOSAL.md cross-references to ROSGRAPH.md
- FAQ: add launch file / param config convergence question
- ROSGRAPH §3.2: add "Toward a single source of truth" note on
  system.yaml replacing launch files and parameter configs
- ROSGRAPH §13: remove resolved questions section (internal notes)

Signed-off-by: Luke Sy <sylukewicent@gmail.com>
---
 docs/FAQ.md      | 877 ++++++++++++-----------------------------------
 docs/ROSGRAPH.md |  33 +-
 2 files changed, 222 insertions(+), 688 deletions(-)

diff --git a/docs/FAQ.md b/docs/FAQ.md
index b01d3c6..583cd5c 100644
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -9,171 +9,139 @@ questions that matter to you.
 
 ## Table of Contents
 
+0. [General](#0-general)
 1. [New ROS Developer](#1-new-ros-developer)
-2. [AI-Assisted Developer](#2-ai-assisted-developer)
-3. [Engineering Lead / System Integrator / DevOps](#3-engineering-lead--system-integrator--devops)
-4. [Safety-Critical Engineer](#4-safety-critical-engineer)
-5. [MoveIt / nav2 / Popular Module User](#5-moveit--nav2--popular-module-user)
-6. [The Skeptic](#6-the-skeptic)
-7. [Package Maintainer / ROS Governance](#7-package-maintainer--ros-governance)
-8. [Educator / University Researcher](#8-educator--university-researcher)
-9. [Embedded / Resource-Constrained Developer](#9-embedded--resource-constrained-developer)
+2. [Engineering Lead / System Integrator / DevOps](#2-engineering-lead--system-integrator--devops)
+3. [MoveIt / nav2 / Popular Module User](#3-moveit--nav2--popular-module-user)
+4. [AI-Assisted Developer](#4-ai-assisted-developer)
+5. [Package Maintainer / ROS Governance](#5-package-maintainer--ros-governance)
+6. [Educator / University Researcher](#6-educator--university-researcher)
+7. [Embedded / Resource-Constrained Developer](#7-embedded--resource-constrained-developer)
+8. [The Skeptic](#8-the-skeptic)
+9. [Safety-Critical Engineer](#9-safety-critical-engineer)
 
 ---
 
-## 1. New ROS Developer
+## 0. General
 
 ### What problem does rosgraph solve?
 
-ROS 2 doesn't verify that your nodes are wired correctly until
-runtime — and often not even then. Type mismatches between publishers
-and subscribers fail silently. QoS incompatibilities drop connections
-with no error. Parameter renames break launch files with no build
-error.
+When you connect ROS 2 nodes together, mistakes are invisible. If one
+node sends a `Twist` message but another node expects a
+`TwistStamped`, nothing warns you — the subscriber just never receives
+data. If you misspell a topic name in a launch file, the node launches
+fine but sits there doing nothing. You end up staring at
+`ros2 topic list` wondering why nothing is connected.
+
+rosgraph catches these wiring mistakes before you even launch your
+system. You describe what each node publishes, subscribes to, and what
+settings it needs in a short YAML file. Then `rosgraph lint` checks
+that everything fits together — like a spell checker, but for your
+ROS graph.
 
-rosgraph catches these at build time. See [PROPOSAL.md §1, "The
-Problem, Concretely"](PROPOSAL.md#the-problem-concretely) for four
-real-world examples.
+See [ROSGRAPH.md §1, "The Problem,
+Concretely"](ROSGRAPH.md#the-problem-concretely) for four real-world
+examples.
 
 ### How much do I need to learn?
 
-Write one `interface.yaml` per node (~15 lines for a basic pub/sub
-node). Run three commands:
+One file per node (`interface.yaml`, about 15 lines) and three
+commands:
 
 ```bash
-rosgraph generate .   # generates code
-rosgraph lint .       # checks for issues
-rosgraph monitor      # watches the running system
+rosgraph generate .   # creates starter code from your YAML
+rosgraph lint .       # checks for wiring mistakes
+rosgraph monitor      # watches the running system for problems
 ```
 
-The YAML schema has IDE autocompletion via JSON Schema. See the [Quick
-Start](PROPOSAL.md#quick-start-what-it-looks-like) for a complete
-minimal example.
-
-### What do I stop doing when I adopt rosgraph?
-
-- **Stop writing pub/sub boilerplate.** Publisher creation, subscriber
-  setup, parameter declaration — all generated from `interface.yaml`.
-- **Stop manually syncing parameters between code and launch files.**
-  `interface.yaml` is the single source of truth for parameter names,
-  types, defaults, and validation ranges.
-- **Stop debugging silent QoS mismatches.** `rosgraph lint` catches
-  incompatible QoS profiles before you launch.
-- **Stop wondering if your launch files reference the right nodes.**
-  `rosgraph lint` validates node refs, remappings, and parameter
-  overrides.
-
-### Will error messages actually be helpful?
+Your editor will autocomplete the YAML fields for you — no need to
+memorize the format. See the [Quick
+Start](ROSGRAPH.md#quick-start-what-it-looks-like) for a complete
+example.
 
-Error quality is a design requirement, not an afterthought. The
-architecture follows Ruff's model ([PROPOSAL.md
-§10.3](PROPOSAL.md#103-static-analysis-architecture)):
-
-- Every diagnostic includes a rule code (`TOP001`), the location in
-  `interface.yaml`, and what's wrong.
-- Safe fixes can be auto-applied. Unsafe fixes are flagged but not
-  auto-applied.
-- SARIF output enables inline annotations in GitHub PRs.
-- `--add-noqa` generates suppression comments for existing issues,
-  so you can adopt gradually without noise.
-
-### Do I need to learn YAML schema syntax?
+### What's the overhead?
 
-Not really. If your editor has the YAML Language Server (most do),
-you get autocompletion, inline validation, and hover docs from the
-JSON Schema ([PROPOSAL.md §6, G2](PROPOSAL.md#6-feature-list)). Write
-a few fields, let the editor fill in the structure.
+Per node: one `interface.yaml` file (~15–30 lines). Most of it is
+information you're already specifying in code (topic names, message
+types, QoS settings, parameter names) — `interface.yaml` centralizes
+it.
 
----
+What you get back:
+- No pub/sub boilerplate (generated)
+- No parameter declaration boilerplate (generated via
+  `generate_parameter_library`)
+- Pre-launch graph validation
+- Runtime graph monitoring
+- Auto-generated API documentation
 
-## 2. AI-Assisted Developer
+The net line-count change is typically negative for nodes with
+parameters.
 
-### How does rosgraph work with AI coding tools?
+### What about my launch files and parameter configs?
 
-`interface.yaml` is a machine-readable contract — exactly what LLMs
-are good at consuming and generating. The `InterfaceDescriptor` IR
-([PROPOSAL.md §3.3](PROPOSAL.md#33-the-interfacedescriptor-ir)) is a
-JSON blob containing a node's complete API: topics, types, QoS,
-parameters, lifecycle state. An AI agent reads this to understand what
-a node does, generate implementation code, write tests, or suggest
-fixes — without parsing C++ or Python source.
+`system.yaml` (Layer 2) overlaps heavily with both — all three
+describe which nodes run, with what parameters, and with what
+remappings. The long-term direction is convergence: `system.yaml`
+becomes the graph spec, the parameter config, *and* the launch
+description in one file. `rosgraph generate` emits a runnable launch
+file from the same spec that `rosgraph lint` validates — no drift
+between what you analyze and what you run.
 
-See [PROPOSAL.md §3.13](PROPOSAL.md#313-ai--tooling-integration) for
-the full AI integration design.
+For projects with multiple deployment configurations (sim, real, test),
+each gets its own `system.yaml`, replacing both the per-config launch
+file and the per-config parameter YAML. See [ROSGRAPH.md
+§3.2](ROSGRAPH.md#32-schema-layers).
 
-### Can I use `rosgraph generate` as an agent tool?
+### Won't the spec just drift from reality like NoDL?
 
-Yes. An AI agent writing a ROS node can:
-1. Generate `interface.yaml` from a natural language description
-2. Run `rosgraph generate .` as a tool call to get type-safe
-   scaffolding
-3. Write only the business logic into the generated skeleton
-4. Run `rosgraph lint .` to verify the graph is correct
+NoDL died because it was a pure description format — no code
+generation. Maintaining a spec that doesn't produce anything is
+thankless work.
 
-This avoids the common failure mode of LLMs hallucinating ROS
-boilerplate (wrong QoS defaults, missing component registration,
-incorrect parameter declaration).
+`interface.yaml` generates code. If you change the spec, the generated
+code changes. If you change the code without changing the spec,
+`rosgraph monitor` flags the discrepancy at runtime. The two-way
+binding (codegen + runtime monitoring) is what prevents the drift
+that killed NoDL.
 
-### Will there be an MCP server?
+The honest limitation: business logic is hand-written. If a developer
+adds an undeclared publisher inside a callback, `rosgraph lint` won't
+catch it at build time. `rosgraph monitor` catches it at runtime as
+`UnexpectedTopic`. See [ROSGRAPH.md
+§12](ROSGRAPH.md#12-scope--limitations).
 
-It's architecturally planned ([PROPOSAL.md
-§3.13](PROPOSAL.md#313-ai--tooling-integration)). An MCP server
-would expose:
-- Graph state (which nodes exist, what they publish/subscribe)
-- Lint results (current issues in the workspace)
-- Interface schemas (what a specific node expects)
-- Resolved topic names (after remapping/namespacing)
+---
 
-This lets Claude Code, Cursor, or Copilot answer "what topics does
-the perception pipeline publish?" from structured data, not grep.
+## 1. New ROS Developer
 
-### Can an AI generate `interface.yaml` from a description?
+### What does rosgraph do for me?
 
-Yes — the constrained schema makes this tractable. The schema has
-~10 top-level keys with well-defined types. "I need a node that
-subscribes to a lidar point cloud, filters it, and publishes the
-result" produces a valid `interface.yaml` that `rosgraph generate`
-immediately scaffolds.
+- **Writes the repetitive code.** Creating publishers, subscribers,
+  and declaring parameters — `rosgraph generate` handles this from
+  your YAML file. You write only the interesting part (what your node
+  actually *does*).
+- **Catches mistakes early.** Mismatched message types, misspelled
+  topic names, incompatible connection settings — found in seconds,
+  not after a 30-second launch-debug-relaunch cycle.
+- **Keeps settings in one place.** Parameter names, types, and default
+  values live in `interface.yaml` instead of scattered across your
+  code, launch files, and README.
 
-`rosgraph discover` ([PROPOSAL.md
-§3.10](PROPOSAL.md#310-rosgraph-discover--runtime-to-spec-generation))
-can also generate `interface.yaml` from a running node, which an LLM
-can then refine — adding descriptions, suggesting QoS rationale, and
-grouping related interfaces.
+### Will error messages make sense?
 
-### What about IDE / LSP integration?
+Yes — this is a design priority. Each error tells you:
 
-Phase 1 delivers JSON Schema validation (IDE autocompletion for
-`interface.yaml`). A dedicated LSP server would add:
-- Hover for message type definitions
-- Go-to-definition for `$ref` targets
-- Inline diagnostics from `rosgraph lint`
-- Cross-file rename support
+- **Where:** which file and line has the problem
+- **What:** a plain description of what's wrong
+- **How to fix it:** a suggested correction, auto-applied when safe
 
-This benefits both human developers and AI agents operating within
-IDE contexts. See [PROPOSAL.md
-§3.13](PROPOSAL.md#313-ai--tooling-integration).
+No cryptic stack traces. No silent failures. See [ROSGRAPH.md
+§10.3](ROSGRAPH.md#103-static-analysis-architecture) for the error
+design.
 
 ---
 
-## 3. Engineering Lead / System Integrator / DevOps
-
-### Who owns an `interface.yaml`?
-
-The node author defines it. Downstream consumers depend on the
-installed version in `share/<package>/interfaces/`. Changes are
-coordinated via:
-
-- `rosgraph breaking` ([PROPOSAL.md
-  §3.9](PROPOSAL.md#39-rosgraph-breaking--breaking-change-detection))
-  — automated detection of breaking changes in CI, blocking merges
-  that break downstream consumers.
-- Installed interfaces — downstream teams depend on published
-  interfaces without pulling source code.
-- Semantic versioning alignment — breaking = major, dangerous = minor,
-  safe = patch.
-
-See [PROPOSAL.md §3.14](PROPOSAL.md#314-scale--fleet-considerations).
+## 2. Engineering Lead / System Integrator / DevOps
 
 ### How does this scale to hundreds of packages?
 
@@ -182,33 +150,28 @@ See [PROPOSAL.md §3.14](PROPOSAL.md#314-scale--fleet-considerations).
   with parallel per-package processing and content caching.
 - **Multi-workspace analysis:** Installed `interface.yaml` files in
   underlays serve as cached facts. Only your workspace is analyzed,
-  not the entire underlay. See [PROPOSAL.md
-  §3.12](PROPOSAL.md#312-multi-workspace-analysis).
+  not the entire underlay. See [ROSGRAPH.md
+  §3.12](ROSGRAPH.md#312-multi-workspace-analysis).
 - **Differential analysis:** `--new-only` reports only issues
   introduced since the base branch. No noise from existing code.
-- **Per-package configuration:** Override lint rules per package via
-  `rosgraph.toml`.
 
 ### I compose nodes from multiple vendors. How does rosgraph help?
 
-`system.yaml` (Layer 2 schema, [PROPOSAL.md
-§3.2](PROPOSAL.md#32-schema-layers)) declares the intended system
+`system.yaml` (Layer 2 schema, [ROSGRAPH.md
+§3.2](ROSGRAPH.md#32-schema-layers)) declares the intended system
 composition — which nodes, which namespaces, which parameter overrides,
 which remappings. `rosgraph lint` validates the composed graph:
 
-- **Type mismatches** across package boundaries (Node A publishes
-  `Twist`, Node B subscribes expecting `TwistStamped`)
+- **Type mismatches** across package boundaries
 - **QoS incompatibilities** between a vendor's publisher and your
   subscriber
 - **Disconnected subgraphs** — nodes that should be connected but
   aren't due to a namespace or remapping error
-- **Invalid remappings** — remaps pointing to nonexistent topics
 
 If a vendor doesn't ship `interface.yaml`, use `rosgraph discover`
-([PROPOSAL.md
-§3.10](PROPOSAL.md#310-rosgraph-discover--runtime-to-spec-generation))
-to generate one from a running instance of the vendor's node. The
-discovered spec becomes your integration contract.
+([ROSGRAPH.md
+§3.10](ROSGRAPH.md#310-rosgraph-discover--runtime-to-spec-generation))
+to generate one from a running instance of the vendor's node.
 
 ### How does rosgraph fit into CI?
 
@@ -218,178 +181,26 @@ rosgraph is CI-first by design (Design Principle 8):
 # GitHub Actions example
 - name: Lint graph
   run: rosgraph lint . --output-format sarif --new-only --base main
-  # SARIF output → GitHub Security tab, PR annotations
 
 - name: Check breaking changes
   run: rosgraph breaking --base main
-  # Exit code 1 if breaking changes detected
 
 - name: Run contract tests
   run: rosgraph test
-  # Schema-driven interface conformance tests
 ```
 
 Output formats: `text`, `json`, `sarif` (GitHub Security tab),
-`github` (Actions annotations), `junit` (test reports). All
-configurable via `rosgraph.toml` or `--output-format`. See
-[PROPOSAL.md §3.11](PROPOSAL.md#311-configuration).
-
-For brownfield adoption, `--add-noqa` generates inline suppression
-comments for all existing issues, creating a clean baseline. You
-don't get 500 warnings on your first PR.
-
-### What about the colcon build workflow?
-
-`colcon-rosgraph` (Phase 2) is a thin colcon verb plugin that delegates
-to the standalone `rosgraph` binary. It adds `colcon lint`,
-`colcon docs`, `colcon discover`, and `colcon breaking` — iterating
-packages in dependency order with parallel execution. See [PROPOSAL.md
-§3.15](PROPOSAL.md#315-colcon-integration).
-
-Phase 1 works standalone: `rosgraph lint .` in any directory. No
-colcon dependency required.
-
-### What about fleet-level monitoring?
-
-`rosgraph monitor` runs per-robot. For fleet-scale observability:
-
-- The Prometheus `/metrics` exporter (M7) enables standard Grafana
-  dashboards aggregated across the fleet.
-- The `/rosgraph/diff` topic on each robot can be bridged to a
-  central system for aggregated drift analysis.
-- The architecture uses standard observability patterns (Prometheus,
-  structured logs, `/diagnostics`) rather than inventing fleet-specific
-  infrastructure.
-
-Runtime performance targets: reconciliation < 500ms for 200 nodes,
-< 50MB memory, < 5% CPU at steady state. See [PROPOSAL.md
-§3.14](PROPOSAL.md#314-scale--fleet-considerations).
-
-### Can we enforce org-specific conventions?
-
-Yes. `rosgraph.toml` supports per-package rule overrides, custom
-naming patterns, and rule selection. The Spectral-inspired YAML-native
-rule system ([PROPOSAL.md
-§10.3](PROPOSAL.md#103-static-analysis-architecture)) means a
-robotics engineer can write custom rules without knowing Rust or C++.
-
-### Does rosgraph handle launch file complexity?
-
-Three strategies, phased by tractability ([PROPOSAL.md
-§3.5](PROPOSAL.md#35-rosgraph-lint--static-analysis)):
-
-1. **YAML launch files** — fully parseable, Phase 1
-2. **`system.yaml`** — static composition schema, fully analyzable,
-   Phase 1
-3. **Python launch AST** — pattern matching for `Node()`,
-   `LaunchConfiguration()`, etc., Phase 2
-
-Python launch files with complex conditionals, loops, or dynamically
-computed node sets can't be fully statically analyzed. `system.yaml`
-is the escape hatch for systems that need full analyzability.
-
----
-
-## 4. Safety-Critical Engineer
-
-### Does rosgraph help with certification?
-
-rosgraph is not a safety tool — it's a development and verification
-tool that produces artifacts useful in safety cases. See [PROPOSAL.md
-§11](PROPOSAL.md#11-safety--certification) for the full mapping.
-
-Key artifacts:
-
-| rosgraph artifact | Evidence type |
-|---|---|
-| `interface.yaml` | Software architecture description |
-| `rosgraph lint` SARIF output | Static analysis results |
-| `rosgraph monitor` logs | Runtime verification evidence |
-| `rosgraph test` results | Interface conformance evidence |
-| `rosgraph breaking` output | Change impact analysis |
-
-### Which safety standards does this map to?
-
-IEC 61508 (general functional safety), ISO 26262 (automotive),
-IEC 62304 (medical), DO-178C (aerospace), ISO 13482 (service robots),
-and ISO 21448 / SOTIF. See [PROPOSAL.md
-§11.1](PROPOSAL.md#111-relevant-standards) for how rosgraph maps to
-each.
-
-### What about behavioral properties?
-
-Phase 1-2 covers structural properties: type matches, QoS
-compatibility, graph connectivity. This is a necessary precondition
-for behavioral safety — you can't reason about message timing if the
-messages aren't connected correctly.
-
-Behavioral analysis (Phase 3+) adds temporal and causal properties,
-inspired by HAROS HPL:
-
-```
-globally: /emergency_stop causes /motor_disable within 100ms
-globally: /heartbeat absent_for 500ms causes /safe_stop
-```
-
-See [PROPOSAL.md §11.4](PROPOSAL.md#114-behavioral-properties-future).
-
-### Are monitor alert thresholds configurable?
-
-Yes. The defaults (10s for `NodeMissing`, 30s for `UnexpectedNode`)
-are tuned for general robotics. Safety-critical deployments override
-them via `rosgraph.toml`:
-
-```toml
-[monitor.alerts]
-NodeMissing = { grace_period_ms = 1000, severity = "critical" }
-TopicMissing = { grace_period_ms = 500, severity = "critical" }
-```
-
-See [PROPOSAL.md §11.3](PROPOSAL.md#113-configurable-safety-levels).
-
-### Are there safety-specific lint rules?
-
-Planned for Phase 2-3:
-
-| Rule | Description |
-|---|---|
-| `SAF001` | Critical subscriber has < N publishers (no redundancy) |
-| `SAF002` | Single point of failure in graph topology |
-| `SAF003` | Safety-critical node is not lifecycle-managed |
-| `TF001` | Declared `frame_id` not published by any node |
-| `TF002` | Broken frame chain (no transform path) |
-
-The analyzer architecture supports adding these without changes.
-See [PROPOSAL.md §11.5](PROPOSAL.md#115-safety-relevant-lint-rules-future).
-
-### What about determinism and real-time guarantees?
-
-`rosgraph monitor` is an observation tool, not a safety-critical
-component. It runs in its own process, does not interfere with the
-monitored system, and its failure does not affect the system under
-observation. It is not designed to be real-time safe.
-
-For hard real-time requirements, the monitor's output (Prometheus
-metrics, diagnostics topics) can be consumed by a separate real-time
-safety monitor. rosgraph provides the graph model; the real-time
-enforcement layer is a separate concern.
-
-### What about audit trails?
-
-`rosgraph lint` produces SARIF output with timestamps, tool version,
-rule versions, and results. This can be stored as CI artifacts for
-audit purposes. A dedicated audit log format for `rosgraph monitor`
-(continuous verification evidence) is not in Phase 1 but the
-structured output (JSON, SARIF) makes it straightforward to add.
+`github` (Actions annotations), `junit` (test reports). See
+[ROSGRAPH.md §3.11](ROSGRAPH.md#311-configuration).
 
 ---
 
-## 5. MoveIt / nav2 / Popular Module User
+## 3. MoveIt / nav2 / Popular Module User
 
 ### Does rosgraph work with nav2's plugin system?
 
-Yes, via the mixin system ([PROPOSAL.md
-§3.2](PROPOSAL.md#32-schema-layers)). Plugins that inject interfaces
+Yes, via the mixin system ([ROSGRAPH.md
+§3.2](ROSGRAPH.md#32-schema-layers)). Plugins that inject interfaces
 into a host node are declared as mixins:
 
 ```yaml
@@ -398,222 +209,57 @@ node:
   name: follow_path
   package: nav2_controller
 
-parameters:
-  controller_plugin:
-    type: string
-    default_value: "dwb_core::DWBLocalPlanner"
-
 mixins:
-  - ref: dwb_core/dwb_local_planner   # brings in max_vel_x, etc.
-  - ref: nav2_costmap_2d/costmap       # brings in costmap params
+  - ref: dwb_core/dwb_local_planner
+  - ref: nav2_costmap_2d/costmap
 ```
 
 The host's effective interface = its own declaration + all mixin
-interfaces merged. This gives `rosgraph lint` and `rosgraph monitor`
-the complete picture.
-
-Mixins are Phase 2 (G15). Phase 1 works for nodes without plugins.
-
-### What happens when I switch plugins (e.g., DWB → MPPI)?
-
-You update the mixin reference in `interface.yaml`. The effective
-interface changes at build time, and `rosgraph generate` produces new
-scaffolding. This is a build-time concern — `rosgraph lint` validates
-the graph with the new plugin's interface.
-
-If the plugin is selected at runtime via parameter, this falls under
-"dynamic interfaces" (Design Principle 12) — rosgraph declares the
-static portion and `rosgraph monitor` flags unexpected interfaces.
-
-### Does rosgraph validate TF frames?
-
-Planned for Phase 2-3. `TF001` checks that declared `frame_id` values
-are published by some node in the graph. `TF002` checks that frame
-chains are connected (no broken transform paths). See [PROPOSAL.md
-§11.5](PROPOSAL.md#115-safety-relevant-lint-rules-future).
-
-TF is the #1 source of silent bugs in ROS 2 navigation and
-manipulation. This is high-value but requires the graph model to
-include TF publisher information, which depends on `interface.yaml`
-having a `frame_id` annotation.
+interfaces merged. Mixins are Phase 2 (G15). Phase 1 works for nodes
+without plugins.
 
 ### What about `generate_parameter_library` compatibility?
 
-Full compatibility is a non-negotiable design principle ([PROPOSAL.md
-§2, DP9](PROPOSAL.md#2-design-principles)). The `parameters:` section
+Full compatibility is a non-negotiable design principle ([ROSGRAPH.md
+§2, DP9](ROSGRAPH.md#2-design-principles)). The `parameters:` section
 of `interface.yaml` IS the `generate_parameter_library` format. A
 standalone gen_param_lib YAML file works as-is when placed in
 `interface.yaml`. rosgraph delegates to gen_param_lib at build time.
-See [PROPOSAL.md §9.2](PROPOSAL.md#92-tool-assessments).
-
-### Can rosgraph lint my existing launch files?
-
-Phase 1 supports YAML launch files (direct parse) and `system.yaml`
-(Layer 2 schema). Phase 2 adds Python launch file AST analysis for
-standard `launch_ros` patterns — `Node()`, `LaunchConfiguration()`,
-`DeclareLaunchArgument()`.
-
-Limitations: Python launch files that use conditionals, loops, or
-dynamically computed node sets cannot be fully statically analyzed.
-`system.yaml` is the escape hatch for systems that need full static
-analyzability. See [PROPOSAL.md
-§3.5](PROPOSAL.md#35-rosgraph-lint--static-analysis).
-
-### Does this work with Gazebo / Isaac Sim?
-
-Simulators expose ROS interfaces that look identical to real hardware.
-`rosgraph discover` can introspect a simulated system and generate
-`interface.yaml`. `rosgraph monitor` can verify that a simulated
-system matches the declared graph. `rosgraph lint` doesn't
-distinguish between real and simulated — it validates the graph model.
-
-### What about message type changes across ROS distros?
-
-`interface.yaml` references message types by name (e.g.,
-`geometry_msgs/msg/Twist`). Message type compatibility across distros
-is a ROS infrastructure concern, not a rosgraph concern. rosgraph
-validates that publishers and subscribers on the same topic agree on
-type — it doesn't validate that the type definition itself is
-compatible across distros.
-
-`rosgraph breaking` can detect when a type reference changes between
-versions of an `interface.yaml`.
+See [ROSGRAPH.md §9.2](ROSGRAPH.md#92-tool-assessments).
 
 ---
 
-## 6. The Skeptic
-
-### I write good tests. Why do I need another YAML file?
-
-Tests catch type mismatches and QoS issues at launch time — after you
-wait 30 seconds for the stack to start, watch it fail, read the logs,
-and figure out which of 40 nodes has the wrong type. Then you fix it,
-rebuild, relaunch, and wait again.
-
-`rosgraph lint` catches the same bugs in under 5 seconds, before
-launch, in CI, before anyone else has to debug it. It's the difference
-between "tests catch bugs" and "bugs never reach the test phase."
-
-### What's the overhead?
-
-Per node: one `interface.yaml` file (~15-30 lines). Most of it is
-information you're already specifying in code (topic names, message
-types, QoS settings, parameter names) — `interface.yaml` centralizes
-it.
-
-What you get back:
-- No pub/sub boilerplate (generated)
-- No parameter declaration boilerplate (generated via
-  `generate_parameter_library`)
-- Pre-launch graph validation
-- Runtime graph monitoring
-- Auto-generated API documentation
-
-The net line-count change is typically negative for nodes with
-parameters.
-
-### What if rosgraph can't express what I need?
-
-Escape hatches:
-- **`# rosgraph: noqa: TOP001`** — suppress specific lint rules per
-  line.
-- **Per-package ignores** — exclude entire packages from specific
-  rules via `rosgraph.toml`.
-- **Undeclared interfaces** — if your code creates publishers that
-  aren't in `interface.yaml`, the code still works. `rosgraph monitor`
-  flags them as `UnexpectedTopic` (a warning, not an error).
-- **Composition pattern** — generated code holds a `rclcpp::Node`
-  (has-a), not inherits from it. You always have access to the
-  underlying node for anything the schema can't express.
-
-See [PROPOSAL.md §12](PROPOSAL.md#12-scope--limitations) for the full
-limitations discussion.
-
-### Does code generation add runtime overhead?
-
-The composition pattern (has-a Node) adds one level of indirection
-compared to direct inheritance. This is a pointer dereference — single
-nanoseconds. The generated pub/sub wrappers are thin forwarding calls.
-No virtual dispatch is added that wouldn't already exist in the ROS
-client library.
+## 4. AI-Assisted Developer
 
-The parameter validation code (from `generate_parameter_library`) runs
-at parameter-set time, not in the hot path.
-
-### What happens when only my package has an `interface.yaml`?
-
-You still get:
-- **Code generation** — less boilerplate in your node
-- **Parameter validation** — runtime type and range checking
-- **Self-documentation** — your node's API is machine-readable
-
-Cross-package value (type mismatch detection, QoS compatibility
-checking, contract testing) grows with adoption. `rosgraph discover`
-lets you generate specs for neighboring packages from a running
-system, bootstrapping the cross-package graph incrementally.
-
-### This proposal has 51 features. Is this realistic?
-
-Phase 1 ([PROPOSAL.md §4](PROPOSAL.md#4-phasing)) is the commitment:
-~12 features covering core schema, basic code generation, and
-highest-value lint and monitor rules. Later phases are contingent on
-adoption.
-
-The tool builds on existing work — cake for code generation,
-`generate_parameter_library` for parameters, `graph-monitor` message
-definitions for runtime. Phase 1 is stabilizing and unifying existing
-pieces, not building from scratch.
-
-### Won't the spec just drift from reality like NoDL?
-
-NoDL died because it was a pure description format — no code
-generation. Maintaining a spec that doesn't produce anything is
-thankless work.
+### How does rosgraph work with AI coding tools?
 
-`interface.yaml` generates code. If you change the spec, the generated
-code changes. If you change the code without changing the spec,
-`rosgraph monitor` flags the discrepancy at runtime. The two-way
-binding (codegen + runtime monitoring) is what prevents the drift
-that killed NoDL.
+`interface.yaml` is a machine-readable contract — exactly what LLMs
+are good at consuming and generating. The `InterfaceDescriptor` IR
+([ROSGRAPH.md §3.3](ROSGRAPH.md#33-the-interfacedescriptor-ir)) is a
+JSON blob containing a node's complete API: topics, types, QoS,
+parameters, lifecycle state. An AI agent reads this to understand what
+a node does, generate implementation code, write tests, or suggest
+fixes — without parsing C++ or Python source.
 
-The honest limitation: business logic is hand-written. If a developer
-adds an undeclared publisher inside a callback, `rosgraph lint` won't
-catch it at build time. `rosgraph monitor` catches it at runtime as
-`UnexpectedTopic`. See [PROPOSAL.md
-§12](PROPOSAL.md#12-scope--limitations).
+See [ROSGRAPH.md §3.13](ROSGRAPH.md#313-ai--tooling-integration) for
+the full AI integration design.
 
-### When should I NOT use rosgraph?
+### Can I use `rosgraph generate` as an agent tool?
 
-- **Quick prototyping** — single throwaway node, not worth the file.
-- **Single-node packages** — minimal lint value, though codegen may
-  still save boilerplate.
-- **Highly dynamic interfaces** — nodes that create/destroy publishers
-  at runtime based on conditions can't be fully declared.
+Yes. An AI agent writing a ROS node can:
+1. Generate `interface.yaml` from a natural language description
+2. Run `rosgraph generate .` as a tool call to get type-safe
+   scaffolding
+3. Write only the business logic into the generated skeleton
+4. Run `rosgraph lint .` to verify the graph is correct
 
-See [PROPOSAL.md §12, "When Not to Use
-rosgraph"](PROPOSAL.md#when-not-to-use-rosgraph).
+This avoids the common failure mode of LLMs hallucinating ROS
+boilerplate (wrong QoS defaults, missing component registration,
+incorrect parameter declarations).
 
 ---
 
-## 7. Package Maintainer / ROS Governance
-
-### What does rosgraph mean for my package?
-
-If you maintain a ROS 2 package, `interface.yaml` is a machine-readable
-contract for your node's public API — topics, services, actions,
-parameters, QoS. It replaces the informal contract currently scattered
-across READMEs, launch file comments, and source code.
-
-For consumers of your package, this means:
-- **API discoverability.** `rosgraph docs` auto-generates browsable API
-  reference from your `interface.yaml`. No more stale READMEs.
-- **Breaking change visibility.** `rosgraph breaking` classifies
-  interface changes as breaking/dangerous/safe, giving downstream users
-  clear upgrade guidance. See [PROPOSAL.md
-  §3.9](PROPOSAL.md#39-rosgraph-breaking--breaking-change-detection).
-- **Contract testing.** Downstream packages can run `rosgraph test`
-  against your declared interface to verify compatibility. See
-  [PROPOSAL.md §3.7](PROPOSAL.md#37-rosgraph-test--contract-testing).
+## 5. Package Maintainer / ROS Governance
 
 ### Do I have to adopt rosgraph to be compatible with it?
 
@@ -624,48 +270,26 @@ need to ship `interface.yaml` for others to benefit — though shipping
 one is much better, since discovered specs require human review and may
 miss QoS details.
 
-### How does this affect my release process?
-
-`rosgraph breaking` runs in CI comparing the current `interface.yaml`
-against the previous release. Breaking changes block the merge unless
-explicitly acknowledged. This is opt-in per package via `rosgraph.toml`
-and maps to semantic versioning: breaking = major, dangerous = minor,
-safe = patch. See [PROPOSAL.md
-§3.14](PROPOSAL.md#314-scale--fleet-considerations).
-
-### What about packages with plugin systems?
-
-If your package exposes a plugin API (like nav2's controller plugins),
-the mixin system (Phase 2, G15) lets plugin authors declare the
-interfaces they inject into the host node. The host's effective
-interface is the merge of its own declaration plus all mixin fragments.
-See [PROPOSAL.md §3.2](PROPOSAL.md#32-schema-layers).
-
-Until mixins ship in Phase 2, the host node's `interface.yaml` covers
-its own direct interfaces. Plugins that add extra topics/parameters
-are flagged by `rosgraph monitor` as unexpected — visible but not
-validated.
-
 ### What's the adoption path toward `ros_core`?
 
-Deliberately incremental ([PROPOSAL.md §4, "Adoption
-Path"](PROPOSAL.md#adoption-path)):
+Deliberately incremental ([ROSGRAPH.md §4, "Adoption
+Path"](ROSGRAPH.md#adoption-path)):
 
 1. **`ros-tooling` organization** — institutional backing, CI
-   infrastructure, release process. graph-monitor already lives here.
+   infrastructure, release process.
 2. **REP for `interface.yaml` schema** — formalizes the declaration
    format as a community standard, independent of the rosgraph tool.
 3. **docs.ros.org tutorial integration** — if "write your first node"
    uses `interface.yaml`, every new ROS developer learns it from day
-   one. This is the highest-leverage adoption path.
+   one.
 4. **`ros_core` proposal** — after demonstrated adoption across
    multiple distros.
 
 ### Why not extend existing tools instead?
 
 Each existing tool covers one capability but none covers the full
-scope. The gap analysis ([PROPOSAL.md
-§9.3](PROPOSAL.md#93-gap-analysis)) shows five major gaps: graph diff,
+scope. The gap analysis ([ROSGRAPH.md
+§9.3](ROSGRAPH.md#93-gap-analysis)) shows five major gaps: graph diff,
 graph linting, QoS static analysis, behavioral properties, and CI graph
 validation. No single existing tool can be extended to fill all five.
 
@@ -675,144 +299,38 @@ rosgraph builds on existing work where possible:
 - cake's design decisions for code generation (validated)
 - HAROS's metamodel for the graph model (adapted)
 
-### What's the maintenance burden?
-
-Phase 1 is ~12 features covering core schema, basic codegen, and
-highest-value lint/monitor rules. The design minimizes ongoing
-maintenance:
-
-- **Schema versioning** (G14) — `schema_version` field with migration
-  tooling prevents breaking changes to `interface.yaml` format.
-- **IR-based plugin protocol** — code generation plugins are standalone
-  executables, independently maintained.
-- **Analyzer DAG** — lint rules are isolated, independently testable
-  values (not subclasses). Adding or removing a rule doesn't affect
-  others.
-
-The risk factor: this is a new tool, not an extension of something with
-existing momentum. It requires sustained contributor commitment.
-
-### How does this interact with the ROS 2 type system?
-
-rosgraph references existing `.msg`, `.srv`, and `.action` types — it
-doesn't replace them (Design Principle 9). `interface.yaml` declares
-which types a node uses; `rosidl` still defines the types themselves.
-
-The graph model ([PROPOSAL.md §3.1](PROPOSAL.md#31-the-graph-model))
-includes a `MessageTypeDB` that resolves type references to their
-definitions for compatibility checking. This uses the existing
-`rosidl` output — rosgraph doesn't parse `.msg` files directly.
-
-### What about governance and community standards?
-
-The REP process is the standard mechanism for formalizing ROS community
-standards. A REP for the `interface.yaml` schema would:
-
-- Define the YAML schema specification independent of the rosgraph tool
-- Allow alternative implementations (someone could build a different
-  tool that consumes the same schema)
-- Provide a formal review process for schema changes
-- Signal community endorsement
-
-The REP is Step 2 of the adoption path — after the tool has proven
-itself in `ros-tooling` with real users.
-
-### What's the risk if this doesn't get adopted?
-
-The worst case: rosgraph becomes another single-maintainer tool in the
-ecosystem (like cake and breadcrumb today). The mitigation strategy:
-
-- **`ros-tooling` hosting** — institutional backing reduces bus factor
-- **REP-based schema** — the schema outlives the tool if it becomes a
-  standard
-- **`generate_parameter_library` compatibility** — the parameters
-  portion works with the most mature tool in the space, regardless of
-  rosgraph's fate
-- **Standalone value** — even without ecosystem adoption, a single team
-  gets code generation and parameter validation from day one
-
 ---
 
-## 8. Educator / University Researcher
+## 6. Educator / University Researcher
 
 ### Can I use rosgraph for teaching ROS 2?
 
-Yes, and this is one of the highest-leverage adoption paths. The Quick
-Start ([PROPOSAL.md §1](PROPOSAL.md#quick-start-what-it-looks-like))
-shows a complete workflow in 3 commands:
-
-```bash
-rosgraph generate .   # generates node scaffolding
-rosgraph lint .       # checks for issues
-rosgraph monitor      # watches the running system
-```
-
-For teaching, `interface.yaml` forces students to think about their
-node's API before writing implementation code — topics, types, QoS,
-parameters. This is better pedagogy than the current approach of
-copy-pasting publisher boilerplate and tweaking it.
-
-### Does this lower the barrier for students?
-
-Significantly. A student writes ~15 lines of YAML declaring what their
-node does, runs `rosgraph generate`, and gets a working scaffold with
-type-safe publishers, subscribers, and validated parameters. They write
-only the business logic. No boilerplate, no silent type mismatches, no
-mysterious QoS failures.
-
-Error messages are designed to be helpful — rule codes, file locations,
-clear descriptions of what's wrong and how to fix it. See [PROPOSAL.md
-§10.3](PROPOSAL.md#103-static-analysis-architecture).
+Yes. The Quick Start
+([ROSGRAPH.md §1](ROSGRAPH.md#quick-start-what-it-looks-like))
+shows a complete workflow in three commands: `rosgraph generate`,
+`rosgraph lint`, and `rosgraph monitor`. For teaching,
+`interface.yaml` forces students to think about their node's API
+(topics, types, QoS, parameters) before writing implementation code.
+This is better pedagogy than copy-pasting and tweaking publisher boilerplate.
 
 ### How does rosgraph relate to HAROS?
 
-HAROS ([PROPOSAL.md §10.6](PROPOSAL.md#106-ros-domain-prior-art-haros))
+HAROS ([ROSGRAPH.md §10.6](ROSGRAPH.md#106-ros-domain-prior-art-haros))
 was the prior art for graph analysis in ROS — built at the University
 of Minho (2016–2021). rosgraph borrows HAROS's metamodel and HPL
 property language concepts, but differs fundamentally:
 
 - **HAROS extracted interfaces from source code.** rosgraph uses
-  explicit declarations (`interface.yaml`). Declarations are simpler,
-  more reliable, and enable code generation.
+  explicit declarations (`interface.yaml`).
 - **HAROS was ROS 1 only.** rosgraph is built for ROS 2 concepts:
   QoS, lifecycle, components, actions, DDS discovery.
 - **HAROS died because extraction broke.** catkin → ament, rospack →
   colcon, XML launch → Python launch. Declaration-based tools don't
   break when the build system changes.
 
-### Can I use rosgraph for research on ROS system verification?
-
-The graph model ([PROPOSAL.md §3.1](PROPOSAL.md#31-the-graph-model))
-is a structured representation of the ROS computation graph — nodes,
-topics, services, actions, parameters, QoS, connections. It's
-exportable as JSON via the `InterfaceDescriptor` IR ([PROPOSAL.md
-§3.3](PROPOSAL.md#33-the-interfacedescriptor-ir)).
-
-Research opportunities:
-- **Formal verification.** The graph model is a natural input for model
-  checkers. Behavioral properties (Phase 3+, [PROPOSAL.md
-  §11.4](PROPOSAL.md#114-behavioral-properties-future)) enable temporal
-  logic specifications.
-- **Static analysis.** The analyzer DAG architecture ([PROPOSAL.md
-  §3.5](PROPOSAL.md#35-rosgraph-lint--static-analysis)) supports custom
-  analysis passes without modifying core code.
-- **Runtime monitoring.** The declared-vs-observed diff ([PROPOSAL.md
-  §3.6](PROPOSAL.md#36-rosgraph-monitor--runtime-reconciliation)) is a
-  rich data source for anomaly detection research.
-- **ROS ecosystem studies.** Interface coverage, graph topology
-  patterns, common QoS configurations — all extractable from
-  `interface.yaml` files across the ecosystem.
-
-### What about publishing results that use rosgraph?
-
-The tool is open-source (planned for `ros-tooling` organization). The
-SARIF and JSON output formats produce structured, reproducible results
-suitable for academic publication. The graph model provides a formal
-vocabulary for describing ROS system architectures.
-
 ---
 
-## 9. Embedded / Resource-Constrained Developer
+## 7. Embedded / Resource-Constrained Developer
 
 ### Does rosgraph add runtime overhead to my nodes?
 
@@ -822,8 +340,8 @@ generated pub/sub wrappers are thin forwarding calls. No virtual
 dispatch is added beyond what the ROS client library already uses.
 
 Parameter validation (via `generate_parameter_library`) runs at
-parameter-set time, not in the hot path. See [PROPOSAL.md §3.4,
-"Design decisions"](PROPOSAL.md#34-rosgraph-generate--code-generation).
+parameter-set time, not in the hot path. See [ROSGRAPH.md §3.4,
+"Design decisions"](ROSGRAPH.md#34-rosgraph-generate--code-generation).
 
 ### Does `rosgraph monitor` run on the robot?
 
@@ -832,43 +350,68 @@ doesn't instrument or modify your nodes. If your platform can't spare
 the resources, don't run it. You still get full value from build-time
 tools (`rosgraph generate`, `rosgraph lint`).
 
-Runtime targets for `rosgraph monitor` ([PROPOSAL.md
-§3.14](PROPOSAL.md#314-scale--fleet-considerations)):
+Runtime targets ([ROSGRAPH.md
+§3.14](ROSGRAPH.md#314-scale--fleet-considerations)):
 - Memory: < 50MB resident
 - CPU: < 5% of one core at steady-state (5s scrape interval)
 - No additional DDS traffic beyond standard discovery
 
-For very constrained platforms, run `rosgraph monitor` off-board
-(e.g., on a companion computer) observing the same DDS domain.
+---
+
+## 8. The Skeptic
+
+### This proposal has 51 features. Is this realistic?
+
+Phase 1 ([ROSGRAPH.md §4](ROSGRAPH.md#4-phasing)) is the commitment:
+~12 features covering core schema, basic code generation, and
+highest-value lint and monitor rules. Later phases are contingent on
+adoption.
+
+The tool builds on existing work — cake for code generation,
+`generate_parameter_library` for parameters, `graph-monitor` message
+definitions for runtime. Phase 1 is stabilizing and unifying existing
+pieces, not building from scratch.
+
+### When should I NOT use rosgraph?
 
-### Does rosgraph work with micro-ROS?
+- **Quick prototyping** — single throwaway node, not worth the file.
+- **Single-node packages** — minimal lint value, though codegen may
+  still save boilerplate.
+- **Highly dynamic interfaces** — nodes that create/destroy publishers
+  at runtime based on conditions can't be fully declared.
 
-micro-ROS nodes communicate via the standard DDS/XRCE-DDS bridge.
-`rosgraph discover` and `rosgraph monitor` observe them through the
-bridge like any other node. `interface.yaml` declarations work for
-micro-ROS nodes — the schema is language-agnostic.
+See [ROSGRAPH.md §12, "When Not to Use
+rosgraph"](ROSGRAPH.md#when-not-to-use-rosgraph).
 
-Code generation for micro-ROS C is not in Phase 1. The IR-based plugin
-architecture ([PROPOSAL.md
-§3.3](PROPOSAL.md#33-the-interfacedescriptor-ir)) supports adding a
-micro-ROS code generation plugin without changes to the core tool.
+---
 
-### What about real-time constraints?
+## 9. Safety-Critical Engineer
 
-`rosgraph monitor` is not real-time safe — it's an observation tool
-running in its own process. It does not interfere with the monitored
-system, and its failure does not affect the system under observation.
+### Does rosgraph help with certification?
 
-For hard real-time requirements, the monitor's Prometheus metrics and
-diagnostics topics can be consumed by a separate real-time safety
-monitor. rosgraph provides the graph model; real-time enforcement is a
-separate concern. See [PROPOSAL.md
-§11](PROPOSAL.md#11-safety--certification).
+rosgraph is not a safety tool — it's a development and verification
+tool that produces artifacts useful in safety cases. See [ROSGRAPH.md
+§11](ROSGRAPH.md#11-safety--certification).
 
-### Does the build toolchain add cross-compilation complexity?
+Key artifacts:
+
+| rosgraph artifact | Evidence type |
+|---|---|
+| `interface.yaml` | Software architecture description |
+| `rosgraph lint` SARIF output | Static analysis results |
+| `rosgraph monitor` logs | Runtime verification evidence |
+| `rosgraph test` results | Interface conformance evidence |
+| `rosgraph breaking` output | Change impact analysis |
+
+### What about behavioral properties?
+
+Phases 1-2 cover structural properties: type matches, QoS
+compatibility, graph connectivity. Behavioral analysis (Phase 3+) adds
+temporal and causal properties, inspired by HAROS HPL:
+
+```
+globally: /emergency_stop causes /motor_disable within 100ms
+globally: /heartbeat absent_for 500ms causes /safe_stop
+```
 
-`rosgraph generate` runs at build time on the host, producing standard
-C++ and Python source files. These are compiled by the normal
-cross-compilation toolchain (`colcon build --cmake-args
--DCMAKE_TOOLCHAIN_FILE=...`). rosgraph itself doesn't need to run on
-the target — it's a host-side tool, like `cmake` or `protoc`.
+See [ROSGRAPH.md §11.4](ROSGRAPH.md#114-behavioral-properties-future).
diff --git a/docs/ROSGRAPH.md b/docs/ROSGRAPH.md
index 8469ec5..24e7308 100644
--- a/docs/ROSGRAPH.md
+++ b/docs/ROSGRAPH.md
@@ -20,7 +20,6 @@
 10. [Prior Art](#10-prior-art)
 11. [Safety & Certification](#11-safety--certification)
 12. [Scope & Limitations](#12-scope--limitations)
-13. [Resolved Questions](#13-resolved-questions)
 
 ---
 
@@ -353,6 +352,18 @@ connections:                    # Explicit wiring (optional, for validation)
     to: object_detector/~/input_cloud
 ```
 
+**Toward a single source of truth.** `system.yaml` overlaps
+significantly with YAML launch files and parameter config files — all
+three describe which nodes run, with what parameters, and with what
+remappings. The long-term direction is convergence: `system.yaml`
+becomes the graph spec, the parameter config, *and* the launch
+description. `rosgraph generate` (or a thin `rosgraph launch` shim)
+emits a runnable launch file from the same `system.yaml` that
+`rosgraph lint` validates. One file, no drift between what you analyze
+and what you run. For projects with multiple deployment configurations
+(sim, real, test), each gets its own `system.yaml` — replacing both
+the per-config launch file and the per-config parameter YAML.
+
 **Mixins — Composable Interface Fragments** (G15, Phase 2)
 
 Plugins that inject interfaces into a host node (e.g., nav2 controller
@@ -1702,23 +1713,3 @@ review and refinement.
 — the minimum viable tool that delivers value. Later phases are
 contingent on adoption and contributor capacity.
 
----
-
-## 13. Resolved Questions
-
-The following questions were raised during the proposal drafting process
-and have been resolved. Answers are integrated into the relevant
-sections of this document.
-
-| # | Question | Resolution | Section |
-|---|----------|------------|---------|
-| 1 | Dynamic interfaces | Out of scope — rosgraph covers declared interfaces only (Design Principle 12). Undeclared runtime interfaces are flagged as `UnexpectedTopic` by monitor. | §2 |
-| 2 | Launch substitution evaluation | Three-path loader strategy: YAML launch (direct parse), `system.yaml` (static), Python launch AST (pattern matching). | §3.5 |
-| 3 | Behavioural properties | Structural first (Phase 1–2), behavioural later (Phase 3+) if adoption warrants it (Design Principle 13). | §2 |
-| 4 | `generate_parameter_library` unification | Keep as standalone, maintain schema compatibility. rosgraph delegates to gen_param_lib at build time. | §9.2 |
-| 5 | Multi-workspace analysis | Per-package fact caching via installed `interface.yaml` files. Phase 2 concern. | §3.12 |
-| 6 | Launch file extraction without clingwrap | Partial AST extraction for standard `launch_ros` patterns, with `system.yaml` as fully-static alternative. | §3.5 |
-| 7 | Relationship to graph-monitor | New implementation. Adopt graph-monitor's message definitions, reimplement scraping + reconciliation. | §3.6 |
-| 8 | Mixin pattern | `mixins:` section referencing interface fragments. Host's effective interface = own declaration + all mixins merged. | §3.2 |
-| 9 | Adoption path | `ros-tooling` org → REP for schema → docs.ros.org tutorials → `ros_core` (long-term). | §4 |
-| 10 | Declaration scope | Structural (node interfaces) only for Phase 1–2. Behavioural scope deferred to Phase 3+. | §2 |

From fc02a8ce0071a348d5a35648f82428b7fd9d2bf4 Mon Sep 17 00:00:00 2001
From: Luke Sy <sylukewicent@gmail.com>
Date: Tue, 10 Mar 2026 03:14:37 +1100
Subject: [PATCH 3/5] Address PR feedback: summarise ROSGRAPH, update MANIFESTO

- MANIFESTO: add undocumented interfaces problem, update wording
- ROSGRAPH: condense from 1715 to 133 lines
  - Remove sections 3-12 (architecture, phasing, language choice, etc.)
  - Remove design principles (redundant with key insights)
  - Replace CLI subcommand tree with prioritised components
  - Add codegen requirements (plugin arch, distro-installable)
  - Condense gap table to bullet list
  - Clarify discovery vs monitoring distinction

Signed-off-by: Luke Sy <sylukewicent@gmail.com>
---
 docs/MANIFESTO.md |    4 +-
 docs/ROSGRAPH.md  | 1705 ++-------------------------------------------
 2 files changed, 64 insertions(+), 1645 deletions(-)

diff --git a/docs/MANIFESTO.md b/docs/MANIFESTO.md
index 70faf6b..e22ed9a 100644
--- a/docs/MANIFESTO.md
+++ b/docs/MANIFESTO.md
@@ -4,9 +4,11 @@
 
 Robotics engineers spend too much time on ROS plumbing — writing boilerplate, debugging invisible wiring, and keeping launch files in sync with code — instead of building their application.
 
+The main interfaces of ROS systems (topics, parameters, services, actions) are undocumented by default. As systems grow larger they become harder to reason about, and the lack of well-defined interface contracts blocks automated tooling from helping.
+
 ## What
 
-A declarative, observable ROS graph. Engineers declare what their system should be; tooling generates the code and verifies the running system matches the spec.
+A declarative, observable ROS graph. Engineers declare what their system should be; tooling generates the code and entities as needed, and verifies the running system matches the spec.
 
 ## How
 
diff --git a/docs/ROSGRAPH.md b/docs/ROSGRAPH.md
index 24e7308..9f452e1 100644
--- a/docs/ROSGRAPH.md
+++ b/docs/ROSGRAPH.md
@@ -6,45 +6,20 @@
 
 ---
 
-## Table of Contents
+## Executive Summary
 
-1. [Executive Summary](#1-executive-summary)
-2. [Design Principles](#2-design-principles)
-3. [Architecture](#3-architecture)
-4. [Phasing](#4-phasing)
-5. [Language Choice](#5-language-choice)
-6. [Feature List](#6-feature-list)
-7. [Lint Rule Codes](#7-lint-rule-codes)
-8. [Monitor Alert Rules](#8-monitor-alert-rules)
-9. [Existing ROS 2 Ecosystem](#9-existing-ros-2-ecosystem)
-10. [Prior Art](#10-prior-art)
-11. [Safety & Certification](#11-safety--certification)
-12. [Scope & Limitations](#12-scope--limitations)
+ROS 2 has no standard schema for declaring node interfaces and no
+production-ready tooling for verifying that a running system matches its
+declared architecture. The ecosystem is fragmented across single-purpose
+tools with overlapping scope and bus factors of one.
 
----
-
-## 1. Executive Summary
-
-ROS 2 has no production-ready tool for verifying that a running system
-matches its declared architecture, no standard schema for declaring node
-interfaces, and no unified CLI for graph analysis. The ecosystem is
-fragmented across single-purpose tools with overlapping scope and bus
-factors of one.
+Key gaps, where no mature tooling exists today:
 
-| Category | Capability | Current tool | Status |
-|---|---|---|---|
-| **Schema** | Node interface declaration | cake / nodl / gen_param_lib | cake early; nodl dead; gpl params-only |
-| **Codegen** | Static graph from launch files | breadcrumb + clingwrap | Early-stage, solo dev |
-| **Runtime** | Runtime graph monitoring | graph-monitor | Mid-stage, institutional |
-| **Runtime** | Runtime tracing | ros2_tracing | Mature, production |
-| **Runtime** | Latency analysis | CARET | Mature, Tier IV |
-| **Runtime** | Graph visualisation | Foxglove, Dear RosNodeViewer | Mature but live-only |
-| **Runtime** | **Graph diff (expected vs. actual)** | **Nothing** | **Major gap** |
-| **Static** | **Graph linting (pre-launch)** | **Nothing** | **Major gap** |
-| **Static** | **QoS static analysis** | breadcrumb (partial) | Early-stage |
-| **Static** | **CI graph validation** | **Nothing** | **Major gap** |
-| **Docs** | **Node API documentation** | **Nothing** (hand-written only) | **Major gap** |
-| — | **Behavioural properties** | **Nothing** (HPL was ROS 1) | **Major gap** |
+- **Graph diff** (expected vs. actual)
+- **Graph linting** (pre-launch static analysis)
+- **CI graph validation**
+- **Node API documentation** (hand-written only today)
+- **QoS static analysis** (breadcrumb is early-stage/partial)
 
 ### The Problem, Concretely
 
@@ -62,23 +37,46 @@ Today in ROS 2:
   `colcon build` succeeds. The system launches. The parameter silently
   takes its default value.
 
-These are real, common bugs in production ROS 2 systems. rosgraph
-catches all four — the first two at build time (`rosgraph lint`), the
-third at runtime (`rosgraph monitor`), the fourth at lint time.
+These are real, common bugs in production ROS 2 systems.
 
-This document proposes **`rosgraph`** — a single tool with subcommands
-covering the four goals of the ROSGraph Working Group:
+### Components
 
-```
-rosgraph
-├── rosgraph generate    (Goal 2: spec → code)
-├── rosgraph lint        (Goal 4: static graph analysis)
-├── rosgraph monitor     (Goal 3: runtime reconciliation)
-├── rosgraph test        (Goal 3: contract testing)
-├── rosgraph docs        (documentation generation)
-├── rosgraph breaking    (breaking change detection)
-└── rosgraph discover    (runtime → spec, brownfield adoption)
-```
+rosgraph is composed of the following components, ordered by priority.
+These components may be wrapped by user interfaces (e.g. a CLI), but
+are designed as independent, composable libraries.
+
+1. **Node Spec (NoDL)** — a formal, machine-readable schema for
+   declaring node interfaces (`interface.yaml`). This is the core of
+   the project; everything else builds on it.
+
+2. **Code Generation** — `nodl-generator` takes NoDL input and outputs
+   code for ROS client libraries (rclcpp, rclpy, rclrs). Must be
+   installable as part of a ROS distro (`apt-get install`). Requires a
+   plugin/sidechannel architecture so additional client libraries
+   (e.g. rcljava) can be supported without modifying the core generator.
+
+3. **Runtime Discovery** — introspect a running system and produce NoDL
+   specs from observed nodes. Enables brownfield adoption: point at an
+   existing system, generate `interface.yaml` files for every node, then
+   iteratively refine them. Unlike runtime monitoring (component 5),
+   discovery is a one-time migration tool, not a continuous process.
+
+4. **Node-level Unit Testing** — verify a single node conforms to its
+   declared spec in isolation.
+
+5. **Graph Analysis & Comparison** — integration-level verification.
+   Static analysis checks the full graph for type mismatches, QoS
+   incompatibilities, and missing connections before launch. Runtime
+   monitoring continuously diffs the declared graph against the live
+   system, flagging drift (crashed nodes, unexpected topics, QoS
+   changes) as it happens.
+
+6. **Documentation Generation** — produce API documentation directly
+   from NoDL specs.
+
+> **Open question:** implementation language for the generator tooling.
+
+### Key Insights
 
 Three key insights drive the design:
 
@@ -87,11 +85,11 @@ Three key insights drive the design:
    operate on a graph model, not on ASTs. Source code parsing is a
    loader that feeds the model, not the analysis target.
 
-2. **Goals 3–4 are schema conformance problems** ("does reality match
-   the spec?"), not traditional program analysis. Once you have a
-   machine-readable spec (`interface.yaml`), verification falls out
-   naturally — the same pattern as `buf lint`, Pact contract tests,
-   and Kubernetes reconciliation.
+2. **Verification and analysis are schema conformance problems**
+   ("does reality match the spec?"), not traditional program analysis.
+   Once you have a machine-readable spec (`interface.yaml`),
+   verification falls out naturally — the same pattern as `buf lint`,
+   Pact contract tests, and Kubernetes reconciliation.
 
 3. **A declaration without code generation is a non-starter.** NoDL
    proved this. The schema must generate code, documentation, and
@@ -100,7 +98,7 @@ Three key insights drive the design:
    static analysis, the contract for runtime verification, and the
    reference for documentation.
 
-### Quick Start (What It Looks Like)
+### Example
 
 A minimal `interface.yaml`:
 
@@ -124,1592 +122,11 @@ parameters:
       bounds<>: [0.1, 100.0]
 ```
 
-What you get:
-
-```bash
-rosgraph generate .   # → C++ header, Python module, parameter validation
-rosgraph lint .       # → "no issues" or "TOP001: type mismatch on /cmd_vel"
-rosgraph monitor      # → live diff: declared graph vs. running system
-```
-
-The generated code gives you a typed context struct with publishers,
-subscribers, and validated parameters — no boilerplate. You write
-business logic; rosgraph generates the wiring.
+From this single file, the tooling can:
+- **Generate** a typed C++/Python node context with publishers and validated parameters — no boilerplate
+- **Lint** the full workspace graph for type mismatches and QoS incompatibilities before launch
+- **Monitor** the running system and flag drift from the declared spec
+- **Discover** in the reverse direction, producing a draft of this file from a running system for brownfield adoption
+- **Document** the node's API automatically
 
 ---
-
-## 2. Design Principles
-
-### Core Philosophy
-
-1. **The graph is the program.** Analysis operates on the typed,
-   QoS-annotated computation graph — not source code ASTs. Source
-   parsing is a loader that feeds the model, not the analysis target.
-
-2. **Declare first, verify always.** `interface.yaml` is the single
-   source of truth. Code generation, static analysis, and runtime
-   monitoring all verify against the declaration.
-
-3. **One schema, many consumers.** The same `interface.yaml` drives
-   code generation, documentation, linting, monitoring, contract
-   testing, and security policy generation.
-
-4. **One tool, not ten.** `rosgraph` with subcommands replaces
-   fragmented single-purpose tools. One CLI, one config, one output
-   format.
-
-### Developer Experience
-
-5. **Zero-config value, progressive disclosure.** Given
-   `interface.yaml` files, the default rules catch real bugs (type
-   mismatches, QoS incompatibilities) with no additional configuration.
-   A minimal 10-line `interface.yaml` produces a working node;
-   lifecycle, mixins, and parameterized QoS are opt-in.
-
-6. **Brownfield first, gradual adoption.** `rosgraph discover`
-   generates specs from running nodes. `--add-noqa` suppresses existing
-   issues. Packages without `interface.yaml` are skipped, not errored.
-
-7. **Speed is a feature.** An architectural property, not an
-   afterthought. Target: lint a 100-package workspace in under 5
-   seconds.
-
-8. **Backward compatibility is non-negotiable.** Existing
-   `generate_parameter_library` YAML works as-is inside `parameters:`.
-   Existing `.msg`/`.srv`/`.action` files are referenced, not replaced.
-
-### Verification & CI
-
-9. **CI-first.** SARIF output, GitHub annotations, exit codes, and
-   differential analysis are primary design targets.
-
-10. **Validation at every stage.** Author time: JSON Schema. Build
-    time: structural + semantic. Launch time: declared vs. configured.
-    Runtime: declared vs. observed.
-
-11. **Correctness rules are errors; style rules are warnings.** Type
-    mismatches and QoS incompatibilities fail CI. Naming conventions
-    warn.
-
-### Scope
-
-12. **Declared interfaces are the primary target.** The schema
-    describes the *intended* interface — the same boundary drawn by
-    Protobuf, AsyncAPI, Smithy, and OpenAPI. For partially dynamic
-    nodes (e.g., nav2 plugin hosts), worst-case bounds can be declared
-    with `optional: true`; `rosgraph monitor` validates these at
-    runtime and flags truly undeclared interfaces as `UnexpectedTopic`.
-
-13. **Structural first, behavioural later.** Phase 1–2: type matches,
-    QoS compatibility, graph connectivity — the foundation that
-    safety-critical systems (ISO 26262, IEC 61508) require as evidence.
-    Behavioural properties (temporal/causal, e.g. "/e_stop causes
-    /motor_disable within 100ms") are Phase 3+, drawing on prior art
-    from HAROS HPL and runtime verification tools like STL/MTL
-    monitors. The structural graph model is designed to extend to
-    behavioural annotations without schema redesign.
-
----
-
-## 3. Architecture
-
-One tool. One graph model. Four capabilities.
-
-```
-                         ┌──────────────────────┐
-                         │    Graph Model        │
-                         │  (shared library)     │
-                         │                       │
-                         │  Nodes, Topics,       │
-                         │  Services, Actions,   │
-                         │  Parameters, QoS,     │
-                         │  Connections           │
-                         └───────┬──────┬────────┘
-                                 │      │
-                    ┌────────────┘      └───────────────┐
-                    │                                   │
-          ┌─────────▼──────────┐            ┌──────────▼───────────┐
-          │  Build-time tools  │            │  Runtime tools        │
-          │                    │            │                       │
-          │  rosgraph generate │            │  rosgraph monitor     │
-          │  rosgraph lint     │            │  rosgraph test        │
-          │  rosgraph docs     │            │  rosgraph discover    │
-          │  rosgraph breaking │            │                       │
-          └────────────────────┘            └───────────────────────┘
-```
-
-### 3.1 The Graph Model
-
-A language-agnostic representation of the ROS computation graph. Every
-loader produces it; every analyzer consumes it.
-
-```
-ComputationGraph
-├── nodes: [NodeInterface]
-│   ├── name, namespace, package, executable
-│   ├── publishers:     [{topic, msg_type, qos}]
-│   ├── subscribers:    [{topic, msg_type, qos}]
-│   ├── services:       [{name, srv_type}]
-│   ├── clients:        [{name, srv_type}]
-│   ├── action_servers: [{name, action_type}]
-│   ├── action_clients: [{name, action_type}]
-│   ├── parameters:     [{name, type, default, validators}]
-│   └── lifecycle_state: str | None
-├── topics: [TopicInfo]
-│   ├── name, msg_type
-│   ├── publishers:  [NodeRef]
-│   ├── subscribers: [NodeRef]
-│   └── qos_profiles: [QoSProfile]
-├── services: [ServiceInfo]
-├── actions: [ActionInfo]
-└── connections: [Connection]
-    ├── source: NodeRef
-    ├── target: NodeRef
-    ├── channel: TopicRef | ServiceRef | ActionRef
-    └── qos_compatible: bool
-```
-
-### 3.2 Schema Layers
-
-Three schema levels, each building on the previous:
-
-**Layer 1 — Node Interface Schema** (per-node declaration)
-
-```yaml
-# interface.yaml
-schema_version: "1.0"
-
-node:
-  name: lidar_processor
-  package: perception_pkg
-  lifecycle: managed              # managed | unmanaged (default)
-
-parameters:
-  # Exact generate_parameter_library format (backward-compatible)
-  voxel_size:
-    type: double
-    default_value: 0.05
-    description: "Voxel grid filter leaf size (meters)"
-    validation:
-      bounds<>: [0.01, 1.0]
-    read_only: false
-  robot_frame:
-    type: string
-    default_value: "base_link"
-    read_only: true
-
-publishers:
-  - topic: ~/filtered_points
-    type: sensor_msgs/msg/PointCloud2
-    qos:
-      history: 5
-      reliability: RELIABLE
-      durability: TRANSIENT_LOCAL
-    description: "Filtered and downsampled point cloud"
-
-subscribers:
-  - topic: ~/raw_points
-    type: sensor_msgs/msg/PointCloud2
-    qos:
-      history: 1
-      reliability: BEST_EFFORT
-    description: "Raw point cloud from lidar driver"
-
-services:
-  - name: ~/set_filter_params
-    type: perception_msgs/srv/SetFilterParams
-
-actions:
-  - name: ~/process_scan
-    type: perception_msgs/action/ProcessScan
-
-timers:
-  - name: process_timer
-    period_ms: 100
-    description: "Main processing loop"
-```
-
-**Layer 2 — Composed System Schema** (launch-level declaration)
-
-```yaml
-# system.yaml
-schema_version: "1.0"
-name: perception_pipeline
-
-nodes:
-  - ref: perception_pkg/lidar_processor
-    namespace: /robot1
-    parameters:
-      voxel_size: 0.1
-    remappings:
-      ~/raw_points: /lidar/points
-
-  - ref: perception_pkg/object_detector
-    namespace: /robot1
-
-connections:                    # Explicit wiring (optional, for validation)
-  - from: lidar_processor/~/filtered_points
-    to: object_detector/~/input_cloud
-```
-
-**Toward a single source of truth.** `system.yaml` overlaps
-significantly with YAML launch files and parameter config files — all
-three describe which nodes run, with what parameters, and with what
-remappings. The long-term direction is convergence: `system.yaml`
-becomes the graph spec, the parameter config, *and* the launch
-description. `rosgraph generate` (or a thin `rosgraph launch` shim)
-emits a runnable launch file from the same `system.yaml` that
-`rosgraph lint` validates. One file, no drift between what you analyze
-and what you run. For projects with multiple deployment configurations
-(sim, real, test), each gets its own `system.yaml` — replacing both
-the per-config launch file and the per-config parameter YAML.
-
-**Mixins — Composable Interface Fragments** (G15, Phase 2)
-
-Plugins that inject interfaces into a host node (e.g., nav2 controller
-plugins adding parameters and topics via the node handle) are declared
-via `mixins:`. Each mixin is itself an `interface.yaml` fragment
-declaring the topics, parameters, and services it adds. The host node's
-effective interface is the merge of its own declaration plus all mixins.
-
-```yaml
-# nodes/follow_path/interface.yaml
-node:
-  name: follow_path
-  package: nav2_controller
-
-parameters:
-  controller_plugin:
-    type: string
-    default_value: "dwb_core::DWBLocalPlanner"
-
-mixins:
-  - ref: dwb_core/dwb_local_planner   # brings in max_vel_x, min_vel_y, etc.
-  - ref: nav2_costmap_2d/costmap       # brings in costmap params + topics
-```
-
-This pattern (borrowed from Smithy's mixin concept) gives `rosgraph
-lint` and `rosgraph monitor` the complete interface picture without
-requiring the host node to redeclare everything its plugins add.
-Requires the `$ref` / fragment system (G15) as a prerequisite.
-
-**Layer 3 — Observation Schema** (runtime-observed state)
-
-```yaml
-# observed.yaml (auto-generated from running system)
-node:
-  name: lidar_processor
-  package: perception_pkg
-  pid: 12345
-  state: active                 # lifecycle state if managed
-
-publishers:
-  - topic: /robot1/lidar_processor/filtered_points
-    type: sensor_msgs/msg/PointCloud2
-    qos:
-      reliability: RELIABLE
-      durability: TRANSIENT_LOCAL
-      depth: 5
-    stats:
-      message_count: 14523
-      frequency_hz: 9.98
-      subscribers_matched: 2
-
-# ... subscribers, services, actions, parameters with actual values
-```
-
-### 3.3 The InterfaceDescriptor (IR)
-
-The parsed, validated, fully-resolved representation of a node's
-interface. Serializable as JSON for plugin communication:
-
-```json
-{
-  "schema_version": "1.0",
-  "node": {
-    "name": "lidar_processor",
-    "package": "perception_pkg",
-    "lifecycle": "managed"
-  },
-  "parameters": [
-    {
-      "name": "voxel_size",
-      "type": "double",
-      "default_value": 0.05,
-      "description": "Voxel grid filter leaf size (meters)",
-      "validation": { "bounds": [0.01, 1.0] },
-      "read_only": false
-    }
-  ],
-  "publishers": [
-    {
-      "topic": "~/filtered_points",
-      "resolved_topic": "/robot1/lidar_processor/filtered_points",
-      "message_type": "sensor_msgs/msg/PointCloud2",
-      "qos": { "history": 5, "reliability": "RELIABLE", "durability": "TRANSIENT_LOCAL" },
-      "description": "Filtered and downsampled point cloud"
-    }
-  ]
-}
-```
-
-Plugins receive this IR via stdin (or file path) and produce generated
-files.
-
-### 3.4 `rosgraph generate` — Code Generation
-
-Translates `interface.yaml` into working node implementations.
-
-```
-┌─────────────────────────────────────────┐
-│         interface.yaml (per node)        │
-└────────────────┬────────────────────────┘
-                 │
-┌────────────────▼────────────────────────┐
-│          Parser / Validator              │
-│  1. YAML parse                          │
-│  2. JSON Schema validation (structural) │
-│  3. Semantic validation (type refs, QoS)│
-│  4. Produce InterfaceDescriptor (IR)    │
-└────────────────┬────────────────────────┘
-                 │
-      ┌──────────┼──────────────────┐
-      │          │                  │
-┌─────▼─────┐ ┌─▼──────────┐ ┌────▼──────┐
-│ C++ Plugin│ │Python Plugin│ │Docs Plugin│
-│           │ │             │ │           │
-│ - header  │ │ - module    │ │ - API ref │
-│ - reg.cpp │ │ - params    │ │ - graph   │
-│ - params  │ │ - __init__  │ │   fragment│
-└───────────┘ └─────────────┘ └───────────┘
-```
-
-**Build integration:**
-
-```cmake
-cmake_minimum_required(VERSION 3.22)
-project(perception_pkg)
-find_package(rosgraph REQUIRED)
-rosgraph_auto_package()
-```
-
-Under the hood, `rosgraph_auto_package()`:
-1. Scans `nodes/` for subdirectories with `interface.yaml`
-2. Validates each `interface.yaml` (structural + semantic)
-3. Invokes C++ plugin → header, registration, params YAML
-4. Delegates to `generate_parameter_library()` for parameters
-5. Compiles and links
-6. Installs interface YAMLs to `share/<package>/interfaces/`
-
-**Design decisions:**
-- **Composition over inheritance.** Generated code holds a
-  `rclcpp::Node` (has-a), not inherits from it. Context struct is a
-  flat aggregation of generated components plus user state.
-- **`generate_parameter_library` as backend.** Uses the existing,
-  widely-adopted parameter library rather than reimplementing.
-- **Convention-over-configuration.** Directory layout (`nodes/`,
-  `interfaces/`, `launch/`, `config/`) determines behavior.
-
-### 3.5 `rosgraph lint` — Static Analysis
-
-Pre-launch verification of the ROS graph.
-
-```
-┌────────────────────────────────────┐
-│           Loaders                  │
-│  ┌───────────┐ ┌───────────────┐  │
-│  │interface.  │ │ launch files  │  │
-│  │yaml parser │ │ (clingwrap/   │  │
-│  │            │ │  native)      │  │
-│  └─────┬─────┘ └──────┬────────┘  │
-│        └───────┬───────┘           │
-│                ▼                   │
-│  ┌──────────────────────────┐     │
-│  │     Graph Model          │     │
-│  └────────────┬─────────────┘     │
-│               ▼                   │
-│  ┌──────────────────────────┐     │
-│  │   Analyzer DAG           │     │
-│  │   (parallel execution)   │     │
-│  │                          │     │
-│  │  [topic_resolver]        │     │
-│  │       ↓                  │     │
-│  │  [type_mismatch_checker] │     │
-│  │  [qos_compat_checker]    │     │
-│  │  [naming_convention]     │     │
-│  │  [disconnected_subgraph] │     │
-│  │  [unused_node]           │     │
-│  │  [launch_linter]         │     │
-│  │  ...                     │     │
-│  └────────────┬─────────────┘     │
-│               ▼                   │
-│  ┌──────────────────────────┐     │
-│  │  Post-Processing         │     │
-│  │  - suppression filter    │     │
-│  │  - severity assignment   │     │
-│  │  - deduplication         │     │
-│  │  - differential (new     │     │
-│  │    issues only for CI)   │     │
-│  └────────────┬─────────────┘     │
-│               ▼                   │
-│  ┌──────────────────────────┐     │
-│  │  Output Formatters       │     │
-│  │  text, JSON, SARIF,      │     │
-│  │  GitHub, JUnit           │     │
-│  └──────────────────────────┘     │
-└────────────────────────────────────┘
-```
-
-**Analyzer definition pattern (from Go analysis framework):**
-
-```python
-# Each analyzer is a value, not a subclass
-topic_resolver = GraphAnalyzer(
-    name="topic_resolver",
-    doc="Resolves topic names to their message types across the graph",
-    requires=[],
-    result_type=TopicTypeMap,
-    run=resolve_topics,
-)
-
-type_mismatch = GraphAnalyzer(
-    name="type_mismatch",
-    doc="Checks that all pub/sub on a topic agree on message type",
-    requires=[topic_resolver],
-    result_type=None,
-    run=check_type_mismatches,
-)
-```
-
-See [§7 Lint Rule Codes](#7-lint-rule-codes) for the full rule system.
-
-**Launch file loading strategy:**
-
-Three loader paths, not mutually exclusive, phased by tractability:
-
-| Loader | Launch format | Extraction method | Phase | Limitations |
-|---|---|---|---|---|
-| YAML launch | YAML launch files | Direct parse | 1 | Limited expressiveness |
-| `system.yaml` | Layer 2 schema | Direct parse | 1 | Requires manual authoring |
-| Python launch AST | Standard `launch_ros` | AST pattern matching | 2 | Cannot handle dynamic logic (conditionals, loops) |
-
-- **YAML launch files** are statically parseable — `rosgraph lint` can
-  extract node declarations, remappings, and parameter overrides
-  directly.
-- **Python launch files** are imperative and Turing-complete, but most
-  are declarative-in-spirit. AST-level pattern matching for common
-  patterns (`Node()`, `LaunchConfiguration()`,
-  `DeclareLaunchArgument()`) captures ~80% of real launch files without
-  execution.
-- **Layer 2 `system.yaml`** (§3.2) sidesteps the problem entirely —
-  a static YAML file declaring the intended system composition. Launch
-  files still run the system, but `system.yaml` is the lint/monitor
-  source of truth for graph analysis.
-
-The lint diagram's "launch files" loader encompasses all three paths.
-
-### 3.6 `rosgraph monitor` — Runtime Reconciliation
-
-Kubernetes-style reconciliation loop comparing declared vs. observed
-graph state.
-
-```
-┌─────────────────────────────────────────────────┐
-│              rosgraph monitor                    │
-│                                                  │
-│  ┌───────────────┐     ┌──────────────────────┐ │
-│  │ Declared State │     │   Observed State     │ │
-│  │ (from YAML /  │     │   (from DDS          │ │
-│  │  interface    │     │    discovery)         │ │
-│  │  files)       │     │                      │ │
-│  └───────┬───────┘     └──────────┬───────────┘ │
-│          │                        │              │
-│          └──────────┬─────────────┘              │
-│                     ▼                            │
-│  ┌──────────────────────────────────┐            │
-│  │     Reconciliation Engine        │            │
-│  │                                  │            │
-│  │  Level-triggered (not edge)      │            │
-│  │  Idempotent                      │            │
-│  │  Requeue with backoff            │            │
-│  └──────────────┬───────────────────┘            │
-│                 ▼                                │
-│  ┌──────────────────────────────────┐            │
-│  │     Diff Computation             │            │
-│  │                                  │            │
-│  │  - Missing/extra nodes           │            │
-│  │  - Missing/extra topics          │            │
-│  │  - QoS mismatches                │            │
-│  │  - Type mismatches               │            │
-│  │  - Parameter drift               │            │
-│  └──────────────┬───────────────────┘            │
-│                 ▼                                │
-│  ┌──────────────────────────────────┐            │
-│  │     Exporters                    │            │
-│  │                                  │            │
-│  │  - ROS topics (graph_diff msg)   │            │
-│  │  - Prometheus /metrics endpoint  │            │
-│  │  - Structured log output         │            │
-│  │  - Alerting (via diagnostics)    │            │
-│  └──────────────────────────────────┘            │
-└─────────────────────────────────────────────────┘
-```
-
-**Reconciliation loop:**
-
-```python
-while running:
-    declared = load_declared_graph(interface_files, launch_files)
-    observed = scrape_live_graph(dds_discovery)
-
-    diff = compute_graph_diff(declared, observed)
-
-    if diff.has_issues():
-        publish_diff(diff)           # ROS topic: /rosgraph/diff
-        update_metrics(diff)         # Prometheus: rosgraph_missing_nodes, etc.
-        emit_diagnostics(diff)       # /diagnostics for standard tooling
-
-    publish_status(observed)         # ROS topic: /rosgraph/status
-
-    # Adaptive interval: faster when drifting, slower when stable
-    if diff.has_critical():
-        sleep(1s)
-    else:
-        sleep(5s)
-```
-
-See [§8 Monitor Alert Rules](#8-monitor-alert-rules) for the alert
-system.
-
-**Relationship to graph-monitor:** `rosgraph monitor` is a new
-implementation, not an extension of the existing graph-monitor package.
-graph-monitor's value is its `rmw_stats_shim` and
-`rosgraph_monitor_msgs` message definitions — these are reusable
-regardless of architecture. However, graph-monitor lacks the
-reconciliation engine (declared vs. observed diff) that is the core of
-`rosgraph monitor`, and retrofitting it would constrain the design.
-
-The integration path: adopt or align with graph-monitor's message
-definitions (`rosgraph_monitor_msgs`), reimplement the graph scraping
-and reconciliation, and offer to upstream the reconciliation capability
-back to graph-monitor if its maintainers are interested.
-
-### 3.7 `rosgraph test` — Contract Testing
-
-Schema-driven verification of running nodes against their declarations.
-
-Three testing modes (modelled on Schemathesis, Dredd, and Pact):
-
-**Interface conformance** (Dredd model): Run a node, then
-systematically verify its actual interface matches its
-`interface.yaml`. Check every declared publisher is active, call every
-declared service, verify every parameter exists with the declared type
-and default.
-
-**Fuzz testing** (Schemathesis model): Auto-generate messages matching
-declared subscriber types, publish them, verify the node produces
-outputs on declared publisher topics with correct types.
-
-**Cross-node contract testing** (Pact model): Node A's
-`interface.yaml` declares it subscribes to `/cmd_vel` (Twist). Node
-B's `interface.yaml` declares it publishes `/cmd_vel` (Twist). The
-contract test verifies they agree on type and QoS compatibility.
-
-### 3.8 `rosgraph docs` — Documentation Generation
-
-Auto-generated "Swagger UI for ROS nodes" — browsable API reference
-docs from `interface.yaml`. Covers topics, services, actions,
-parameters, QoS settings, and message type definitions.
-
-Output formats: Markdown (for GitHub Pages / docs.ros.org), HTML
-(standalone), JSON (for embedding in other tools).
-
-### 3.9 `rosgraph breaking` — Breaking Change Detection
-
-Compares two versions of `interface.yaml` and classifies changes:
-
-| Classification | Examples |
-|---|---|
-| **Breaking** | Removed topic, changed message type, removed parameter, incompatible QoS change |
-| **Dangerous** | Changed QoS (may affect connectivity), narrowed parameter range |
-| **Safe** | Added optional parameter, added new publisher, widened parameter range |
-
-Modelled on `buf breaking` and `graphql-inspector`.
-
-### 3.10 `rosgraph discover` — Runtime-to-Spec Generation
-
-Introspects a running node via DDS discovery and generates an
-`interface.yaml` from the observed interface. The "slice of cake"
-brownfield adoption path.
-
-```bash
-# Generate interface.yaml from a running node
-rosgraph discover /lidar_processor -o nodes/lidar_processor/interface.yaml
-```
-
-Modelled on Terraform's `import` command.
-
-### 3.11 Configuration
-
-**`rosgraph.toml`** — single configuration file for all subcommands:
-
-```toml
-[lint]
-select = ["TOP", "SRV", "QOS", "GRF"]  # enable these rule families
-ignore = ["NME001"]                      # except this specific rule
-
-[lint.per-package-ignores]
-"generated_*" = ["ALL"]                  # skip generated packages
-"*_test" = ["GRF002"]                    # allow unused nodes in tests
-
-[generate]
-plugins = ["cpp", "python"]
-out_dir = "generated"
-
-[output]
-format = "text"                          # text | json | sarif | github
-
-[ci]
-new-only = true                          # only new issues (differential)
-base-branch = "main"
-```
-
-### 3.12 Multi-Workspace Analysis
-
-ROS 2 workspaces overlay each other (e.g., `ros_base` underlay + your
-packages + a vendor overlay). When `rosgraph lint` analyzes your
-workspace, it needs interface information from packages in the underlay.
-
-The solution follows the Go analysis framework's per-package fact
-caching pattern: installed `interface.yaml` files in
-`share/<package>/interfaces/` (placed there by `rosgraph_auto_package()`
-at install time) serve as cached analysis artifacts. `rosgraph lint`
-reads these from the underlay without re-analyzing underlay packages,
-only analyzing packages in the current workspace.
-
-This is a Phase 2 concern. Phase 1 assumes a single workspace.
-
-### 3.13 AI & Tooling Integration
-
-`interface.yaml` and the `InterfaceDescriptor` IR (§3.3) are
-machine-readable contracts describing a node's complete API. This
-makes them natural integration points for AI-assisted development
-tools and IDE infrastructure.
-
-**AI as IR consumer.** The JSON-serialized `InterfaceDescriptor`
-contains everything an LLM needs to understand a node's interface:
-topics, types, QoS, parameters, lifecycle state. An AI agent can read
-this to generate implementation code, write tests, suggest fixes, or
-answer questions about the system — without parsing source code.
-
-**MCP server.** A Model Context Protocol server exposing graph state,
-lint results, and interface schemas enables AI coding tools (Claude
-Code, Cursor, Copilot) to query the ROS graph as structured context.
-"What topics does the perception pipeline publish?" answered from
-the graph model, not from grep.
-
-**AI-assisted discovery.** `rosgraph discover` (§3.10) generates
-`interface.yaml` from a running system. The raw output from DDS
-discovery is complete but lacks descriptions, rationale, and grouping.
-An LLM can refine the generated spec — inferring descriptions from
-topic names and message types, suggesting QoS profiles based on
-message patterns, and grouping related interfaces.
-
-**Language Server Protocol (LSP).** An LSP server for `interface.yaml`
-enables IDE features beyond JSON Schema validation: hover for message
-type definitions, go-to-definition for `$ref` targets, inline
-diagnostics from `rosgraph lint`, and cross-file rename support. This
-benefits both human developers and AI agents operating within IDE
-contexts.
-
-**Natural language to spec.** The constrained schema makes
-`interface.yaml` a tractable generation target for LLMs. "I need a
-node that subscribes to a lidar point cloud, filters it, and publishes
-the result" produces a valid `interface.yaml` that `rosgraph generate`
-can immediately scaffold into working code.
-
-These are not Phase 1 deliverables, but the architecture should not
-preclude them. The IR-based plugin protocol (§3.3) and structured
-output formats (JSON, SARIF) are the key enablers — they exist for
-code generation and CI, but AI consumers are a natural extension.
-
-### 3.14 Scale & Fleet Considerations
-
-§3.12 covers multi-workspace analysis. This section addresses
-concerns beyond a single developer's workstation.
-
-**Interface ownership.** In multi-team organizations, `interface.yaml`
-files are shared contracts. The owner is typically the node author
-(they define the interface), but downstream consumers depend on it.
-Changes require coordination. rosgraph supports this via:
-- `rosgraph breaking` (§3.9) — automated detection of breaking
-  changes in CI, blocking merges that break downstream consumers.
-- Installed interfaces in `share/<package>/interfaces/` — downstream
-  teams depend on published interfaces without pulling source code.
-- Semantic versioning alignment — the breaking/dangerous/safe
-  classification maps to semver: breaking = major, dangerous = minor
-  (review required), safe = patch.
-
-**Multi-robot systems.** The `system.yaml` (Layer 2, §3.2) supports
-namespaced node instances (`namespace: /robot1`). For multi-robot
-systems, each robot's graph is a namespaced instance of the same
-`system.yaml`. Fleet-level analysis — "which robots are running
-interface version X?" — is out of scope for Phase 1–2 but the
-architecture supports it: `rosgraph monitor` on each robot publishes
-graph snapshots that a fleet-level aggregator can collect.
-
-**Fleet monitoring.** `rosgraph monitor` (§3.6) runs per-robot. For
-fleet-scale observability, the monitor's Prometheus exporter (M7)
-enables standard fleet dashboards via Grafana. The `/rosgraph/diff`
-topic on each robot can be bridged to a central system for aggregated
-drift analysis. The architecture deliberately uses standard
-observability patterns (Prometheus metrics, structured logs,
-diagnostics topics) rather than inventing fleet-specific
-infrastructure.
-
-**Performance targets.** Build-time targets are stated in DP7 (100
-packages in 5 seconds). Runtime targets for `rosgraph monitor`:
-- Reconciliation cycle: < 500 ms for a 200-node system
-- Memory overhead: < 50 MB resident for graph state
-- CPU: < 5% of one core at steady state (5 s scrape interval)
-
-These are design targets, not commitments — they guide architectural
-decisions (e.g., choosing Rust for the diff engine).
-
-### 3.15 colcon Integration
-
-`colcon` uses a `VerbExtensionPoint` plugin system — any Python package
-can register new verbs via `setup.cfg` entry points. Existing examples:
-`colcon-clean` adds `colcon clean`, `colcon-cache` adds `colcon cache`.
-
-The architecture is **`rosgraph` as standalone tool, `colcon-rosgraph`
-as thin workspace wrapper**:
-
-```
-colcon-rosgraph (Python, verb plugin)
-  └── delegates to → rosgraph (standalone binary)
-```
-
-This mirrors how `colcon-cmake` shells out to `cmake` — the colcon verb
-handles workspace iteration, package ordering, and parallel execution;
-the core tool handles single-package analysis.
-
-**What maps naturally to colcon verbs:**
-
-| Command | colcon verb | Notes |
-|---|---|---|
-| `rosgraph generate` | — | Already runs via `rosgraph_auto_package()` in `colcon build` |
-| `rosgraph test` | — | Already runs via CTest in `colcon test` |
-| `rosgraph lint` | `colcon lint` | Iterates packages in dependency order, parallel per-package lint |
-| `rosgraph docs` | `colcon docs` | Generates docs per package, aggregates into workspace docs |
-| `rosgraph discover` | `colcon discover` | Generates `interface.yaml` for all running nodes |
-| `rosgraph breaking` | `colcon breaking` | Checks all packages against their previous interface versions |
-
-**What doesn't fit:**
-
-`rosgraph monitor` is a long-running daemon, not a build-and-exit verb.
-It stays as a standalone command (or a `ros2 launch` node).
-
-**Why both CLIs:**
-
-- `colcon lint` for the workspace workflow — lint all packages, respect
-  dependency order, parallel execution, workspace-level reporting.
-- `rosgraph lint path/to/interface.yaml` for single-file use, CI
-  pipelines, and environments without colcon.
-
-**Language independence.** The colcon plugin is always Python (colcon
-requires it), but it delegates to `rosgraph` via subprocess — so the
-core tool's language is unconstrained. Rust, Python, or hybrid all work
-identically. The colcon integration does not factor into the language
-choice (§5).
-
-The colcon plugin is a Phase 2 deliverable — Phase 1 focuses on the
-standalone `rosgraph` tool. The plugin is trivial once the core tool
-exists.
-
----
-
-## 4. Phasing
-
-### Phase 1 — Foundation
-
-Deliver the core schema, basic code generation, and highest-value
-static + runtime checks.
-
-**Schema & generate:**
-- G1-G10 (existing cake features — stabilize and adopt)
-- G11 (lifecycle nodes — blocks nav2/ros2_control adoption)
-- G14 (schema versioning — needed before v1.0)
-
-**Lint (P0 rules):**
-- L1 (topic type mismatch), L2 (QoS compatibility), L3 (disconnected
-  subgraph)
-- L5 (SARIF output), L6 (differential analysis)
-
-**Monitor (P0 features):**
-- M1 (declared-vs-observed diff), M2 (missing node alerting),
-  M5 (graph snapshots)
-
-### Phase 2 — Adoption Enablers
-
-Lower barriers for existing codebases. Fill out the rule set.
-
-**Schema & generate:**
-- G12 (timers), G13 (nested parameters), G15 (mixins)
-- O1 (`rosgraph docs`), O2 (`rosgraph discover`)
-
-**Lint (P1 rules + infrastructure):**
-- L4 (launch validation), L7 (naming), L8 (unused node),
-  L9 (parameter validation), L10 (circular deps)
-- L11 (inline suppression), L12 (per-package config),
-  L13 (`--add-noqa`), L14 (semantic validation)
-
-**Monitor (P1 features):**
-- M3 (QoS drift), M4 (runtime type mismatch), M6 (topic stats),
-  M8 (unexpected node), M9 (health diagnostics)
-
-### Phase 3 — Scale the Toolchain
-
-Enable community extension and advanced analysis.
-
-**Schema & generate:**
-- G16 (plugin architecture), G17 (callback groups),
-  G19 (system composition schema)
-- O3 (`rosgraph breaking`), O4 (`rosgraph test`)
-
-**Lint:**
-- L15 (interface coverage)
-
-**Monitor:**
-- M7 (Prometheus endpoint), M10 (adaptive scrape),
-  M11 (lifecycle state)
-
-### Phase 4 — Ecosystem Integration
-
-Future-proofing and niche use cases.
-
-- G18 (middleware bindings)
-- O5 (`rosgraph policy` — SROS 2 security policies)
-- M12 (runtime interface coverage)
-
-### Adoption Path
-
-rosgraph is unlikely to reach `ros_core` initially — that requires
-broad consensus and a high stability bar. A more realistic progression:
-
-1. **`ros-tooling` organization** (where graph-monitor already lives) —
-   institutional backing, CI infrastructure, release process.
-2. **REP (ROS Enhancement Proposal)** for the `interface.yaml` schema —
-   formalizes the declaration format as a community standard.
-3. **docs.ros.org tutorial integration** — if the "write your first
-   node" tutorial uses `interface.yaml`, every new ROS developer learns
-   it from day one. This is the highest-leverage adoption path.
-4. **`ros_core` proposal** — after demonstrated adoption across multiple
-   distros, propose for inclusion in a future distribution.
-
----
-
-## 5. Language Choice
-
-The implementation language is an open decision for the WG. The
-trade-offs are structural, not preferential.
-
-### Option A: Rust
-
-Follows Ruff's model. Speed as an architectural property.
-
-| Axis | Assessment |
-|---|---|
-| Performance | Best. Single-pass analysis, zero-cost abstractions, no GC pauses. Best placed to meet the "100 packages in 5s" target. |
-| Contribution barrier | Highest. Most ROS contributors know C++/Python, not Rust. |
-| Ecosystem fit | Moderate. `rclrs` exists but is not tier-1. CLI tools don't need ROS client library integration. |
-| Deployment | Single static binary. No runtime dependencies. |
-| Plugin story | WASM plugins (Extism) or process-based (protoc model). |
-
-### Option B: Python
-
-Follows the ROS 2 ecosystem convention.
-
-| Axis | Assessment |
-|---|---|
-| Performance | Weakest. 10-100x slower than Rust for analysis workloads. May not meet performance targets. |
-| Contribution barrier | Lowest. Every ROS developer knows Python. |
-| Ecosystem fit | Best. cake is Python. `launch_ros` is Python. Direct reuse of existing parsing libraries. |
-| Deployment | Requires Python runtime. `pip install` or ROS package. |
-| Plugin story | Native Python plugins. Trivial to write. |
-
-### Option C: Rust core + Python bindings
-
-Hybrid via PyO3. Performance-critical core (parsing, graph model, diff
-engine, lint rules) in Rust; Python CLI and plugin layer on top.
-
-| Axis | Assessment |
-|---|---|
-| Performance | Near-Rust for analysis; Python overhead for CLI/plugin dispatch only. |
-| Contribution barrier | Moderate. Core contributors need Rust; plugin authors use Python. |
-| Ecosystem fit | Good. Python-facing API integrates with ROS ecosystem. |
-| Deployment | Python package with native extension. Requires build toolchain for distribution. |
-| Plugin story | Python plugins (native) + WASM plugins (for sandboxing). |
-
-### Decision factors
-
-The choice depends on which constraint the WG prioritizes:
-- If **speed** is the binding constraint → Rust or hybrid
-- If **community contribution** is the binding constraint → Python
-- If **both matter** → hybrid, accepting the build complexity
-
-Note: the colcon integration (§3.15) does not constrain this choice.
-The `colcon-rosgraph` plugin is always Python but delegates to the
-`rosgraph` binary via subprocess, so the core tool can be any language.
-
----
-
-## 6. Feature List
-
-### Schema & Code Generation (`rosgraph generate`)
-
-| # | Feature | Priority | Description |
-|---|---------|----------|-------------|
-| G1 | YAML interface declaration | P0 | Single `interface.yaml` per node declaring all ROS 2 entities |
-| G2 | JSON Schema validation | P0 | Structural validation with IDE autocompletion via YAML Language Server |
-| G3 | C++ code generation | P0 | Typed context, pub/sub/srv/action wrappers, component registration |
-| G4 | Python code generation | P0 | Dataclass context, pub/sub/srv/action wrappers |
-| G5 | Parameter generation | P0 | Delegates to `generate_parameter_library` (backward-compatible) |
-| G6 | QoS declaration | P0 | Required for pub/sub, supports all DDS QoS policies |
-| G7 | Parameterized QoS | P0 | `${param:name}` references in QoS fields |
-| G8 | Dynamic topic names | P0 | `${param:name}` and `${for_each_param:name}` |
-| G9 | Composition pattern | P0 | Has-a `Node`, not is-a `Node` |
-| G10 | Zero-boilerplate build | P0 | `rosgraph_auto_package()` CMake macro |
-| G11 | Lifecycle node support | P0 | `lifecycle: managed` in node spec |
-| G12 | Timer declarations | P1 | `timers:` section with period, callback name |
-| G13 | Nested parameters | P1 | Hierarchical parameter structures (parity with gen_param_lib) |
-| G14 | Schema versioning | P1 | `schema_version` field with migration tooling |
-| G15 | Mixins / shared fragments | P1 | `$ref` to common interface fragments |
-| G16 | Plugin architecture | P2 | IR-based pipeline, standalone plugins per language |
-| G17 | Callback group declarations | P2 | `callback_groups:` with entity assignment |
-| G18 | Middleware bindings | P3 | Protocol-specific config (DDS, Zenoh) |
-| G19 | System composition schema | P2 | Multi-node graph declaration (`system.yaml`, Layer 2) |
-
-### Static Analysis (`rosgraph lint`)
-
-| # | Feature | Priority | Description |
-|---|---------|----------|-------------|
-| L1 | Topic type mismatch detection | P0 | Flag when pub and sub on same topic disagree on message type |
-| L2 | QoS compatibility checking | P0 | Flag incompatible QoS profiles (reliability, durability, deadline) |
-| L3 | Disconnected subgraph detection | P0 | Flag nodes/topics with no connections |
-| L4 | Launch file validation | P0 | Detect undefined node refs, invalid remaps, unresolved substitutions |
-| L5 | SARIF / CI output | P0 | Structured output for GitHub Security tab, PR annotations |
-| L6 | Differential analysis | P0 | `--new-only` reports only issues introduced since base branch |
-| L7 | Naming convention enforcement | P1 | Check names against configurable patterns |
-| L8 | Unused node detection | P1 | Flag nodes declared but not in any launch config |
-| L9 | Parameter validation | P1 | Check values against declared types, ranges, validators |
-| L10 | Circular dependency detection | P1 | Flag service/action chains that could deadlock |
-| L11 | Inline suppression | P1 | `# rosgraph: noqa: TOP001` in launch/YAML files |
-| L12 | Per-package configuration | P1 | Override rules per package via `rosgraph.toml` |
-| L13 | `--add-noqa` for adoption | P1 | Generate suppression comments for all existing issues |
-| L14 | Semantic validation | P1 | Full type resolution, QoS compatibility checks |
-| L15 | Interface coverage reporting | P2 | Which declared topics/services are exercised in tests |
-
-### Runtime Monitoring (`rosgraph monitor`)
-
-| # | Feature | Priority | Description |
-|---|---------|----------|-------------|
-| M1 | Declared-vs-observed graph diff | P0 | Compare declared interfaces against live DDS discovery |
-| M2 | Missing node alerting | P0 | Alert when a declared node is not present |
-| M3 | QoS drift detection | P0 | Alert when observed QoS differs from declared |
-| M4 | Type mismatch detection (runtime) | P0 | Alert when observed types differ from declaration |
-| M5 | Graph snapshot publishing | P0 | Periodic `rosgraph_monitor_msgs/Graph` snapshots |
-| M6 | Topic statistics | P1 | Message rate, latency, queue depth per topic |
-| M7 | Prometheus /metrics endpoint | P1 | Export graph metrics for Grafana dashboards |
-| M8 | Unexpected node detection | P1 | Alert on nodes present but not declared |
-| M9 | Health diagnostics integration | P1 | Publish to `/diagnostics` for standard ROS tooling |
-| M10 | Adaptive scrape interval | P2 | Faster scraping when drift detected, slower when stable |
-| M11 | Lifecycle state monitoring | P2 | Track lifecycle transitions against expectations |
-| M12 | Interface coverage tracking | P2 | Which declared interfaces are exercised at runtime |
-
-### Other Subcommands
-
-| # | Feature | Subcommand | Priority | Description |
-|---|---------|------------|----------|-------------|
-| O1 | Documentation generation | `rosgraph docs` | P1 | Auto-generated API reference from schema |
-| O2 | Runtime-to-spec discovery | `rosgraph discover` | P1 | Introspect running nodes → `interface.yaml` |
-| O3 | Breaking change detection | `rosgraph breaking` | P2 | Detect breaking interface changes across releases |
-| O4 | Contract testing | `rosgraph test` | P2 | Schema-driven verification of running nodes |
-| O5 | Security policy generation | `rosgraph policy` | P3 | Auto-generate SROS 2 policies from schema |
-
----
-
-## 7. Lint Rule Codes
-
-Rule codes use a hierarchical prefix system (modelled on Ruff). Rules
-can be selected at any granularity: `TOP` (all topic rules),
-`TOP001` (specific rule).
-
-| Prefix | Category | Example rules |
-|--------|----------|---------------|
-| `TOP` | Topic rules | `TOP001` type mismatch, `TOP002` no subscribers, `TOP003` naming convention |
-| `SRV` | Service rules | `SRV001` unmatched client, `SRV002` type mismatch |
-| `ACT` | Action rules | `ACT001` unmatched client, `ACT002` type mismatch |
-| `PRM` | Parameter rules | `PRM001` missing default, `PRM002` type violation, `PRM003` undeclared |
-| `QOS` | QoS rules | `QOS001` reliability mismatch, `QOS002` durability incompatible, `QOS003` deadline violation |
-| `LCH` | Launch rules | `LCH001` undefined node ref, `LCH002` invalid remap, `LCH003` unresolved substitution |
-| `GRF` | Graph-level rules | `GRF001` disconnected subgraph, `GRF002` unused node, `GRF003` circular dependency |
-| `NME` | Naming rules | `NME001` topic naming convention, `NME002` node naming convention |
-| `SAF` | Safety rules | `SAF001` insufficient redundancy, `SAF002` single point of failure, `SAF003` unmanaged safety node |
-| `TF` | TF frame rules | `TF001` undeclared frame_id, `TF002` broken frame chain |
-
-**Rule lifecycle:** preview → stable → deprecated → removed. New rules
-always enter as preview.
-
-**Fix applicability:** Safe (preserves semantics), unsafe (may alter
-behaviour), display-only (suggestion). Per-rule override via config.
-
----
-
-## 8. Monitor Alert Rules
-
-| Alert | Condition | Grace period | Severity |
-|---|---|---|---|
-| `NodeMissing` | Declared node not observed | 10s | critical |
-| `UnexpectedNode` | Observed node not declared | 30s | warning |
-| `TopicMissing` | Declared topic not present | 5s | critical |
-| `QoSMismatch` | Declared QoS ≠ observed QoS | 0s | error |
-| `TypeMismatch` | Declared msg type ≠ observed | 0s | critical |
-| `ThroughputDrop` | Rate < expected minimum | 30s | warning |
-
-Grace periods prevent flapping during startup and transient states.
-All thresholds (grace period, severity) are configurable via
-`rosgraph.toml` — see §11.3 for safety-critical overrides.
-
----
-
-## 9. Existing ROS 2 Ecosystem
-
-### 9.1 Maturity Matrix
-
-| Tool | Stars | Contributors | Last active | Maturity | Bus factor |
-|---|---|---|---|---|---|
-| **generate_parameter_library** | 353 | 41 | 2026-02 | Production | Healthy |
-| **ros2_tracing** | 237 | 30 | 2026-02 | Production (QL1) | Healthy |
-| **topic_tools** | 126 | 25 | 2025-08 | Mature | Healthy |
-| **launch_ros** | 78 | 71 | 2026-02 | Core infrastructure | Healthy |
-| **cake** | 36 | 1 | 2026-02 | Early-stage | 1 (risk) |
-| **graph-monitor** | 31 | 3 | 2025-11 | Mid-stage | Low |
-| **nodl** | 10 | 7 | 2022-11 | Dormant | N/A |
-| **clingwrap** | 9 | 1 | 2026-02 | Early-stage | 1 (risk) |
-| **breadcrumb** | 6 | 1 | 2026-02 | Early-stage | 1 (risk) |
-| **HAROS** (ROS 1) | 197 | — | 2021-09 | Abandoned | N/A |
-| **CARET** | 97 | 18 | active | Mature (Tier IV) | Healthy |
-
-### 9.2 Tool Assessments
-
-**cake** — Declarative code generation. `interface.yaml` → C++ and
-Python node scaffolding. Functional pattern (has-a Node, not is-a
-Node). The fundamental bet is correct: making the interface declaration
-the source of truth for code generation is the only way to prevent
-schema-code drift. Core design decisions (YAML-driven,
-composition-based, schema-validated, codegen-first) are sound.
-cake's author is a WG member; rosgraph's Layer 1 schema builds
-directly on cake's format, and G1–G10 represent stabilizing cake's
-capabilities under the rosgraph umbrella — addressing the bus-factor
-risk while preserving the design. Gaps: no lifecycle support, no
-timers, no nested parameters, no formal IR, no plugin architecture,
-no runtime-to-spec generation.
-
-**generate_parameter_library** — The most mature tool in the space.
-Production-proven in MoveIt2 and ros2_control. Rich validation. The
-unification path: the `parameters:` section of `interface.yaml` IS the
-`generate_parameter_library` format (already demonstrated in cake).
-rosgraph delegates to `generate_parameter_library` at build time rather
-than reimplementing parameter generation. The key invariant: a
-standalone gen_param_lib YAML file works as-is when placed in the
-`parameters:` block of `interface.yaml`. Ownership transfer to
-`ros-tooling` would be ideal but is not required — schema compatibility
-is sufficient.
-
-**graph-monitor** — Official ROSGraph WG backing. Publishes structured
-graph messages. The `rmw_stats_shim` approach is architecturally sound.
-Gap: can report *what exists* but not *what's wrong* — no comparison
-against a declared spec.
-
-**breadcrumb + clingwrap** — Proves the concept of static graph
-extraction from launch files. The tight coupling to clingwrap's
-non-standard launch API is the primary concern. Static analysis should
-work with standard `launch_ros` patterns.
-
-**nodl** — Dormant since 2022. Correct problem identification but
-fatal flaw: no code generation. Superseded by cake's YAML approach.
-Key lesson: **a description format without code generation is a
-non-starter.**
-
-**ros2_tracing + CARET** — The most mature dynamic analysis tools.
-ros2_tracing is rated Quality Level 1 and CARET is production-proven at
-Tier IV. Both complement rosgraph: tracing provides instrumentation,
-CARET provides latency analysis, rosgraph provides graph structure analysis.
-
-### 9.3 Gap Analysis
-
-| Category | Capability | Current tool | Status |
-|---|---|---|---|
-| **Schema** | Node interface declaration | cake / nodl / gen_param_lib | cake early; nodl dead; gen_param_lib params-only |
-| **Codegen** | Static graph from launch files | breadcrumb + clingwrap | Early-stage, solo dev |
-| **Runtime** | Runtime graph monitoring | graph-monitor | Mid-stage, institutional |
-| **Runtime** | Runtime tracing | ros2_tracing | Mature, production |
-| **Runtime** | Latency analysis | CARET | Mature, Tier IV |
-| **Runtime** | Graph visualisation | Foxglove, Dear RosNodeViewer | Mature but live-only |
-| **Runtime** | **Graph diff (expected vs. actual)** | **Nothing** | **Major gap** |
-| **Static** | **Graph linting (pre-launch)** | **Nothing** | **Major gap** |
-| **Static** | **QoS static analysis** | breadcrumb (partial) | Early-stage |
-| **Static** | **CI graph validation** | **Nothing** | **Major gap** |
-| **Docs** | **Node API documentation** | **Nothing** (hand-written only) | **Major gap** |
-| — | **Behavioural properties** | **Nothing** (HPL was ROS 1) | **Major gap** |
-
----
-
-## 10. Prior Art
-
-Organized by what we borrow, not by framework. Each framework appears
-once at its primary contribution.
-
-### 10.1 Schema Design
-
-#### AsyncAPI
-
-The closest structural match to ROS topics. Version 3 cleanly separates
-channels, operations, and reusable components (including messages).
-
-**What to borrow:**
-- **Structural separation.** `publishers`, `subscribers`, `services`,
-  `actions`, `parameters` as peer top-level sections.
-- **`components` + `$ref` pattern.** Define QoS profiles or common
-  parameter sets once, reference everywhere.
-- **Trait system.** Define a `reliable_sensor` trait with QoS settings,
-  apply to multiple publishers. Traits merge via JSON Merge Patch
-  (RFC 7386).
-- **Protocol bindings.** Core schema stays middleware-agnostic;
-  DDS-specific QoS, Zenoh settings, or shared-memory config in a
-  `bindings:` block.
-- **Parameterized addresses.** Topic name templates
-  (`sensors/{robot_name}/lidar`) map to ROS 2 namespace/remapping and
-  `${param:name}` syntax.
-
-**Gaps:** No services (as typed req/res pair), no actions, no
-parameters, no lifecycle, no timers, no TF frames. Single-application
-scope (which is actually the right scope for a node interface).
-
-#### Smithy (AWS)
-
-Protocol-agnostic interface definition language. Shapes decorated with
-traits.
-
-**What to borrow:**
-- **Typed, composable traits** for extensible metadata — the most
-  powerful metadata mechanism surveyed:
-  ```
-  @qos(reliability: "reliable", depth: 10)
-  @lifecycle(managed: true)
-  @parameter_range(min: 0.0, max: 10.0)
-  @frame_id("base_link")
-  ```
-- **Mixins** for shared structure. A `lifecycle_diagnostics` mixin adds
-  a diagnostics publisher and period parameter to any node that
-  includes it.
-- **Resource lifecycle operations** — maps to ROS 2 lifecycle node
-  transitions.
-
-#### CUE
-
-Constraint-based configuration language where types and values are the
-same thing. Not a codegen tool — a validation tool.
-
-**What to borrow:**
-- **Constraints as types.** `voxel_size: float & >=0.01 & <=1.0`. The
-  JSON Schema equivalent (`minimum`, `maximum`, `enum`) is already
-  used by the existing `interface.schema.yaml`.
-- **Incremental constraints.** Base schema + deployment-specific
-  overlays (e.g., production QoS profiles layered onto a base
-  `interface.yaml`).
-- **Configuration validation.** Validate that launch parameter
-  overrides are compatible with a node's declared interface.
-
-### 10.2 Pipeline & Code Generation
-
-#### Protocol Buffers / Buf CLI
-
-The single most important architectural lesson: **an intermediate
-representation (IR) between parsing and generation**.
-
-```
-interface.yaml ──> [Parser/Validator] ──> InterfaceDescriptor (IR)
-                                            ├──> [Plugin: C++]    ──> scaffolding
-                                            ├──> [Plugin: Python] ──> scaffolding
-                                            ├──> [Plugin: Docs]   ──> API reference
-                                            └──> [Plugin: Launch] ──> templates
-```
-
-**What to borrow:**
-- **IR-based plugin protocol.** Standalone executables consuming a
-  serialized `InterfaceDescriptor` via stdin/file. Community members
-  write `rosgraph-gen-rust` without touching the core codebase.
-- **Config-driven generation** (`buf.gen.yaml` pattern):
-  ```yaml
-  version: 1
-  plugins:
-    - name: cpp
-      out: generated/cpp
-      options: { lifecycle: managed }
-    - name: python
-      out: generated/python
-  ```
-- **Validation as separate layers.** Structural (does the YAML parse?)
-  → semantic (do referenced types exist?) → breaking (did the interface
-  change incompatibly?). Maps to `rosgraph lint`, `rosgraph validate`,
-  `rosgraph breaking`.
-- **Deterministic, reproducible output.** Same inputs → byte-identical
-  output. CI can verify generated code is up to date.
-
-**What to borrow from Buf CLI specifically:**
-- `buf lint` — configurable schema linting with ~50 rules by category.
-  Config-driven rule selection.
-- `buf breaking` — breaking change detection between schema versions.
-- Integrated toolchain: `buf generate`, `buf lint`, `buf breaking`,
-  `buf format` as subcommands of one tool.
-
-#### TypeSpec (Microsoft)
-
-**What to borrow:**
-- **Multi-emitter architecture.** One spec, many outputs:
-  ```
-  interface.yaml ──> C++ emitter       ──> node_interface.hpp
-                 ──> Python emitter    ──> interface.py
-                 ──> Docs emitter      ──> node_api_reference.md
-                 ──> Launch emitter    ──> default_launch.py
-                 ──> Graph emitter     ──> rosgraph_monitor_msgs/NodeInterface
-  ```
-- **Emitter-specific validation.** Each emitter adds its own checks
-  (e.g., C++ emitter warns about names that produce invalid C++
-  identifiers).
-
-#### OpenAPI
-
-**What to borrow:**
-- **The "Swagger UI" experience.** Auto-generated interactive
-  documentation from a schema. A "Swagger UI for ROS nodes" where every
-  node has browsable API docs showing topics, services, actions,
-  parameters, QoS, and message type definitions — generated from
-  `interface.yaml`.
-- **JSON Schema integration.** OpenAPI 3.1 aligned fully with JSON
-  Schema. The existing `interface.schema.yaml` (JSON Schema Draft
-  2020-12) is the right foundation.
-
-### 10.3 Static Analysis Architecture
-
-#### Ruff
-
-A Python linter written in Rust. Relevant not for Python linting but as
-the **best-in-class architecture for building a rule-based analysis
-tool**.
-
-**What to borrow:**
-
-| Ruff pattern | rosgraph equivalent |
-|---|---|
-| Rule enum + compile-time registry | `Rule` enum: `TOP001`, `SRV001`, `QOS001`, `GRF001` |
-| Hierarchical prefix codes | `TOP` (topic), `SRV` (service), `ACT` (action), `QOS`, `GRF` (graph) |
-| Single-pass traversal | Build graph model once, run all rules in one walk |
-| Safe/unsafe fix classification | Safe: add missing QoS. Unsafe: rename topic. Display-only: suggest restructure |
-| Preview → stable lifecycle | Same graduation for new rules |
-| Per-file-ignores | Per-package-ignores, per-launch-file-ignores |
-| Inline suppression | `# rosgraph: noqa: TOP001` |
-| SARIF output | GitHub Security tab integration |
-| Monolithic, no plugins initially | All rules built-in. WASM plugins later |
-| Zero-config defaults | Small, high-confidence default rule set |
-| `--add-noqa` for gradual adoption | Essential for existing ROS workspaces |
-
-**Key architectural lesson:** Speed is an architectural property, not an
-optimisation. Rust + hand-written parser + single-pass + parallel
-package processing + content caching + compile-time codegen.
-
-#### Go Analysis Framework
-
-The gold standard for pluggable static analysis architecture. Used by
-`go vet`, gopls, and golangci-lint.
-
-**What to borrow:**
-
-```
-GraphAnalyzer {
-    name:        str
-    doc:         str
-    requires:    [GraphAnalyzer]      # horizontal deps
-    result_type: Type | None          # typed output for dependent analyzers
-    fact_types:  [Fact]               # cross-package facts
-    run:         (GraphPass) → (result, [Diagnostic])
-}
-
-GraphPass {
-    graph:       ComputationGraph     # the full graph model
-    node:        NodeInterface        # current node under analysis
-    types:       MessageTypeDB        # all known msg/srv/action types
-    qos:         QoSProfileDB         # QoS profiles in the graph
-    result_of:   {Analyzer: Any}      # results from required analyzers
-    report:      (Diagnostic) → void
-    import_fact: (scope, Fact) → bool
-    export_fact: (scope, Fact) → void
-}
-```
-
-Key patterns:
-1. **Analyzers as values, not subclasses** — trivially composable
-2. **Pass as abstraction barrier** — same analyzer in CLI, IDE, CI
-3. **Horizontal dependencies** via `Requires`/`ResultOf` — typed data
-   flow between analyzers
-4. **Vertical facts** for cross-package analysis — cached per-package
-   results enabling separate modular analysis
-5. **Action graph** — 2D grid (analyzer x package), independent actions
-   execute in parallel
-
-#### golangci-lint
-
-**What to borrow:**
-- **Meta-linter pattern.** One CLI, one config, one output format
-  wrapping many analyzers.
-- **Shared parse.** All analyzers share one AST/model parse.
-- **Post-processing pipeline.** `noqa` filter → exclusion rules →
-  severity assignment → deduplication → output formatting.
-- **Differential analysis.** `new-from-merge-base: main` reports only
-  issues in code changed since the base branch. Critical for CI
-  adoption in large codebases.
-
-#### Spectral
-
-**What to borrow:**
-- **YAML-native lint rules** that work directly on `interface.yaml`
-  without language-specific parsing. Custom rulesets in YAML — a
-  robotics engineer can author a rule without knowing Rust or C++.
-  Low barrier to writing new rules.
-
-### 10.4 Runtime Monitoring Architecture
-
-#### OpenTelemetry
-
-Collector pipeline: Receiver → Processor → Exporter. Connectors join
-pipelines and enable signal type conversion.
-
-**What to borrow:**
-- **Pipeline architecture** for `rosgraph monitor`.
-- **Auto-instrumentation.** Two complementary paths:
-  - *Runtime observation* (zero-code): DDS discovery provides the graph
-    without modifying any node.
-  - *Code-generated instrumentation*: rosgraph-generated code embeds
-    topic stats, heartbeats, structured logging.
-  - The **three-way comparison** (declared vs. runtime-observed vs.
-    self-reported) catches issues that any two-way comparison misses.
-
-#### Prometheus
-
-**What to borrow:**
-- **Pull model.** Periodic scraping produces consistent point-in-time
-  snapshots. Absence of data is itself a signal (node is down).
-- **Alerting rules** with `for` durations to prevent flapping.
-- **Metric types mapping:**
-
-  | Prometheus type | ROS topic statistics equivalent |
-  |---|---|
-  | Counter | Messages published (total), dropped messages |
-  | Gauge | Active subscribers, queue depth, alive nodes |
-  | Histogram | Inter-arrival times, message sizes, latency distribution |
-
-#### Kubernetes Controllers
-
-**What to borrow:**
-- **Level-triggered reconciliation** (not edge-triggered). React to the
-  *current difference* between desired and actual state, not to
-  individual change events. If an event is missed, the next
-  reconciliation still catches the drift.
-- **Idempotent.** Running reconciliation twice with the same state
-  produces the same diff and alerts.
-- **Requeue with backoff.** After detecting drift, recheck sooner (1s).
-  If drift persists, escalate.
-- **Status reporting.** Maintained separately from the declared spec,
-  enabling external tools to query current state independently.
-
-### 10.5 Contract Testing & Verification
-
-| Framework | What it does | What to borrow for `rosgraph test` |
-|---|---|---|
-| **Schemathesis** | Fuzz a live API against its OpenAPI spec. Auto-generates test cases from schema. | Fuzz a running node against `interface.yaml` — auto-generate messages matching declared types, verify outputs. |
-| **Dredd** | Start a live server, send requests matching the spec, validate responses. The spec IS the test plan. | Run a node, systematically verify its interface matches declaration. Call every service, check every publisher. |
-| **Pact** | Consumer-driven contract testing. Consumer declares expectations; provider verifies. | Cross-node contract verification: Node A subscribes to `/cmd_vel` (Twist), Node B publishes it. Verify they agree on type. |
-| **gRPC health + reflection** | Standardized health checking + runtime introspection of services/methods. | Health reporting interface that rosgraph-generated nodes expose automatically. Runtime introspection vs. declared interface. |
-| **graphql-inspector** | Schema diff (breaking/dangerous/safe). Coverage: which fields are actually queried. | Interface coverage: "which declared topics are exercised in tests?" Schema diff between interface versions. |
-
-### 10.6 ROS Domain Prior Art: HAROS
-
-The High-Assurance ROS framework (University of Minho, 2016–2021). The
-only tool that accomplished Goals 3–4 for ROS, but only for ROS 1.
-
-**Pipeline:** Package discovery → CMake parsing → launch file parsing →
-source code parsing (libclang for C++, limited Python AST) →
-computation graph assembly → plugin-based analysis → JSON export.
-
-**The metamodel.** Formal classes for the ROS graph: `Node`,
-`NodeInstance`, `Topic`, `Service`, `Parameter`, plus typed link classes
-(`PublishLink`, `SubscribeLink`, etc.) carrying source conditions and
-dependency sets. This metamodel is HAROS's most transferable
-contribution.
-
-**HPL (HAROS Property Language).** Behavioural properties for
-message-passing systems:
-```
-globally: no /cmd_vel {linear.x > 1.0}           # speed limit
-globally: /bumper causes /stop_cmd                 # response
-globally: /cmd_vel requires /trajectory within 5s  # precedence
-```
-
-HPL drove three verification paths from a single spec: model checking
-(Electrum/Alloy), runtime monitors (generated), and property-based
-testing (Hypothesis strategies).
-
-**Why it died for ROS 2.** The extraction pipeline assumes catkin,
-`rospack`, XML launch files, `ros::NodeHandle`. ROS 2 changed
-everything. The maintainer closed ROS 2 support as *wontfix*.
-
-**What to borrow:** Metamodel, HPL's scope+pattern+event structure,
-plugin separation (source-level vs. model-level), one spec → multiple
-verification modes.
-
-**What to do differently:** Use declarations (`interface.yaml`) as
-primary source of truth (not source code parsing); support ROS 2
-concepts HAROS never had (QoS, lifecycle, components, actions, DDS
-discovery).
-
----
-
-## 11. Safety & Certification
-
-rosgraph is not a safety tool — it is a development and verification
-tool that produces artifacts useful in safety cases. This section maps
-rosgraph capabilities to the evidence types required by safety
-standards.
-
-### 11.1 Relevant Standards
-
-| Standard | Domain | How rosgraph helps |
-|---|---|---|
-| **IEC 61508** | General functional safety | Design verification evidence (graph analysis), runtime monitoring |
-| **ISO 26262** | Automotive | Interface specification (`interface.yaml` as design artifact), static verification |
-| **IEC 62304** | Medical device software | Software architecture documentation, traceability |
-| **DO-178C** | Aerospace | Requirements traceability, structural coverage analysis |
-| **ISO 13482** | Service robots | Interface documentation, runtime monitoring |
-| **ISO 21448 (SOTIF)** | Safety of intended functionality | Graph analysis for identifying missing/unexpected interfaces |
-
-### 11.2 Artifact-to-Evidence Mapping
-
-| rosgraph artifact | Evidence type | Useful for |
-|---|---|---|
-| `interface.yaml` | Software architecture description | Design phase documentation |
-| `rosgraph lint` SARIF output | Static analysis results | Verification evidence |
-| `rosgraph monitor` logs | Runtime verification evidence | Validation phase |
-| `rosgraph test` results | Interface conformance evidence | Integration testing |
-| `rosgraph breaking` output | Change impact analysis | Change management |
-| `rosgraph docs` output | API documentation | Design review |
-
-### 11.3 Configurable Safety Levels
-
-Monitor alert grace periods (§8) and severity levels must be
-configurable for safety-critical deployments:
-
-```toml
-[monitor.alerts]
-NodeMissing = { grace_period_ms = 1000, severity = "critical" }   # 1s for surgical robot
-UnexpectedNode = { grace_period_ms = 5000, severity = "error" }
-TopicMissing = { grace_period_ms = 500, severity = "critical" }
-```
-
-The defaults in §8 are tuned for general robotics. Safety-critical
-deployments override them via `rosgraph.toml`.
-
-### 11.4 Behavioral Properties (Future)
-
-Structural analysis (Phase 1–2) proves the graph is correctly wired —
-a necessary precondition for behavioral safety. Behavioral analysis
-(Phase 3+) proves temporal and causal properties:
-
-```
-globally: /emergency_stop causes /motor_disable within 100ms
-globally: no /cmd_vel {linear.x > max_speed}
-globally: /heartbeat absent_for 500ms causes /safe_stop
-```
-
-This capability, inspired by HAROS HPL (§10.6), is where the deeper
-safety value lies. The structural graph model (§3.1) is designed to
-be extensible to behavioral annotations without schema redesign.
-
-### 11.5 Safety-Relevant Lint Rules (Future)
-
-| Rule | Description | Phase |
-|---|---|---|
-| `SAF001` | Critical subscriber has < N publishers (no redundancy) | 2 |
-| `SAF002` | Single point of failure in graph topology | 2 |
-| `SAF003` | Safety-critical node is not lifecycle-managed | 2 |
-| `TF001` | Declared `frame_id` not published by any node in graph | 2 |
-| `TF002` | Frame chain broken (no transform path between declared frames) | 3 |
-
-These rules are not in Phase 1 but the analyzer architecture (§3.5)
-supports adding them without architectural changes.
-
----
-
-## 12. Scope & Limitations
-
-### When Not to Use rosgraph
-
-rosgraph adds value when the cost of interface bugs exceeds the cost
-of maintaining declarations. This trade-off favors rosgraph in
-multi-node systems, team environments, and production deployments.
-It does not favor rosgraph in every context:
-
-- **Quick prototyping.** If you're experimenting with a single node
-  and will throw it away next week, `interface.yaml` is overhead.
-  Use standard `rclcpp` / `rclpy` directly.
-- **Single-node packages.** A package with one node and no
-  cross-package interfaces gets minimal lint value. The code
-  generation may still be worthwhile for parameter validation.
-- **Highly dynamic interfaces.** Nodes that create publishers and
-  subscribers at runtime based on dynamic conditions (e.g., a
-  plugin host that discovers its interface at startup) are outside
-  scope (DP12). rosgraph can declare the static portion and flag
-  the dynamic portion as unexpected, but it cannot generate code
-  for interfaces it doesn't know about at build time.
-
-### Known Limitations
-
-**Spec-code drift for business logic.** Code generation covers the
-structural skeleton (pub/sub creation, parameter declaration, lifecycle
-transitions). Business logic is hand-written. If a developer adds an
-undeclared publisher inside a callback, `rosgraph lint` won't catch it
-at build time — only `rosgraph monitor` flags it at runtime as
-`UnexpectedTopic`. This is a fundamental limitation of any
-declaration-based approach: the declaration describes the intended
-interface, not the implementation.
-
-**Launch file coverage.** Python launch files are Turing-complete.
-AST pattern matching (§3.5) handles common declarative patterns but
-cannot resolve dynamic logic (conditionals based on environment
-variables, loops generating node sets). `system.yaml` (Layer 2) is
-the escape hatch for systems that need full static analyzability.
-
-**Ecosystem bootstrapping.** rosgraph's cross-package analysis (type
-mismatch detection, contract testing) requires multiple packages to
-have `interface.yaml`. The single-package value proposition is code
-generation and parameter validation. Cross-package value grows with
-adoption. `rosgraph discover` (§3.10) lowers the barrier by generating
-specs from running systems, but the generated specs require human
-review and refinement.
-
-**Scope of this proposal.** This document covers 51 features across
-7 subcommands. Not all will be built. Phase 1 (§4) is the commitment
-— the minimum viable tool that delivers value. Later phases are
-contingent on adoption and contributor capacity.
-

From f27121fee8fc4078f5c2cc928ffde14bd9d21f84 Mon Sep 17 00:00:00 2001
From: Luke Sy <sylukewicent@gmail.com>
Date: Fri, 13 Mar 2026 03:47:30 +1100
Subject: [PATCH 4/5] Remove FAQ.md to simplify CLI reference maintenance

Signed-off-by: Luke Sy <sylukewicent@gmail.com>
---
 docs/FAQ.md | 417 ----------------------------------------------------
 1 file changed, 417 deletions(-)
 delete mode 100644 docs/FAQ.md

diff --git a/docs/FAQ.md b/docs/FAQ.md
deleted file mode 100644
index 583cd5c..0000000
--- a/docs/FAQ.md
+++ /dev/null
@@ -1,417 +0,0 @@
-# rosgraph — Frequently Asked Questions
-
-> **Parent:** [ROSGRAPH.md](ROSGRAPH.md) (technical proposal)
-
-Organized by who's asking. Find your perspective, jump to the
-questions that matter to you.
-
----
-
-## Table of Contents
-
-0. [General](#0-general)
-1. [New ROS Developer](#1-new-ros-developer)
-2. [Engineering Lead / System Integrator / DevOps](#2-engineering-lead--system-integrator--devops)
-3. [MoveIt / nav2 / Popular Module User](#3-moveit--nav2--popular-module-user)
-4. [AI-Assisted Developer](#4-ai-assisted-developer)
-5. [Package Maintainer / ROS Governance](#5-package-maintainer--ros-governance)
-6. [Educator / University Researcher](#6-educator--university-researcher)
-7. [Embedded / Resource-Constrained Developer](#7-embedded--resource-constrained-developer)
-8. [The Skeptic](#8-the-skeptic)
-9. [Safety-Critical Engineer](#9-safety-critical-engineer)
-
----
-
-## 0. General
-
-### What problem does rosgraph solve?
-
-When you connect ROS 2 nodes together, mistakes are invisible. If one
-node sends a `Twist` message but another node expects a
-`TwistStamped`, nothing warns you — the subscriber just never receives
-data. If you misspell a topic name in a launch file, the node launches
-fine but sits there doing nothing. You end up staring at
-`ros2 topic list` wondering why nothing is connected.
-
-rosgraph catches these wiring mistakes before you even launch your
-system. You describe what each node publishes, subscribes to, and what
-settings it needs in a short YAML file. Then `rosgraph lint` checks
-that everything fits together — like a spell checker, but for your
-ROS graph.
-
-See [ROSGRAPH.md §1, "The Problem,
-Concretely"](ROSGRAPH.md#the-problem-concretely) for four real-world
-examples.
-
-### How much do I need to learn?
-
-One file per node (`interface.yaml`, about 15 lines) and three
-commands:
-
-```bash
-rosgraph generate .   # creates starter code from your YAML
-rosgraph lint .       # checks for wiring mistakes
-rosgraph monitor      # watches the running system for problems
-```
-
-Your editor will autocomplete the YAML fields for you — no need to
-memorize the format. See the [Quick
-Start](ROSGRAPH.md#quick-start-what-it-looks-like) for a complete
-example.
-
-### What's the overhead?
-
-Per node: one `interface.yaml` file (~15-30 lines). Most of it is
-information you're already specifying in code (topic names, message
-types, QoS settings, parameter names) — `interface.yaml` centralizes
-it.
-
-What you get back:
-- No pub/sub boilerplate (generated)
-- No parameter declaration boilerplate (generated via
-  `generate_parameter_library`)
-- Pre-launch graph validation
-- Runtime graph monitoring
-- Auto-generated API documentation
-
-The net line-count change is typically negative for nodes with
-parameters.
-
-### What about my launch files and parameter configs?
-
-`system.yaml` (Layer 2) overlaps heavily with both — all three
-describe which nodes run, with what parameters, and with what
-remappings. The long-term direction is convergence: `system.yaml`
-becomes the graph spec, the parameter config, *and* the launch
-description in one file. `rosgraph generate` emits a runnable launch
-file from the same spec that `rosgraph lint` validates — no drift
-between what you analyze and what you run.
-
-For projects with multiple deployment configurations (sim, real, test),
-each gets its own `system.yaml`, replacing both the per-config launch
-file and the per-config parameter YAML. See [ROSGRAPH.md
-§3.2](ROSGRAPH.md#32-schema-layers).
-
-### Won't the spec just drift from reality like NoDL?
-
-NoDL died because it was a pure description format — no code
-generation. Maintaining a spec that doesn't produce anything is
-thankless work.
-
-`interface.yaml` generates code. If you change the spec, the generated
-code changes. If you change the code without changing the spec,
-`rosgraph monitor` flags the discrepancy at runtime. The two-way
-binding (codegen + runtime monitoring) is what prevents the drift
-that killed NoDL.
-
-The honest limitation: business logic is hand-written. If a developer
-adds an undeclared publisher inside a callback, `rosgraph lint` won't
-catch it at build time. `rosgraph monitor` catches it at runtime as
-`UnexpectedTopic`. See [ROSGRAPH.md
-§12](ROSGRAPH.md#12-scope--limitations).
-
----
-
-## 1. New ROS Developer
-
-### What does rosgraph do for me?
-
-- **Writes the repetitive code.** Creating publishers, subscribers,
-  and declaring parameters — `rosgraph generate` handles this from
-  your YAML file. You write only the interesting part (what your node
-  actually *does*).
-- **Catches mistakes early.** Mismatched message types, misspelled
-  topic names, incompatible connection settings — found in seconds,
-  not after a 30-second launch-debug-relaunch cycle.
-- **Keeps settings in one place.** Parameter names, types, and default
-  values live in `interface.yaml` instead of scattered across your
-  code, launch files, and README.
-
-### Will error messages make sense?
-
-Yes — this is a design priority. Each error tells you:
-
-- **Where:** which file and line has the problem
-- **What:** a plain description of what's wrong
-- **How to fix it:** a suggested correction, auto-applied when safe
-
-No cryptic stack traces. No silent failures. See [ROSGRAPH.md
-§10.3](ROSGRAPH.md#103-static-analysis-architecture) for the error
-design.
-
----
-
-## 2. Engineering Lead / System Integrator / DevOps
-
-### How does this scale to hundreds of packages?
-
-- **Lint performance target:** 100 packages in under 5 seconds
-  (Design Principle 7). Analysis is single-pass over the graph model
-  with parallel per-package processing and content caching.
-- **Multi-workspace analysis:** Installed `interface.yaml` files in
-  underlays serve as cached facts. Only your workspace is analyzed,
-  not the entire underlay. See [ROSGRAPH.md
-  §3.12](ROSGRAPH.md#312-multi-workspace-analysis).
-- **Differential analysis:** `--new-only` reports only issues
-  introduced since the base branch. No noise from existing code.
-
-### I compose nodes from multiple vendors. How does rosgraph help?
-
-`system.yaml` (Layer 2 schema, [ROSGRAPH.md
-§3.2](ROSGRAPH.md#32-schema-layers)) declares the intended system
-composition — which nodes, which namespaces, which parameter overrides,
-which remappings. `rosgraph lint` validates the composed graph:
-
-- **Type mismatches** across package boundaries
-- **QoS incompatibilities** between a vendor's publisher and your
-  subscriber
-- **Disconnected subgraphs** — nodes that should be connected but
-  aren't due to a namespace or remapping error
-
-If a vendor doesn't ship `interface.yaml`, use `rosgraph discover`
-([ROSGRAPH.md
-§3.10](ROSGRAPH.md#310-rosgraph-discover--runtime-to-spec-generation))
-to generate one from a running instance of the vendor's node.
-
-### How does rosgraph fit into CI?
-
-rosgraph is CI-first by design (Design Principle 8):
-
-```yaml
-# GitHub Actions example
-- name: Lint graph
-  run: rosgraph lint . --output-format sarif --new-only --base main
-
-- name: Check breaking changes
-  run: rosgraph breaking --base main
-
-- name: Run contract tests
-  run: rosgraph test
-```
-
-Output formats: `text`, `json`, `sarif` (GitHub Security tab),
-`github` (Actions annotations), `junit` (test reports). See
-[ROSGRAPH.md §3.11](ROSGRAPH.md#311-configuration).
-
----
-
-## 3. MoveIt / nav2 / Popular Module User
-
-### Does rosgraph work with nav2's plugin system?
-
-Yes, via the mixin system ([ROSGRAPH.md
-§3.2](ROSGRAPH.md#32-schema-layers)). Plugins that inject interfaces
-into a host node are declared as mixins:
-
-```yaml
-# nodes/follow_path/interface.yaml
-node:
-  name: follow_path
-  package: nav2_controller
-
-mixins:
-  - ref: dwb_core/dwb_local_planner
-  - ref: nav2_costmap_2d/costmap
-```
-
-The host's effective interface = its own declaration + all mixin
-interfaces merged. Mixins are Phase 2 (G15). Phase 1 works for nodes
-without plugins.
-
-### What about `generate_parameter_library` compatibility?
-
-Full compatibility is a non-negotiable design principle ([ROSGRAPH.md
-§2, DP9](ROSGRAPH.md#2-design-principles)). The `parameters:` section
-of `interface.yaml` IS the `generate_parameter_library` format. A
-standalone gen_param_lib YAML file works as-is when placed in
-`interface.yaml`. rosgraph delegates to gen_param_lib at build time.
-See [ROSGRAPH.md §9.2](ROSGRAPH.md#92-tool-assessments).
-
----
-
-## 4. AI-Assisted Developer
-
-### How does rosgraph work with AI coding tools?
-
-`interface.yaml` is a machine-readable contract — exactly what LLMs
-are good at consuming and generating. The `InterfaceDescriptor` IR
-([ROSGRAPH.md §3.3](ROSGRAPH.md#33-the-interfacedescriptor-ir)) is a
-JSON blob containing a node's complete API: topics, types, QoS,
-parameters, lifecycle state. An AI agent reads this to understand what
-a node does, generate implementation code, write tests, or suggest
-fixes — without parsing C++ or Python source.
-
-See [ROSGRAPH.md §3.13](ROSGRAPH.md#313-ai--tooling-integration) for
-the full AI integration design.
-
-### Can I use `rosgraph generate` as an agent tool?
-
-Yes. An AI agent writing a ROS node can:
-1. Generate `interface.yaml` from a natural language description
-2. Run `rosgraph generate .` as a tool call to get type-safe
-   scaffolding
-3. Write only the business logic into the generated skeleton
-4. Run `rosgraph lint .` to verify the graph is correct
-
-This avoids the common failure mode of LLMs hallucinating ROS
-boilerplate (wrong QoS defaults, missing component registration,
-incorrect parameter declaration).
-
----
-
-## 5. Package Maintainer / ROS Governance
-
-### Do I have to adopt rosgraph to be compatible with it?
-
-No. Packages without `interface.yaml` are skipped, not errored (Design
-Principle 6). Downstream users can run `rosgraph discover` against your
-running node to generate a spec for their own use. Your package doesn't
-need to ship `interface.yaml` for others to benefit — though shipping
-one is much better, since discovered specs require human review and may
-miss QoS details.
-
-### What's the adoption path toward `ros_core`?
-
-Deliberately incremental ([ROSGRAPH.md §4, "Adoption
-Path"](ROSGRAPH.md#adoption-path)):
-
-1. **`ros-tooling` organization** — institutional backing, CI
-   infrastructure, release process.
-2. **REP for `interface.yaml` schema** — formalizes the declaration
-   format as a community standard, independent of the rosgraph tool.
-3. **docs.ros.org tutorial integration** — if "write your first node"
-   uses `interface.yaml`, every new ROS developer learns it from day
-   one.
-4. **`ros_core` proposal** — after demonstrated adoption across
-   multiple distros.
-
-### Why not extend existing tools instead?
-
-Each existing tool covers one capability but none covers the full
-scope. The gap analysis ([ROSGRAPH.md
-§9.3](ROSGRAPH.md#93-gap-analysis)) shows five major gaps: graph diff,
-graph linting, QoS static analysis, behavioral properties, and CI graph
-validation. No single existing tool can be extended to fill all five.
-
-rosgraph builds on existing work where possible:
-- `generate_parameter_library` for parameters (used as-is)
-- `rosgraph_monitor_msgs` for runtime message definitions (adopted)
-- cake's design decisions for code generation (validated)
-- HAROS's metamodel for the graph model (adapted)
-
----
-
-## 6. Educator / University Researcher
-
-### Can I use rosgraph for teaching ROS 2?
-
-Yes. The Quick Start
-([ROSGRAPH.md §1](ROSGRAPH.md#quick-start-what-it-looks-like))
-shows a complete workflow in 3 commands. For teaching,
-`interface.yaml` forces students to think about their node's API
-before writing implementation code — topics, types, QoS, parameters.
-This is better pedagogy than copy-pasting publisher boilerplate and
-tweaking it.
-
-### How does rosgraph relate to HAROS?
-
-HAROS ([ROSGRAPH.md §10.6](ROSGRAPH.md#106-ros-domain-prior-art-haros))
-was the prior art for graph analysis in ROS — built at the University
-of Minho (2016–2021). rosgraph borrows HAROS's metamodel and HPL
-property language concepts, but differs fundamentally:
-
-- **HAROS extracted interfaces from source code.** rosgraph uses
-  explicit declarations (`interface.yaml`).
-- **HAROS was ROS 1 only.** rosgraph is built for ROS 2 concepts:
-  QoS, lifecycle, components, actions, DDS discovery.
-- **HAROS died because extraction broke.** catkin → ament, rospack →
-  colcon, XML launch → Python launch. Declaration-based tools don't
-  break when the build system changes.
-
----
-
-## 7. Embedded / Resource-Constrained Developer
-
-### Does rosgraph add runtime overhead to my nodes?
-
-The generated code uses a composition pattern (has-a `Node`, not is-a
-`Node`). This adds one pointer indirection — single nanoseconds. The
-generated pub/sub wrappers are thin forwarding calls. No virtual
-dispatch is added beyond what the ROS client library already uses.
-
-Parameter validation (via `generate_parameter_library`) runs at
-parameter-set time, not in the hot path. See [ROSGRAPH.md §3.4,
-"Design decisions"](ROSGRAPH.md#34-rosgraph-generate--code-generation).
-
-### Does `rosgraph monitor` run on the robot?
-
-Yes, but it's optional. `rosgraph monitor` is a separate process — it
-doesn't instrument or modify your nodes. If your platform can't spare
-the resources, don't run it. You still get full value from build-time
-tools (`rosgraph generate`, `rosgraph lint`).
-
-Runtime targets ([ROSGRAPH.md
-§3.14](ROSGRAPH.md#314-scale--fleet-considerations)):
-- Memory: < 50MB resident
-- CPU: < 5% of one core at steady-state (5s scrape interval)
-- No additional DDS traffic beyond standard discovery
-
----
-
-## 8. The Skeptic
-
-### This proposal has 51 features. Is this realistic?
-
-Phase 1 ([ROSGRAPH.md §4](ROSGRAPH.md#4-phasing)) is the commitment:
-~12 features covering core schema, basic code generation, and
-highest-value lint and monitor rules. Later phases are contingent on
-adoption.
-
-The tool builds on existing work — cake for code generation,
-`generate_parameter_library` for parameters, `graph-monitor` message
-definitions for runtime. Phase 1 is stabilizing and unifying existing
-pieces, not building from scratch.
-
-### When should I NOT use rosgraph?
-
-- **Quick prototyping** — single throwaway node, not worth the file.
-- **Single-node packages** — minimal lint value, though codegen may
-  still save boilerplate.
-- **Highly dynamic interfaces** — nodes that create/destroy publishers
-  at runtime based on conditions can't be fully declared.
-
-See [ROSGRAPH.md §12, "When Not to Use
-rosgraph"](ROSGRAPH.md#when-not-to-use-rosgraph).
-
----
-
-## 9. Safety-Critical Engineer
-
-### Does rosgraph help with certification?
-
-rosgraph is not a safety tool — it's a development and verification
-tool that produces artifacts useful in safety cases. See [ROSGRAPH.md
-§11](ROSGRAPH.md#11-safety--certification).
-
-Key artifacts:
-
-| rosgraph artifact | Evidence type |
-|---|---|
-| `interface.yaml` | Software architecture description |
-| `rosgraph lint` SARIF output | Static analysis results |
-| `rosgraph monitor` logs | Runtime verification evidence |
-| `rosgraph test` results | Interface conformance evidence |
-| `rosgraph breaking` output | Change impact analysis |
-
-### What about behavioral properties?
-
-Phase 1-2 covers structural properties: type matches, QoS
-compatibility, graph connectivity. Behavioral analysis (Phase 3+) adds
-temporal and causal properties, inspired by HAROS HPL:
-
-```
-globally: /emergency_stop causes /motor_disable within 100ms
-globally: /heartbeat absent_for 500ms causes /safe_stop
-```
-
-See [ROSGRAPH.md §11.4](ROSGRAPH.md#114-behavioral-properties-future).

From 309c4046de2d61290dcb17d49a4b3513f440eb31 Mon Sep 17 00:00:00 2001
From: Luke Sy <sylukewicent@gmail.com>
Date: Fri, 20 Mar 2026 02:09:03 +1100
Subject: [PATCH 5/5] Address alistair-english PR feedback on ROSGRAPH.md

- Add topic rename as a concrete problem example
- Reframe discovery vs monitoring as same mechanism, different cadence
- Collapse Key Insights to the one load-bearing point (codegen)
- Remove node/package keys from interface.yaml example
---
 docs/ROSGRAPH.md | 39 ++++++++++++++-------------------------
 1 file changed, 14 insertions(+), 25 deletions(-)

diff --git a/docs/ROSGRAPH.md b/docs/ROSGRAPH.md
index 9f452e1..9bdfb95 100644
--- a/docs/ROSGRAPH.md
+++ b/docs/ROSGRAPH.md
@@ -36,6 +36,9 @@ Today in ROS 2:
 - You rename a parameter. Three launch files reference the old name.
   `colcon build` succeeds. The system launches. The parameter silently
   takes its default value.
+- You rename a topic from `/cmd_vel` to `/cmd`. Several downstream
+  nodes subscribed to the old name silently receive nothing. There is
+  no static analysis to tell you what depended on it.
 
 These are real, common bugs in production ROS 2 systems.
 
@@ -58,8 +61,10 @@ are designed as independent, composable libraries.
 3. **Runtime Discovery** — introspect a running system and produce NoDL
    specs from observed nodes. Enables brownfield adoption: point at an
    existing system, generate `interface.yaml` files for every node, then
-   iteratively refine them. Unlike runtime monitoring (component 5),
-   discovery is a one-time migration tool, not a continuous process.
+   iteratively refine them. Discovery and runtime monitoring (component 5)
+   share one mechanism: observe the live graph, produce a spec, and diff
+   it against the declared spec. The distinction is cadence: one-time
+   migration vs. continuous verification.
 
 4. **Node-level Unit Testing** — verify a single node conforms to its
    declared spec in isolation.
@@ -76,27 +81,14 @@ are designed as independent, composable libraries.
 
 > **Open question:** implementation language for the generator tooling.
 
-### Key Insights
+### Key Insight
 
-Three key insights drive the design:
-
-1. **The ROS computation graph is not source code — it is a typed,
-   directed graph with QoS-annotated edges.** Analysis tools should
-   operate on a graph model, not on ASTs. Source code parsing is a
-   loader that feeds the model, not the analysis target.
-
-2. **Verification and analysis are schema conformance problems**
-   ("does reality match the spec?"), not traditional program analysis.
-   Once you have a machine-readable spec (`interface.yaml`),
-   verification falls out naturally — the same pattern as `buf lint`,
-   Pact contract tests, and Kubernetes reconciliation.
-
-3. **A declaration without code generation is a non-starter.** NoDL
-   proved this. The schema must generate code, documentation, and
-   validation to stay in sync with reality. `interface.yaml` is
-   simultaneously the source for code generation, the lint target for
-   static analysis, the contract for runtime verification, and the
-   reference for documentation.
+**A declaration without code generation is a non-starter.** NoDL
+proved this. The schema must generate code, documentation, and
+validation to stay in sync with reality. `interface.yaml` is
+simultaneously the source for code generation, the lint target for
+static analysis, the contract for runtime verification, and the
+reference for documentation.
 
 ### Example
 
@@ -104,9 +96,6 @@ A minimal `interface.yaml`:
 
 ```yaml
 schema_version: "1.0"
-node:
-  name: talker
-  package: demo_pkg
 
 publishers:
   - topic: ~/chatter