Summary
Add a `forge.yaml` `observability.tags` block for static deployment metadata (team, cost center, environment, region, owner, deploy SHA, etc.) and propagate the same KVs onto both OTel resource attributes (every span) and audit events (new top-level `tags` field). Optionally stamp inbound W3C `Baggage` header values into `audit.fields.baggage` so per-request metadata (tenant, case_id, priority) is filterable in both observability streams.
This closes a small but real gap: operators today can pivot trace ↔ audit by `trace_id`, but can't filter audit by "team" or "tenant" without joining against trace data. After this, the same KVs are present in both streams natively.
Background — what Forge has today
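Already wired:
- Composite propagator (`TraceContext{}` + `Baggage{}`) installed at startup (`forge-core/runtime/tracing.go`). Inbound `Baggage` header flows through ctx automatically.
- `OTEL_RESOURCE_ATTRIBUTES` env passthrough in the curated allowlist (#182 / #183). If the operator sets it manually, both the Forge Go runtime and any subprocess inherit it.
- `org_id`, `workspace_id`, `entity_id`, `entity_type` flow from env / per-request headers (FWS-7).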
Gaps:
- No ergonomic forge.yaml block — operators hand-write `OTEL_RESOURCE_ATTRIBUTES=team=platform,cost_center=ABC-123,...` as a comma-delimited env string.
- Baggage values aren't stamped onto audit events, so trace ↔ audit cross-filter on "team" / "tenant" doesn't work natively. You can pivot by `trace_id` but not by the tag itself.
Design
Three OTel mechanisms, picked by metadata lifetime
| Lifetime | Mechanism | Example |
|---|---|---|
| Static per deployment | OTel resource attributes | team, cost_center, environment, region, deploy_sha |
| Dynamic per request | OTel baggage | tenant, case_id, priority |
| Per operation | Span attributes (code-coupled) | retry attempt #, chosen model version |
This issue adds declarative config for the first two. Span attributes are out of scope (operators who need them write Go code).
forge.yaml shape

    observability:
      tags:                              # static deployment metadata
        team: platform
        cost_center: ABC-123
        environment: production
        region: us-east-1
        owner: alice@example.com
        deploy_sha: 7d2f1ab
      tracing:
        enabled: true
        propagate_baggage_to_audit: true # NEW: stamp baggage KVs into audit.fields.baggage; default false (opt-in)

At runtime Forge:
- Builds `OTEL_RESOURCE_ATTRIBUTES` from `observability.tags` (k=v,k=v,...) and appends to any value already in the operator's env (preserve operator overrides; don't overwrite). Strands and Node sidecars inherit via the curated OTEL_* passthrough.
- Stamps `tags` on every `AuditEvent` (new top-level field, `omitempty`, identical `map[string]string` shape). One audit-side helper `AuditLogger.WithTags(map[string]string)` mirrors the existing `WithTenancy` pattern from FWS-7.
- Extracts inbound `Baggage` header via the composite propagator (already happens). When `propagate_baggage_to_audit: true`, marshals baggage KVs onto `audit.fields.baggage` for the request scope. Default off — keeps audit JSON byte-stable for pre-fix consumers.
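The env-append step in the first bullet can be sketched as below. This is a minimal illustration, not the proposed implementation: `mergeResourceAttrs` is a hypothetical helper name, and skipping tag keys the operator already set is one assumed reading of "preserve operator overrides". Values are percent-encoded so a comma inside a tag value cannot break the `k=v,k=v` framing.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// mergeResourceAttrs appends tags to an existing OTEL_RESOURCE_ATTRIBUTES
// value. Keys already present in existing are skipped so the operator's
// value wins (an assumption about what "preserve overrides" means).
func mergeResourceAttrs(existing string, tags map[string]string) string {
	seen := make(map[string]bool)
	var parts []string
	for _, pair := range strings.Split(existing, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		key, _, _ := strings.Cut(pair, "=")
		seen[strings.TrimSpace(key)] = true
		parts = append(parts, pair)
	}

	// Sort keys so the resulting string is deterministic across restarts.
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		if seen[k] {
			continue
		}
		// PathEscape percent-encodes commas and spaces in the value.
		parts = append(parts, k+"="+url.PathEscape(tags[k]))
	}
	return strings.Join(parts, ",")
}

func main() {
	tags := map[string]string{"team": "platform", "cost_center": "ABC-123"}
	// Operator already exported foo=bar and their own team value.
	fmt.Println(mergeResourceAttrs("foo=bar,team=infra", tags))
	// prints: foo=bar,team=infra,cost_center=ABC-123
}
```

Sorting the keys keeps the env string stable between restarts, which makes `TestObservabilityTags_AppendedToExistingEnv` straightforward to assert against.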
Cardinality + size guardrails
- Resource attributes are stamped on every span, so high-cardinality values explode index sizes in Tempo/Honeycomb/Datadog. Validate at config-load time: ≤ ~12 known values per tag key. Document the trade-off.
- Baggage rides in an HTTP header. The W3C Baggage spec only requires platforms to propagate up to 8192 bytes total and 64 entries; the ≤ 256-byte per-value cap is Forge's own limit, not part of the spec. Forge rejects inbound `Baggage` over the header limit with a typed error (do NOT silently truncate, because that breaks downstream services that did size-check correctly).
- Audit `tags` budget: default 4 KB, hard cap 16 KB. Tag-explosion attempts (huge multi-value tags) are truncated and flagged with a `tags_truncated: true` field so consumers can detect the truncation.
- PII concerns — baggage propagates downstream uncontrolled. Document that operators should use opaque IDs (`case_id=CASE-7821`), not raw PII. Add a `tags`-side warning in the validator when a value looks email-shaped.
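A rough sketch of the config-load checks described above, under stated assumptions: `validateTags`, the error variables, and both regular expressions are hypothetical, the key pattern is chosen only so that the example keys (`team`, `cost_center`, `deploy_sha`) pass, and the total budget is measured as the sum of key and value bytes. Email-shaped values produce a warning rather than an error, matching the PII bullet.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

const (
	maxValueBytes = 256      // per-value cap from the proposal
	maxTotalBytes = 4 * 1024 // default tags budget from the proposal
)

var (
	// keyPattern is an assumed key grammar, not a confirmed Forge rule.
	keyPattern = regexp.MustCompile(`^[a-z][a-z0-9_]*$`)
	// emailPattern is a loose heuristic for "looks email-shaped".
	emailPattern = regexp.MustCompile(`^[^@\s]+@[^@\s]+\.[^@\s]+$`)

	ErrBadKey        = errors.New("tag key is not lowercase snake_case")
	ErrValueTooLarge = errors.New("tag value exceeds 256 bytes")
	ErrOverBudget    = errors.New("tags exceed total size budget")
)

// validateTags returns a typed error for hard violations and a list of
// warnings for values that look like email addresses.
func validateTags(tags map[string]string) (warnings []string, err error) {
	total := 0
	for k, v := range tags {
		if !keyPattern.MatchString(k) {
			return nil, fmt.Errorf("%w: %q", ErrBadKey, k)
		}
		if len(v) > maxValueBytes {
			return nil, fmt.Errorf("%w: %q", ErrValueTooLarge, k)
		}
		if emailPattern.MatchString(v) {
			warnings = append(warnings, fmt.Sprintf("tag %q looks like an email address", k))
		}
		// Budget is counted as key bytes plus value bytes (an assumption).
		total += len(k) + len(v)
	}
	if total > maxTotalBytes {
		return nil, ErrOverBudget
	}
	return warnings, nil
}

func main() {
	warnings, err := validateTags(map[string]string{
		"team":  "platform",
		"owner": "alice@example.com",
	})
	fmt.Println(warnings, err)
}
```

Wrapping the sentinel errors with `%w` lets callers match on the violation type with `errors.Is` while still seeing which key failed.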
Concept separation (don't conflate fields)
| Concept | Field | Use for |
|---|---|---|
| Hard tenancy identity | `org_id`, `workspace_id` | SIEM partitioning, tenant boundary |
| Agent identity | `entity_id`, `entity_type` | Which deployed agent emitted this |
| Workflow correlation | `workflow_id`, `stage_id`, `step_id`, `invocation_caller` | FWS-2 orchestrator coordination |
| Free-form deployment metadata | `tags` (new) | Team / cost / env / region / owner / SHA |
| Per-request operator metadata | `fields.baggage` (new) | Tenant / case / priority, varies per call |
Tags = static, low-cardinality, deployment-defined. Baggage = dynamic, per-request, propagates downstream.
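To illustrate the per-request side, here is a sketch of how baggage could land under `fields.baggage` only when the flag is on. It is deliberately simplified: the real path would read `baggage.FromContext(ctx)` from the OTel propagator, whereas `parseBaggage` and `auditFields` below are hypothetical stdlib-only stand-ins that parse the header by hand and drop member properties.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// parseBaggage is a simplified stand-in for the OTel baggage propagator.
// It keeps key=value members and drops any ";property" suffix.
func parseBaggage(header string) map[string]string {
	out := make(map[string]string)
	for _, member := range strings.Split(header, ",") {
		kv, _, _ := strings.Cut(member, ";")
		k, v, ok := strings.Cut(kv, "=")
		if !ok {
			continue
		}
		decoded, err := url.PathUnescape(strings.TrimSpace(v))
		if err != nil {
			continue // skip members with malformed percent-encoding
		}
		out[strings.TrimSpace(k)] = decoded
	}
	return out
}

// auditFields builds the "fields" object of an audit event. The baggage
// key is only added when the opt-in flag is set and baggage is present.
func auditFields(header string, propagate bool) map[string]any {
	fields := map[string]any{}
	if !propagate {
		return fields
	}
	if bag := parseBaggage(header); len(bag) > 0 {
		fields["baggage"] = bag
	}
	return fields
}

func main() {
	header := "tenant=acme,case_id=CASE-7821,priority=high;ttl=30"
	on, _ := json.Marshal(auditFields(header, true))
	off, _ := json.Marshal(auditFields(header, false))
	fmt.Println(string(on))  // {"baggage":{"case_id":"CASE-7821","priority":"high","tenant":"acme"}}
	fmt.Println(string(off)) // {}
}
```

With the flag off the `baggage` key is never created, which is what keeps the audit JSON unchanged for existing consumers.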
Implementation scope
- `forge-core/types/config.go` — `ObservabilityConfig.Tags map[string]string` + `Tracing.PropagateBaggageToAudit bool` fields.
- `forge-core/validate/forge_config.go`: validate keys (lowercase snake_case, as in the `forge.yaml` example, e.g. `cost_center`; non-PII-shaped), values (≤ 256 bytes), total size (≤ 4 KB), cardinality hint.
- `forge-cli/runtime/runner.go` — at startup, build `OTEL_RESOURCE_ATTRIBUTES` from tags, append to existing env value; call `auditLogger.WithTags(tags)`.
- `forge-core/runtime/audit_schema.go`: ``AuditEvent.Tags map[string]string `json:"tags,omitempty"` ``; `AuditLogger.WithTags(...)` helper.
- `forge-core/runtime/audit.go` — when `PropagateBaggageToAudit` set, `EmitFromContext` reads `baggage.FromContext(ctx)` and marshals into `fields.baggage`.
- Schema version stays at `"1.0"` per FWS-8 policy (additive optional fields).
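The audit-side change might look roughly like the sketch below. Only the `Tags` field, its struct tag, and the `WithTags` name come from this proposal; the other `AuditEvent` fields, the `Emit` method, and the copy-on-`WithTags` behaviour are hypothetical, since the existing `WithTenancy` implementation is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AuditEvent is a trimmed, hypothetical version of the real schema; only
// the Tags field and its struct tag come from the proposal.
type AuditEvent struct {
	SchemaVersion string            `json:"schema_version"`
	Event         string            `json:"event"`
	Tags          map[string]string `json:"tags,omitempty"`
}

// AuditLogger stamps its configured tags onto every event it emits.
type AuditLogger struct {
	tags map[string]string
}

// WithTags returns a copy of the logger carrying a private copy of tags,
// so later mutation of the caller's map cannot change emitted events.
func (l AuditLogger) WithTags(tags map[string]string) AuditLogger {
	cp := make(map[string]string, len(tags))
	for k, v := range tags {
		cp[k] = v
	}
	l.tags = cp
	return l
}

// Emit serialises one event. omitempty drops the tags key for both nil
// and empty maps, so untagged output keeps its previous shape.
func (l AuditLogger) Emit(event string) string {
	ev := AuditEvent{SchemaVersion: "1.0", Event: event, Tags: l.tags}
	b, _ := json.Marshal(ev)
	return string(b)
}

func main() {
	base := AuditLogger{}
	fmt.Println(base.Emit("invoke"))
	// {"schema_version":"1.0","event":"invoke"}
	fmt.Println(base.WithTags(map[string]string{"team": "platform"}).Emit("invoke"))
	// {"schema_version":"1.0","event":"invoke","tags":{"team":"platform"}}
}
```

Because `encoding/json` omits empty maps under `omitempty`, an operator who sets no tags gets the same bytes as before, which is the property the Risk section relies on.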
Tests
- `TestObservabilityTags_StampedOnAuditEvents` — every event in a fake invocation carries the configured tags.
- `TestObservabilityTags_StampedOnSpans` — verify resource attributes pass through (via `OTEL_RESOURCE_ATTRIBUTES`) using a `tracetest.SpanRecorder`.
- `TestObservabilityTags_AppendedToExistingEnv` — operator-set `OTEL_RESOURCE_ATTRIBUTES=foo=bar` not clobbered; Forge appends.
- `TestObservabilityTags_ValidatedAtConfigLoad` — bad keys, oversized values, total budget exceed cases each return a typed config error.
- `TestBaggagePropagation_FlowsIntoAuditFields` — inbound `Baggage` header lands in `audit.fields.baggage` when `propagate_baggage_to_audit: true`.
- `TestBaggagePropagation_OffByDefault` — same case with the flag off; `fields.baggage` absent (back-compat pin).
- `TestBaggagePropagation_RejectsOversizedHeader` — 8 KB+ baggage header returns 400, no silent truncation.
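As a sketch of what `TestBaggagePropagation_RejectsOversizedHeader` would exercise, the middleware below answers 400 once the `Baggage` header passes 8192 bytes and otherwise forwards the request untouched. The names `rejectOversizedBaggage` and `statusFor` and the `/invoke` path are hypothetical; the real handler wiring and typed error are not shown in this issue.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxBaggageBytes = 8192 // total header budget cited in the proposal

// rejectOversizedBaggage answers 400 instead of truncating when the
// combined Baggage header values exceed the budget.
func rejectOversizedBaggage(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		total := 0
		for _, v := range r.Header.Values("Baggage") {
			total += len(v)
		}
		if total > maxBaggageBytes {
			http.Error(w, "baggage header exceeds 8192 bytes", http.StatusBadRequest)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request with the given Baggage value through the
// middleware and returns the response status code.
func statusFor(baggage string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/invoke", nil)
	req.Header.Set("Baggage", baggage)
	rec := httptest.NewRecorder()
	rejectOversizedBaggage(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("tenant=acme"))                                  // 200
	fmt.Println(statusFor("blob=" + strings.Repeat("x", maxBaggageBytes))) // 400
}
```

Rejecting outright, rather than trimming entries, means a downstream service never receives a baggage set that differs from what the caller sent.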
Docs
- `docs/security/audit-logging.md` — new "Tags" section + "Baggage propagation" subsection. Update event-field table to include `tags` and `fields.baggage`.
- `docs/core-concepts/observability-tracing.md` — new "Resource attributes via observability.tags" subsection; cross-link to baggage handling in "End-to-end propagation".
- `docs/reference/forge-yaml-schema.md` — `observability.tags` shape + `tracing.propagate_baggage_to_audit` flag.
- Recipe: how operators search by tag in Tempo / Honeycomb / Datadog / Grafana.
Out of scope
- Span attributes from config. Per-operation metadata stays in code (it's by definition not declarative).
- Tag-based egress / auth decisions. Tags are observability metadata only; never an authorization signal.
- Auto-derivation of tags from K8s downward API. Operators can wire that via env interpolation in forge.yaml if they want (`team: ${POD_LABEL_TEAM}`); first-class K8s integration is a separate enhancement.
- Per-tag access control (which audit consumers see which tags). Tags are uniformly visible to anyone with access to the audit stream. A redaction layer would be its own feature.
Risk
Low. Both new fields are additive optional and `omitempty` — pre-fix audit consumers see byte-identical JSON when the operator sets no tags and leaves the baggage flag off. Schema version unchanged. The only behavior change is an extra `OTEL_RESOURCE_ATTRIBUTES` env entry, which OTel SDKs handle natively.