Merged
50 changes: 50 additions & 0 deletions docs/LEARNING_LOG.md
@@ -22,6 +22,56 @@ This file should be updated by Codex after each meaningful change.
### What to learn next
```

## 2026-05-23 - Production protocol connector ADR

### What changed

Added ADR 0002 for the production protocol connector architecture and safety
boundary. The ADR defines a read-only-by-default approach for OPC-UA, MQTT, and
BACnet connection management before schema, API, adapter, or Workbench UI work
starts.

### Why it matters

Production protocol connections may point at real industrial networks, so the
project needs a clear boundary before implementation. The ADR keeps connector
work focused on safe source reads, secret redaction, normalized FactoryEvents,
auditable diagnostics, and explicit non-goals for writeback or production
validation claims.

### How it works

Connection profiles will be backend-owned, validated before use, and redacted in
browser-facing responses. Test connection behavior is limited to read-only
diagnostics, and connector output must flow through ingestion and the shared
FactoryEvent contracts before downstream services consume it.

### How to run it

Read the ADR at:

```text
docs/decisions/0002-production-protocol-connector-architecture.md
```

### How to test it

```bash
.venv/bin/python -m pytest services/simulator/tests/test_production_connector_adr_docs.py
```

### Key files

- `docs/decisions/0002-production-protocol-connector-architecture.md`
- `docs/decisions/README.md`
- `services/simulator/tests/test_production_connector_adr_docs.py`

### What to learn next

Use the ADR to implement the shared connection profile schema without adding
protocol adapter behavior, UI forms, or writeback controls before the safety
boundary is covered by tests.

## 2026-05-23 - OPC UA demo ingestion worker

### What changed
193 changes: 193 additions & 0 deletions docs/decisions/0002-production-protocol-connector-architecture.md
@@ -0,0 +1,193 @@
# ADR 0002: Production Protocol Connector Architecture

## Status

Accepted

## Date

2026-05-23

## Context

The Factory Intelligence Platform is moving beyond the simulator-only MVP
toward configurable protocol connections for OPC-UA, MQTT, and BACnet. These
connections may eventually point at real industrial networks, so the connector
boundary has to be decided before schema, API, adapter, or Workbench UI work
starts.

The current platform already has a safe demo OPC UA ingestion worker that polls
a local simulator-backed namespace and writes normalized `FactoryEvent` records.
That worker is intentionally not a production connector. The production
connector direction needs a broader boundary that supports protocol connection
profiles, read-only health checks, source-to-event mapping, auditability, and
safe operator-console diagnostics without implying equipment control or
production validation.

Related issues:

- #206: Protocol connection management and operator console
- #210: Production protocol connector architecture and safety boundary

## Decision

Production protocol connector work will use a read-only-by-default connector boundary for
OPC-UA, MQTT, and BACnet.

Connection profiles will be owned by the Factory Intelligence Platform backend,
validated before use, and exposed to the Workbench only through redacted API
responses. Profiles may describe endpoints, protocol-specific settings, polling
or subscription behavior, mapping references, lifecycle state, and secret or
certificate references, but they must not expose raw credentials, private keys,
or certificate bodies to browser clients.

The connector lifecycle will use explicit states:

- `draft`: saved but incomplete or not ready to use.
- `disabled`: valid enough to keep, but not active.
- `testable`: eligible for a read-only test connection action.
- `healthy`: last read-only test succeeded.
- `degraded`: last read-only test partially succeeded or mapping is incomplete.
- `failed`: last read-only test failed.
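
As an illustration only, the lifecycle states above could be modeled as an enumeration. The class name `ConnectorState`, the helper `state_after_test`, and the outcome labels `success`, `partial`, and `failure` are assumptions made for this sketch; the ADR does not define them.

```python
from enum import Enum


class ConnectorState(str, Enum):
    """Lifecycle states listed in this ADR (the class name is hypothetical)."""

    DRAFT = "draft"
    DISABLED = "disabled"
    TESTABLE = "testable"
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    FAILED = "failed"


# Assumed outcome labels for a read-only test; the ADR does not name them.
_OUTCOME_TO_STATE = {
    "success": ConnectorState.HEALTHY,
    "partial": ConnectorState.DEGRADED,
    "failure": ConnectorState.FAILED,
}


def state_after_test(outcome: str) -> ConnectorState:
    """Map the outcome of the last read-only test to a lifecycle state."""
    return _OUTCOME_TO_STATE[outcome]
```

Subclassing `str` keeps the state values JSON-friendly, which may suit redacted API responses, though the real schema could choose differently.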

The first production connector implementation path is:

1. Define shared connection profile schemas and redacted response shapes.
2. Add backend profile storage and API endpoints.
3. Add read-only test connection behavior with structured health results.
4. Add read-only protocol adapters that emit normalized `FactoryEvent` records.
5. Add Workbench connection management, protocol diagnostics, and source/tag
browser views over the backend APIs.

## Safety Boundary

Protocol adapters must be read-only unless a future ADR explicitly approves a
write path. The production connector work covered by this decision must not:

- write to PLCs, DCS, SCADA, OPC UA nodes, MQTT topics, or BACnet objects;
- change machine parameters;
- perform arbitrary tag writes;
- release, quarantine, or otherwise disposition product;
- create, close, or update QMS/MES records;
- create production CAPA records;
- claim production readiness or validated status without separate evidence.

Connector output must pass through the ingestion layer and unified
`FactoryEvent` contracts before Process Sentinel, evidence timelines,
recommendations, RCA/CAPA drafts, or Workbench views consume it.

## Secret and Certificate Boundary

Connection profiles may reference secrets and certificates by stable reference
names. They must not store or return raw secret material in JSON profile
payloads.

Initial local development may use placeholder reference values and environment
variables, but production-style behavior must preserve these rules:

- Browser-facing API responses return only redacted secret or certificate
metadata.
- Logs must not include credentials, tokens, private keys, or certificate
bodies.
- Missing, expired, or unauthorized secret references produce readable
connection health errors.
- `.env.example` may document variable names, but real plant credentials and
production connection strings must not be committed.
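
A minimal sketch of the reference-only rule, assuming local development resolves references from environment variables. `SecretRef`, `redacted_secret_metadata`, and the metadata keys `ref` and `configured` are hypothetical names, not part of any agreed schema.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SecretRef:
    """Stable reference to a secret; it holds a name, never the secret value."""

    name: str


def redacted_secret_metadata(ref: SecretRef, env: Mapping[str, str]) -> dict:
    """Return browser-safe metadata about a secret reference.

    Assumption: the reference name doubles as an environment variable name
    during local development. The resolved value is never returned.
    """
    return {"ref": ref.name, "configured": bool(env.get(ref.name))}
```

Passing the environment in as a mapping, rather than reading `os.environ` directly, keeps the function easy to test with fake values.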

## Test Connection Behavior

The test connection action is a read-only diagnostic. It may perform only the
minimum protocol-specific reads needed to verify reachability, authentication,
authorization, and mapping readiness.

Allowed examples:

- OPC-UA: connect, validate security mode, read configured nodes or browse only
the configured namespace boundary.
- MQTT: connect to the broker, subscribe to configured topic filters, and
inspect sample messages without publishing commands.
- BACnet: read configured device/object properties without writing commandable
properties.

The test result should include protocol, connection ID, status, checked-at time,
duration, readable error code/message, and redacted diagnostic details.
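
One possible shape for that result, sketched as a frozen dataclass. The field names, the millisecond unit for duration, and the `to_response` helper are assumptions for illustration; the shared schema work in the follow-up issues would define the real contract.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConnectionTestResult:
    """Hypothetical result of a read-only test connection action."""

    protocol: str
    connection_id: str
    status: str
    checked_at: datetime
    duration_ms: float  # assumed unit; the ADR only says "duration"
    error_code: Optional[str] = None
    error_message: Optional[str] = None
    details: dict = field(default_factory=dict)  # must already be redacted

    def to_response(self) -> dict:
        """Serialize to a plain dict with an ISO 8601 timestamp."""
        payload = asdict(self)
        payload["checked_at"] = self.checked_at.isoformat()
        return payload
```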

## Audit and Logging Expectations

Connection profile changes and test connection actions should produce
audit-friendly records. At minimum, future implementation issues should capture:

- actor or local user identity when available;
- connection profile ID and protocol;
- action type, such as create, update, disable, enable, or test;
- timestamp and result status;
- redacted error details when tests fail.

Audit records must describe connector configuration and diagnostics only. They
must not imply electronic signatures, validated production audit trails, or
approved industrial actions.
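
The minimum audit fields listed above might be captured in a record like the following sketch. `ConnectorAuditRecord` and its field names are hypothetical, and treating the listed action types as a closed set is an assumption: the ADR says "such as", so an implementation may allow more.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumption: the action types named in the ADR, treated here as a closed set.
ALLOWED_ACTIONS = frozenset({"create", "update", "disable", "enable", "test"})


@dataclass(frozen=True)
class ConnectorAuditRecord:
    """Hypothetical audit entry for connector configuration and diagnostics."""

    profile_id: str
    protocol: str
    action: str
    result_status: str
    timestamp: datetime
    actor: Optional[str] = None  # identity is recorded only when available
    redacted_error: Optional[str] = None  # populated when a test fails

    def __post_init__(self) -> None:
        # Reject action types outside the configuration and diagnostics set.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unsupported audit action: {self.action!r}")
```

Rejecting unknown actions at construction time is one way to keep write-style actions out of the audit stream by default.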

## Options Considered

### Option 1: Demo-only protocol controls

- Pros: Fastest continuation from the existing OPC UA simulator work.
- Cons: Does not satisfy the new requirement to define production-oriented
connections for each protocol type.

### Option 2: Read-only production connector boundary

- Pros: Enables realistic connection management while preserving industrial
safety, source normalization, and auditability.
- Cons: Requires schema, storage, redaction, test-connection behavior, and
adapter foundations before a complete operator-console experience exists.

### Option 3: Full read/write connector framework now

- Pros: Would cover future control use cases.
- Cons: Introduces unsafe scope too early, conflicts with governed-action
rules, and would require much stronger validation, policy, approval, and
rollback design before implementation.

## Consequences

### Positive

- Establishes a safe boundary before production connector implementation.
- Gives OPC-UA, MQTT, and BACnet one consistent connection profile model.
- Keeps raw industrial source data behind ingestion and `FactoryEvent`
normalization.
- Allows Workbench diagnostics without exposing secrets or adding writeback.

### Negative

- Delays protocol-specific adapter implementation until the shared profile and
safety model are in place.
- Requires redaction and health-result tests before useful UI screens can be
considered complete.

### Risks or Tradeoffs

- Real plants may require protocol-specific security behavior that does not fit
the first shared profile cleanly.
- Local development can use Demo-Factory endpoints for testing, but those
endpoints are not production validation evidence.
- Future writeback needs, if any, must be handled by a separate ADR and
governed-action design rather than extending this read-only boundary quietly.

## Follow-Up Work

- Define the protocol connection profile schema.
- Add connection profile API and storage.
- Add read-only test connection results.
- Add read-only OPC-UA, MQTT, and BACnet adapter foundations.
- Add Workbench connection management, protocol diagnostics, and source/tag
browser views.
- Add connector configuration, redaction, diagnostics, and adapter tests.
- Document local Demo-Factory validation as a development fixture only.

## Related Links

- Issue: #210
- Parent epic: #206
- Related ADRs: [ADR 0001](./0001-platform-architecture.md)
1 change: 1 addition & 0 deletions docs/decisions/README.md
@@ -38,3 +38,4 @@ understanding.
## Existing Decisions

- [ADR 0001: Simulator-First Modular Platform Architecture](./0001-platform-architecture.md)
- [ADR 0002: Production Protocol Connector Architecture](./0002-production-protocol-connector-architecture.md)
72 changes: 72 additions & 0 deletions services/simulator/tests/test_production_connector_adr_docs.py
@@ -0,0 +1,72 @@
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[3]
ADR = REPO_ROOT / "docs" / "decisions" / "0002-production-protocol-connector-architecture.md"
ADR_INDEX = REPO_ROOT / "docs" / "decisions" / "README.md"


def _content() -> str:
    return ADR.read_text(encoding="utf-8")


def test_production_connector_adr_exists_and_is_indexed() -> None:
    assert ADR.exists()
    assert "0002-production-protocol-connector-architecture.md" in ADR_INDEX.read_text(
        encoding="utf-8"
    )


def test_production_connector_adr_defines_read_only_protocol_boundary() -> None:
    content = _content()

    required_terms = [
        "read-only-by-default connector boundary",
        "OPC-UA",
        "MQTT",
        "BACnet",
        "Connection profiles",
        "FactoryEvent",
        "ingestion layer",
        "must not expose raw credentials",
        "private keys",
        "certificate bodies",
    ]

    for term in required_terms:
        assert term in content


def test_production_connector_adr_rejects_unsafe_industrial_actions() -> None:
    content = _content()

    rejected_actions = [
        "write to PLCs",
        "write to PLCs, DCS, SCADA, OPC UA nodes, MQTT topics, or BACnet objects",
        "change machine parameters",
        "perform arbitrary tag writes",
        "release, quarantine, or otherwise disposition product",
        "create, close, or update QMS/MES records",
        "create production CAPA records",
        "claim production readiness or validated status",
    ]

    for action in rejected_actions:
        assert action in content


def test_production_connector_adr_defines_test_connection_and_audit_expectations() -> None:
    content = _content()

    required_terms = [
        "The test connection action is a read-only diagnostic.",
        "OPC-UA: connect",
        "MQTT: connect to the broker",
        "BACnet: read configured device/object properties",
        "actor or local user identity",
        "connection profile ID and protocol",
        "create, update, disable, enable, or test",
        "redacted error details",
    ]

    for term in required_terms:
        assert term in content