Merged
34 changes: 23 additions & 11 deletions README.md
@@ -3,24 +3,36 @@
Public Workflow plugin and Go module for compute protocol and provider catalog
contracts.

-Provider plugins use this module for shared compute provider-catalog data
-types, validation helpers, and canonical hashing. They also declare a plugin
-dependency on `workflow-plugin-compute-core` in `plugin.json`, giving Workflow a
-registry dependency anchor separate from runtime execution plugins.
+Provider and workload plugins use this module for shared compute protocol data
+types, provider-catalog data types, validation helpers, canonical hashing, and a
+minimal task/proof HTTP client. They also declare a plugin dependency on
+`workflow-plugin-compute-core` in `plugin.json`, giving Workflow a registry
+dependency anchor separate from runtime execution plugins.

This is distinct from the Workflow plugin runtime contract: external plugins
still expose their runtime capabilities through Workflow's gRPC/protobuf plugin
service contracts. The provider-catalog structs in `protocol/` are the typed
declaration data that provider plugins publish and `workflow-plugin-compute`
validates.

-The public catalog contract includes provider identity, org/pool scoping,
-access visibility, supported workload and network modes, runtime profiles,
-operation schemas, artifact declarations, residue policy, and upstream client
-conformance evidence. Workflow applications should treat these declarations as
-the portable provider-facing base contract; application-specific scheduling,
-task state, settlement, dashboards, and worker supervision remain outside this
-core plugin.
+The public contract includes task/proof/lease wire shapes, provider identity,
+org/pool scoping, access visibility, supported workload and network modes,
+runtime profiles, operation schemas, artifact declarations, residue policy, and
+upstream client conformance evidence. Workflow applications should treat these
+declarations as the portable provider-facing base contract.
+
+The task/proof client covers submission and read-only observation:
+
+- `SubmitTask`
+- `ListTasks`
+- `TaskSnapshot`
+- `ListProofs`
+- `FindProof`
+
+Application-specific scheduling, task mutation policy, settlement, dashboards,
+worker supervision, local-agent rollout, and control-plane storage/authz remain
+outside this core plugin. Those concerns may be extracted into a reusable
+control-plane component later, but they are not implemented by compute-core.

This plugin intentionally advertises no module, step, trigger, or IaC runtime
capabilities.
38 changes: 38 additions & 0 deletions docs/plans/2026-06-06-task-proof-sdk-alignment.md
@@ -0,0 +1,38 @@
### Alignment Report

**Status:** PASS

**Coverage:**

| Design Requirement | Plan Task(s) | Status |
|---|---|---|
| Expose portable task, lease, proof-listing, and minimal HTTP client contracts | Task 1, Task 2, Task 3, Task 4 | Covered |
| Remove immediate private-protocol dependency blocker for product-capture | Task 5 | Covered |
| Keep scheduling, task mutation, agent supervision, settlement, dashboards, and deployment policy outside compute-core | Task 5, Scope Manifest out-of-scope | Covered |
| Future reusable control plane must be its own phase/component, not compute-core expansion | Scope Manifest out-of-scope, Successor Hand-Off item 5 | Covered |
| Add `TaskStatus`, `Task`, `Lease`, wrappers, and client methods named in design | Task 1, Task 2, Task 3, Task 4 | Covered |
| Token-bearing clients reject non-HTTPS non-loopback URLs | Task 3, Task 4 | Covered |
| Strict decode and typed status errors without response-body leakage | Task 3, Task 4 | Covered |
| Prove HTTP boundary with `httptest.Server` | Task 3, Task 4 | Covered |
| Prove exact product-capture public symbol surface | Task 5 | Covered |
| No infra/staging action in compute-core PR; staging belongs to the workflow-compute consumer phase | Scope Manifest, Successor Hand-Off | Covered |
| Roll back by reverting additive SDK before release; tag rollback only if published | Successor Hand-Off, Final PR Verification | Covered |

**Scope Check:**

| Plan Task | Design Requirement | Status |
|---|---|---|
| Task 1 | Public task/lease contract tests and representative wire validation | Justified |
| Task 2 | Public `TaskStatus`, `Task`, `Lease`, and portable validation | Justified |
| Task 3 | HTTP client tests for auth, strict decode, status errors, and proof/task endpoints | Justified |
| Task 4 | Transport-thin public client implementation | Justified |
| Task 5 | Product-capture compatibility proof and boundary documentation | Justified |

**Manifest Trace:**

- `PR Count: 1` matches the single PR Grouping row.
- `Tasks: 5` matches `### Task 1` through `### Task 5`.
- Every task appears exactly once in the PR Grouping table.
- `plan-scope-check.sh --plan <absolute plan path>` returned `PASS: scope-manifest checks succeeded.`

**Drift Items:** None.
40 changes: 40 additions & 0 deletions docs/plans/2026-06-06-task-proof-sdk-design-review.md
@@ -0,0 +1,40 @@
### Adversarial Review Report

**Phase:** design
**Artifact:** `docs/plans/2026-06-06-task-proof-sdk-design.md`
**Status:** PASS

**Findings (Critical):**
- None.

**Findings (Important):**
- None.

**Findings (Minor):**
- `D1` [YAGNI violations] [Design]: `Lease` is not required by product-capture's immediate import switch. Recommendation: keep `Lease` as a wire type only, and keep lease acquisition/client methods out of this phase. _Resolution: already constrained by the design and Deferred Issues sections._
- `D2` [Missing failure modes] [Design]: client error taxonomy is named only as "typed errors"; the design does not say whether response bodies are bounded or surfaced. Recommendation: plan an explicit status error test that avoids leaking response bodies and captures status/method/path. _Resolution: plan task must include the test._
- `D3` [Existence / runtime-validity] [Multi-Component Validation]: downstream compatibility is stated, but the design does not name the exact product-capture import surface to compile. Recommendation: plan a temporary downstream compile proof covering `Task`, `TaskStatus`, `ProofReceipt`, `WorkloadSpec`, `ProviderWorkload`, `ProviderConfig`, `ProductCaptureMode`, `SignatureEnvelope`, and `DecodeStrict`/client replacement. _Resolution: plan task must enumerate these symbols._

**Bug-class scan transcript:**

| Class | Result | Note |
|---|---|---|
| Project-guidance conflicts | Clean | `README.md` keeps scheduling/task state/settlement/dashboard/supervision outside compute-core; design excludes those surfaces. |
| Assumptions under attack | Clean | The design names the stable `/v1/tasks` and `/v1/proofs` assumption and requires workflow-compute drift tests before live usage. |
| Repo-precedent conflicts | Clean | Prior compute-core plans add additive protocol types plus validation and defer workflow-compute consumption. |
| Artifact-class precedent | Clean | Sibling protocol contracts live under `protocol/` with tests in `protocol/*_test.go`; design follows that shape. |
| YAGNI violations | Minor | `Lease` is future-facing for agent interop, but constrained to a type without lease client methods. |
| Missing failure modes | Minor | Status error/body behavior needs explicit plan coverage. |
| Security / privacy at architecture level | Clean | HTTPS requirement for token-bearing non-loopback URLs and no token/body logging are explicit. |
| Infrastructure impact | Clean | Compute-core-only PR has no infra or runtime process impact; workflow-compute consumer phase owns staging. |
| Multi-component validation | Clean | Requires `httptest.Server` boundary and downstream product-capture compile proof. |
| Rollback story | Clean | Revert additive commit before release; patch tag/pin rollback if already released. |
| Simpler alternative not considered | Clean | Types-only and full control-plane SDK alternatives are considered and rejected. |
| User-intent drift | Clean | Design serves the requested public reusable platform boundary and product-capture compatibility. |
| Existence / runtime-validity | Minor | Needs exact downstream compile symbols in the plan. |

**Options the author may not have considered:**
1. Keep the client in product-capture and publish only task/proof types. This is smaller but preserves duplicated auth/strict-decode behavior across future plugins.
2. Publish client methods in a separate `client` package. That reduces `protocol` package breadth, but this repo already keeps protocol helpers such as canonical hashing and strict decode alongside contracts, so a small client is acceptable if it stays transport-thin.

**Verdict reasoning:** PASS. The design has no open Critical or Important issue. Minor findings are plan-level constraints: avoid lease methods, test status errors, and compile the exact product-capture symbol surface.
160 changes: 160 additions & 0 deletions docs/plans/2026-06-06-task-proof-sdk-design.md
@@ -0,0 +1,160 @@
# Task and Proof SDK Design

## Goal

Expose the portable task, lease, proof-listing, and minimal HTTP client
contracts in `workflow-plugin-compute-core/protocol` so public Workflow plugins
can submit compute tasks and observe proof receipts without importing the
private `workflow-compute/pkg/protocol` package.

This is Phase 1 of the public distributed-compute platform roadmap. It removes
the immediate private-protocol dependency blocker for `workflow-plugin-product-capture`
while keeping `workflow-compute` as the managed product assembly that owns
scheduling, task mutation, agent supervision, settlement, dashboards, and
deployment policy until a later reusable control-plane component is designed and
extracted.

## Global Design Guidance

Source: `README.md`

| Guidance | Design response |
|---|---|
| Compute-core is the public Go module for compute protocol and provider catalog contracts. | Add wire contracts and client helpers to `protocol`, not app behavior. |
| Workflow applications should treat declarations as portable provider-facing base contracts. | Define task/proof request and response shapes that external plugins can compile against. |
| Application-specific scheduling, task state, settlement, dashboards, and worker supervision remain outside compute-core. | Do not add scheduler queues, admin APIs, dashboard models, worker registration, service leasing methods, or settlement helpers here; future reusable control-plane extraction must be its own phase/component. |

## Approaches Considered

1. **Recommended: additive public SDK in compute-core.** Add task/lease/status
structs, public response wrappers, and a minimal HTTP client that only
covers task submission, task listing/snapshot, proof listing, and proof
lookup. This satisfies product-capture and preserves the private app
boundary.
2. **Types only, no client.** This is smaller but leaves each plugin copying
HTTP auth, strict decoding, response wrappers, and timeout behavior. That is
the current product-capture problem in a different file.
3. **Full control-plane SDK.** This would centralize more code, but it would
move scheduler/agent/admin concerns into compute-core prematurely and blur
the public product boundary.

## Design

Add a new public protocol surface:

- `TaskStatus` constants matching the existing wire values.
- `Task` with only the current portable JSON fields: protocol version,
product/org/pool/policy IDs, status, workload, placement/proof/network/access
policies, residue/resource limits, input hash, requested time, timeout,
labels, and signature.
- `Lease` for the task-agent wire contract, including capability snapshot,
executor, network/P2P/residue policies, and lease timestamps.
- `TaskStall` and `TaskList` response wrappers for `/v1/tasks`.
- `TaskResponse` and `ProofList` response wrappers for `/v1/tasks` and
`/v1/proofs`.
- `Client` with `SubmitTask`, `ListTasks`, `TaskSnapshot`, `ListProofs`, and
`FindProof`.

The client will be transport-thin. It will set bearer auth when configured,
require HTTPS for token-bearing non-loopback URLs, use `DecodeStrict`, and
return typed errors for unexpected status codes. It will not implement task
creation policy, retries, async watches, lease acquisition, worker
registration, admin endpoints, provider registration, or dashboard/settlement
views.
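
The HTTPS rule for token-bearing clients can be illustrated with a minimal sketch. It is not the module's implementation: `checkBaseURL` and `errInsecureTokenURL` are hypothetical names introduced here, and the sketch assumes that "loopback" means the literal host `localhost` or a loopback IP address.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// errInsecureTokenURL is a hypothetical sentinel error used only in this sketch.
var errInsecureTokenURL = errors.New("token-bearing client requires https for non-loopback hosts")

// checkBaseURL is a hypothetical helper. It rejects base URLs that would send
// a bearer token over a non-HTTPS scheme to a non-loopback host.
func checkBaseURL(rawURL, token string) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("parse base URL: %w", err)
	}
	// Without a token there is nothing to protect; HTTPS is always acceptable.
	if token == "" || u.Scheme == "https" {
		return nil
	}
	host := u.Hostname() // strips the port and IPv6 brackets
	if host == "localhost" {
		return nil
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return nil
	}
	return errInsecureTokenURL
}

func main() {
	for _, raw := range []string{
		"https://compute.example.com",
		"http://127.0.0.1:8080",
		"http://compute.example.com",
	} {
		fmt.Printf("%s -> %v\n", raw, checkBaseURL(raw, "secret"))
	}
}
```

Running the check once at client construction, rather than per request, would keep the request path free of policy decisions, which fits the transport-thin goal stated above.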

`workflow-compute` will consume these types in a follow-up PR by aliasing its
public protocol package to compute-core where the wire shape is identical.
`workflow-plugin-product-capture` will then switch imports to compute-core in a
later downstream phase and use the public client.

Long-term, GoCodeAlone may extract a reusable control plane that other managed
products can assemble. This SDK is intentionally narrower: it supplies the
shared wire contract that such a control plane would also consume, without
deciding that control plane's storage model, scheduling policy, deployment
shape, authz chain, or operational UI.

## Security Review

- Auth token flow stays caller-owned; compute-core only places a configured
token into `Authorization: Bearer`.
- Token-bearing clients reject non-HTTPS URLs unless the host is loopback.
- Strict JSON decode rejects unrecognized response fields so plugins detect
contract drift early.
- The client does not log token values, request bodies, task payloads, or proof
payloads.
- Server-side authorization remains in `workflow-compute`; client-side config
is not treated as authority.

## Infrastructure Impact

This phase changes only the public Go module. It creates no cloud resources,
secrets, databases, queues, migrations, deployment environments, or runtime
processes. Release impact is limited to a compute-core tag after the PR merges.

No staging deployment is required for the compute-core PR by itself. The
follow-up `workflow-compute` consumer PR must refresh staging and run a real
product-capture-compatible submission/proof smoke because that PR changes the
managed app assembly.

## Multi-Component Validation

The compute-core PR must prove:

- `protocol.Task` and `protocol.Lease` validate representative real task/agent
wire shapes.
- The HTTP client crosses a real `httptest.Server` boundary for submit, list,
snapshot, proof list, auth header, strict decode, and status errors.
- A downstream compatibility check can compile the product-capture client
shape against compute-core types without importing `workflow-compute`.
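
A rough sketch of the `httptest.Server` boundary follows, covering the auth header and a status error that carries method, path, and status but not the response body. `statusError`, `getTasks`, and `probe` are hypothetical names for this example, not the public client API. A real test would use `testing.T`; a `main` function is used here only to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// statusError is a hypothetical typed error. It records method, path, and
// status, and deliberately omits the response body.
type statusError struct {
	Method string
	Path   string
	Status int
}

func (e *statusError) Error() string {
	return fmt.Sprintf("%s %s: unexpected status %d", e.Method, e.Path, e.Status)
}

// getTasks is a hypothetical stand-in for a read-only client call.
func getTasks(hc *http.Client, baseURL, token string) error {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/v1/tasks", nil)
	if err != nil {
		return err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := hc.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// Drain a bounded amount of the body, but never surface it.
		_, _ = io.Copy(io.Discard, io.LimitReader(resp.Body, 1<<16))
		return &statusError{Method: req.Method, Path: req.URL.Path, Status: resp.StatusCode}
	}
	return nil
}

// probe starts a test server that requires serverToken and returns whatever
// the client call reports when it presents clientToken.
func probe(serverToken, clientToken string) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+serverToken {
			http.Error(w, "internal detail that must not leak", http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_, _ = io.WriteString(w, `{"tasks":[]}`)
	}))
	defer srv.Close()
	return getTasks(srv.Client(), srv.URL, clientToken)
}

func main() {
	fmt.Println("matching token:", probe("secret", "secret"))
	fmt.Println("wrong token:", probe("secret", "wrong"))
}
```

Because the test server listens on a loopback address over plain HTTP, it also exercises the loopback exception to the HTTPS requirement.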

The follow-up app PR must prove:

- `workflow-compute` aliases or delegates matching types to compute-core
without changing API JSON.
- product-capture can compile against the public contract.
- staging accepts a product-capture-style workload from registered local agents
and returns a proof or explicit typed failure.

## Assumptions

- The existing `/v1/tasks` and `/v1/proofs` JSON response shapes are intended
public surfaces for plugins.
- Product-capture needs task submission and proof lookup, not agent lease
acquisition.
- `Lease` belongs in compute-core as a public wire type, but lease acquisition
methods are a later agent SDK concern.
- Existing host validation remains stricter than compute-core portable
validation where policy decisions require server state.

## Self-Challenge

1. The laziest solution is to keep product-capture copying its private client
and only switch type imports. That reduces code movement now but repeats
auth/strict-decode drift across every future workload plugin.
2. The fragile assumption is that `/v1/tasks` and `/v1/proofs` are stable enough
for public clients. The plan must add drift tests in `workflow-compute` before
downstream live usage.
3. The main YAGNI risk is adding leasing/admin methods. This design excludes
them and records them as a deferred agent SDK concern.

## Rollback

Roll back the compute-core PR by reverting the additive public SDK commit and
tagging no release. Because no downstream app code is changed in this PR, no
server deployment rollback is needed.

If a compute-core tag has already been published, publish a patch tag that
removes the SDK or marks it unstable, then keep `workflow-compute` pinned to the
previous known-good compute-core version until the replacement tag is verified.

## Deferred Issues

- Switch `workflow-compute` to compute-core aliases/delegates in the next PR.
- Switch `workflow-plugin-product-capture` imports and docs after the
`workflow-compute` consumer PR verifies API compatibility.
- Add lease acquisition, worker registration, and resilient agent upgrade APIs
to the future public agent plugin phase, not this SDK.
- Design reusable control-plane extraction as its own future platform phase,
not as an expansion of compute-core's protocol/client package.
- Keep live staging refresh/local-agent registration evidence in the
`workflow-compute` consumer phase where the server actually changes.