This repo now provides executable helpers for IBM Data Product Hub workflows instead of docs alone. It generates governed metric artifacts, performs deterministic review checks, prepares GitHub pull requests, and publishes approved contracts to IBM Data Product Hub through the IBM Data Intelligence MCP server.
The CLI now supports a more portable workflow for hosted or restricted environments:
- automatic .env.local loading via --env-file
- offline PR preview mode
- offline MCP demo mode
- dry-run publish mapping without live IBM access
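As a rough illustration of the .env.local loading listed above, a loader might look like the following sketch. The names `parse_env_text` and `load_env_file` are hypothetical and are not taken from this repo. The behavior shown (skip comments, accept an optional `export ` prefix, keep variables already set in the shell) is an assumption about a reasonable default, not a description of the actual CLI.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines; hypothetical helper, not the repo's parser."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env.local", override: bool = False) -> dict[str, str]:
    """Export parsed values to os.environ; a missing file is not an error."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    applied: dict[str, str] = {}
    for key, value in parse_env_text(env_path.read_text(encoding="utf-8")).items():
        # Assumption: values already set in the shell win unless override=True.
        if override or key not in os.environ:
            os.environ[key] = value
            applied[key] = value
    return applied
```

Letting the shell environment take precedence keeps CI secrets from being overwritten by a stale local file.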
The repo exposes these commands:
python3 scripts/dph_workflow.py create-data-product --request "<business request>"
python3 scripts/dph_workflow.py review-data-product contracts/<name>.yaml
python3 scripts/dph_workflow.py publish-data-product contracts/<name>.yaml --environment dev
python3 scripts/dph_workflow.py mcp-diagnose --timeout 10
python3 scripts/dph_workflow.py generate-pr-body contracts/<name>.yaml
python3 scripts/dph_workflow.py create-pr ...
python3 scripts/dph_workflow.py update-pr ...
By default the CLI loads environment variables from .env.local if present:
python3 scripts/dph_workflow.py --env-file .env.local review-data-product contracts/invoice_collection_risk_14d.yaml
If you prefer an installed entrypoint:
pip install -e .
dph-plugin create-data-product --request "Create a new DPH data product called invoice_collection_risk_14d that estimates unpaid invoices due in the next 14 days. Owner is Finance Ops. Internal-only sharing."
- contracts/: canonical YAML contracts
- metrics/: SQL definitions referenced by contracts
- templates/metric_contract.yaml: authoring template
- schemas/contract.schema.json: validation schema
- src/ibm_dph_plugin/: workflow implementation
- scripts/dph_workflow.py: CLI wrapper for local and CI use
- .github/workflows/: PR review and controlled publish workflows
Each metric contract is repo-authored YAML with this minimum structure:
name: invoice_collection_risk_14d
description: Total value of unpaid invoices due within the next 14 days.
owner: finance_ops
domain: finance
sharing_policy: internal_only
delivery_method: api
metric_type: aggregate
semantic_query: metrics.invoice_collection_risk_14d
endpoint: /metrics/invoice_collection_risk_14d
lifecycle:
  status: draft
  human_approver_role: finance_data_steward
governance:
  contains_sensitive_data: false
  sensitive_fields: []
  enforcement: standard
publish:
  enabled: false
  environments: []

python3 scripts/dph_workflow.py create-data-product \
--request "Create a new DPH data product called invoice_collection_risk_14d that estimates unpaid invoices due in the next 14 days. Owner is Finance Ops. Internal-only sharing."
The command:
- infers the contract name, description, owner, domain, and policy,
- checks for near-duplicate existing contracts,
- writes contracts/<name>.yaml and metrics/<name>.sql,
- leaves the product in draft status.
Use --force to override duplicate detection.
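One plausible way to implement the near-duplicate check is a string-similarity comparison against existing contract names. The sketch below uses `difflib.SequenceMatcher` from the standard library. The function name `find_near_duplicates` and the 0.85 threshold are assumptions made for illustration; the repo may use a different heuristic.

```python
from difflib import SequenceMatcher


def find_near_duplicates(new_name, existing_names, threshold=0.85):
    """Return (name, similarity) pairs at or above the threshold, best first.

    Hypothetical helper: the threshold and normalization are illustrative.
    """
    normalized = new_name.lower().replace("-", "_")
    hits = []
    for name in existing_names:
        candidate = name.lower().replace("-", "_")
        ratio = SequenceMatcher(None, normalized, candidate).ratio()
        if ratio >= threshold:
            hits.append((name, round(ratio, 2)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# A 7-day variant of the same metric is flagged; an unrelated metric is not.
matches = find_near_duplicates(
    "invoice_collection_risk_14d",
    ["invoice_collection_risk_7d", "churn_rate"],
)
```

In a flow like this, a non-empty result would stop the command unless --force is given.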
python3 scripts/dph_workflow.py review-data-product contracts/invoice_collection_risk_14d.yaml
The review checks:
- schema compliance,
- missing metadata,
- SQL presence and read-only shape,
- potential sensitive identifier exposure,
- endpoint collisions,
- publish readiness.
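To make a few of the checks above concrete, here is a minimal sketch covering missing metadata, read-only SQL shape, and endpoint collisions. The function `review_contract`, the `REQUIRED_FIELDS` tuple, and the keyword list are assumptions for illustration. The real review may be stricter, and a keyword scan like this one can produce false positives on string literals.

```python
import re

# Assumed subset of required metadata, taken from the sample contract above.
REQUIRED_FIELDS = (
    "name", "description", "owner", "domain",
    "sharing_policy", "delivery_method", "endpoint",
)
WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|merge|grant)\b",
    re.IGNORECASE,
)


def review_contract(contract: dict, sql: str, existing_endpoints: set[str]) -> list[str]:
    """Return human-readable findings; an empty list means a clean review."""
    findings = []
    for field in REQUIRED_FIELDS:
        if not contract.get(field):
            findings.append(f"missing metadata: {field}")
    stripped = sql.strip()
    if not stripped:
        findings.append("sql: definition is empty")
    else:
        # Read-only shape: the statement starts with SELECT or a CTE.
        if not re.match(r"(?is)^(select|with)\b", stripped):
            findings.append("sql: statement does not start with SELECT or WITH")
        if WRITE_KEYWORDS.search(stripped):
            findings.append("sql: contains a write or DDL keyword")
    if contract.get("endpoint") in existing_endpoints:
        findings.append(f"endpoint collision: {contract['endpoint']}")
    return findings
```

Because the checks are pure functions of the contract, the SQL text, and the set of known endpoints, the same input always yields the same findings, which is what makes the review deterministic.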
Generate a PR body from a contract and an optional stored review summary:
python3 scripts/dph_workflow.py generate-pr-body contracts/invoice_collection_risk_14d.yaml
Create a branch, commit artifacts, push, and open a PR:
python3 scripts/dph_workflow.py create-pr \
--branch feat/invoice-collection-risk-14d \
--commit-message "feat: add invoice_collection_risk_14d data product" \
--title "feat: add invoice_collection_risk_14d data product" \
--body-file /tmp/pr_body.md \
--contract contracts/invoice_collection_risk_14d.yaml \
--paths contracts/invoice_collection_risk_14d.yaml metrics/invoice_collection_risk_14d.sql
The implementation prefers gh for PR operations and falls back to the GitHub REST API when GITHUB_TOKEN is available. Branch, commit, and push steps still require a Git checkout.
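The gh-first, REST-fallback behavior described above could be decided with a check like the one below. `choose_pr_backend` is a hypothetical name, and the injectable `which` parameter exists only to make the sketch testable. How the repo actually detects gh is not documented here.

```python
import os
import shutil


def choose_pr_backend(env=None, which=shutil.which):
    """Pick 'gh' when the CLI is on PATH, else 'rest' when a token is set.

    Hypothetical helper illustrating the documented preference order.
    """
    env = os.environ if env is None else env
    if which("gh"):
        return "gh"
    if env.get("GITHUB_TOKEN"):
        return "rest"
    raise RuntimeError("no PR backend: install gh or set GITHUB_TOKEN")
```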
Portable demo mode:
python3 scripts/dph_workflow.py create-pr \
--offline \
--branch demo/invoice-collection-risk \
--commit-message "feat: demo" \
--title "feat: demo invoice_collection_risk_14d" \
--body-file /tmp/pr_body.md \
--contract contracts/invoice_collection_risk_14d.yaml
This returns a PR preview payload and URLs without requiring GitHub access or a local Git repo.
Publishing requires:
- lifecycle.status: approved,
- a clean review result,
- an explicit target environment,
- IBM MCP credentials in the shell environment.
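The four preconditions above can be expressed as a single gate function. This is a sketch under assumptions: `publish_blockers` is a hypothetical name, and treating all three `DI_*` variables from the publish example in this README as mandatory is a guess rather than documented behavior.

```python
# Assumption: these are the variables exported in the real-publish example.
REQUIRED_ENV_VARS = ("DI_SERVICE_URL", "DI_APIKEY", "DI_ENV_MODE")


def publish_blockers(contract: dict, review_findings: list, environment: str, env: dict) -> list[str]:
    """Return the reasons a publish must not proceed; empty means ready."""
    blockers = []
    status = contract.get("lifecycle", {}).get("status")
    if status != "approved":
        blockers.append(f"lifecycle.status is {status!r}, expected 'approved'")
    if review_findings:
        blockers.append(f"review has {len(review_findings)} open finding(s)")
    if not environment:
        blockers.append("no target environment given")
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    if missing:
        blockers.append("missing credentials: " + ", ".join(missing))
    return blockers
```

Collecting every blocker instead of failing on the first one lets a report show everything that still needs fixing in one pass.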
Dry run:
python3 scripts/dph_workflow.py publish-data-product \
contracts/invoice_collection_risk_14d.yaml \
--environment dev \
--dry-run
Real publish:
export DI_SERVICE_URL="https://api.dataplatform.cloud.ibm.com"
export DI_APIKEY="..."
export DI_ENV_MODE="SaaS"
python3 scripts/dph_workflow.py publish-data-product \
contracts/invoice_collection_risk_14d.yaml \
--environment dev
The publish implementation:
- maps repo YAML to a publish payload,
- initializes the configured wxdi MCP server from .mcp.json,
- discovers a publish-like tool via tools/list,
- maps payload fields into the tool input schema,
- executes tools/call,
- returns the request and result payloads.
--dry-run renders the mapped publish payload and current readiness without calling MCP, so it can be used before approval.
Offline demo mode:
python3 scripts/dph_workflow.py publish-data-product \
contracts/invoice_collection_risk_14d.yaml \
--environment dev \
--offline
This simulates a publish tool call and returns a synthetic registered response.
Use this to isolate MCP handshake and tool discovery issues before attempting a publish:
PYTHONPATH=src python3 scripts/dph_workflow.py mcp-diagnose --timeout 10
The command:
- resolves the wxdi server from .mcp.json,
- runs initialize,
- sends notifications/initialized,
- runs tools/list,
- prints the raw initialize and tools responses,
- fails fast if either request exceeds the timeout.
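For orientation, the three messages in the sequence above look roughly like this on the wire. The sketch builds the JSON-RPC 2.0 payloads only and does not start or talk to a server. The function name `build_handshake`, the client name, and the chosen protocol version string are illustrative assumptions.

```python
import json


def build_handshake(protocol_version: str = "2024-11-05", client_name: str = "dph-plugin") -> list[dict]:
    """Return the initialize request, initialized notification, and tools/list request."""
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": protocol_version,
                "capabilities": {},
                "clientInfo": {"name": client_name, "version": "0.0.0"},
            },
        },
        # Notifications carry no "id" and receive no response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}},
    ]


# Print each message as one line of JSON, as a stdio transport would send it.
for message in build_handshake():
    print(json.dumps(message))
```

A real client waits for the initialize response before sending the notification, which is why a timeout applies separately to the initialize and tools/list requests.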
Offline diagnose mode:
python3 scripts/dph_workflow.py mcp-diagnose --offline
This returns a synthetic initialize response and a demo tools/list payload for portable demos.
- review-data-product.yml: runs deterministic review checks for changed contracts on pull requests.
- publish-data-product.yml: manually publishes an approved contract using protected environment secrets.
PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 scripts/dph_workflow.py review-data-product contracts/invoice_collection_risk_14d.yaml
PYTHONPATH=src python3 scripts/dph_workflow.py publish-data-product contracts/invoice_collection_risk_14d.yaml --environment dev --dry-run
Tracked credentials were removed from .mcp.json. Populate IBM secrets through local environment variables or GitHub Actions secrets instead.