geneazad/plugin

IBM Data Product Hub Git-Native Plugin

This repo provides executable helpers for IBM Data Product Hub workflows rather than documentation alone. It generates governed metric artifacts, runs deterministic review checks, prepares GitHub pull requests, and publishes approved contracts to IBM Data Product Hub through the IBM Data Intelligence MCP server.

The CLI also supports a portable workflow for hosted or restricted environments:

  • automatic .env.local loading, with --env-file to name the file explicitly
  • offline PR preview mode
  • offline MCP demo mode
  • dry-run publish mapping without live IBM access

Workflow Surface

The repo exposes these commands:

  • python3 scripts/dph_workflow.py create-data-product --request "<business request>"
  • python3 scripts/dph_workflow.py review-data-product contracts/<name>.yaml
  • python3 scripts/dph_workflow.py publish-data-product contracts/<name>.yaml --environment dev
  • python3 scripts/dph_workflow.py mcp-diagnose --timeout 10
  • python3 scripts/dph_workflow.py generate-pr-body contracts/<name>.yaml
  • python3 scripts/dph_workflow.py create-pr ...
  • python3 scripts/dph_workflow.py update-pr ...

By default the CLI loads environment variables from .env.local if present. To name the file explicitly, pass --env-file:

python3 scripts/dph_workflow.py --env-file .env.local review-data-product contracts/invoice_collection_risk_14d.yaml
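
How the CLI parses the env file is not documented in this README. The following is a minimal sketch of dotenv-style loading, assuming KEY=VALUE lines, # comments, and that values already set in the shell take precedence. parse_env_text and load_env_file are hypothetical helpers, not the plugin's actual functions.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, # comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding quotes so DI_ENV_MODE="SaaS" yields SaaS.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env.local", environ=None) -> dict[str, str]:
    """Load variables from path if it exists (assumption: shell values win)."""
    environ = os.environ if environ is None else environ
    file = Path(path)
    if not file.is_file():
        return {}  # "if present": a missing file is not an error
    loaded = parse_env_text(file.read_text(encoding="utf-8"))
    for key, value in loaded.items():
        environ.setdefault(key, value)
    return loaded
```

Passing a plain dict as environ keeps the helper testable without touching the real process environment.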

If you prefer an installed entrypoint:

pip install -e .
dph-plugin create-data-product --request "Create a new DPH data product called invoice_collection_risk_14d that estimates unpaid invoices due in the next 14 days. Owner is Finance Ops. Internal-only sharing."

Repo Layout

  • contracts/: canonical YAML contracts
  • metrics/: SQL definitions referenced by contracts
  • templates/metric_contract.yaml: authoring template
  • schemas/contract.schema.json: validation schema
  • src/ibm_dph_plugin/: workflow implementation
  • scripts/dph_workflow.py: CLI wrapper for local and CI use
  • .github/workflows/: PR review and controlled publish workflows

Contract Shape

Each metric contract is repo-authored YAML with this minimum structure:

name: invoice_collection_risk_14d
description: Total value of unpaid invoices due within the next 14 days.
owner: finance_ops
domain: finance
sharing_policy: internal_only
delivery_method: api
metric_type: aggregate
semantic_query: metrics.invoice_collection_risk_14d
endpoint: /metrics/invoice_collection_risk_14d
lifecycle:
  status: draft
  human_approver_role: finance_data_steward
governance:
  contains_sensitive_data: false
  sensitive_fields: []
  enforcement: standard
publish:
  enabled: false
  environments: []
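
The authoritative validation lives in schemas/contract.schema.json, which is not reproduced here. As a rough illustration of the minimum structure above, the sketch below checks a parsed contract (a plain dict) for the listed keys. The missing_fields helper and the idea that every key shown is mandatory are assumptions drawn from the example, not from the schema.

```python
# Keys taken from the example contract above; treating all of them as
# required is an assumption, since the real schema is not shown.
REQUIRED_TOP_LEVEL = (
    "name", "description", "owner", "domain", "sharing_policy",
    "delivery_method", "metric_type", "semantic_query", "endpoint",
    "lifecycle", "governance", "publish",
)
REQUIRED_NESTED = {
    "lifecycle": ("status", "human_approver_role"),
    "governance": ("contains_sensitive_data", "sensitive_fields", "enforcement"),
    "publish": ("enabled", "environments"),
}


def missing_fields(contract: dict) -> list[str]:
    """Return dotted paths of keys absent from a parsed contract."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in contract]
    for section, keys in REQUIRED_NESTED.items():
        if section not in contract:
            continue  # the section itself is already reported above
        block = contract[section] if isinstance(contract[section], dict) else {}
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


example = {
    "name": "invoice_collection_risk_14d",
    "description": "Total value of unpaid invoices due within the next 14 days.",
    "owner": "finance_ops",
    "domain": "finance",
    "sharing_policy": "internal_only",
    "delivery_method": "api",
    "metric_type": "aggregate",
    "semantic_query": "metrics.invoice_collection_risk_14d",
    "endpoint": "/metrics/invoice_collection_risk_14d",
    "lifecycle": {"status": "draft", "human_approver_role": "finance_data_steward"},
    "governance": {"contains_sensitive_data": False, "sensitive_fields": [],
                   "enforcement": "standard"},
    "publish": {"enabled": False, "environments": []},
}
print(missing_fields(example))
```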

Command Details

Create a draft

python3 scripts/dph_workflow.py create-data-product \
  --request "Create a new DPH data product called invoice_collection_risk_14d that estimates unpaid invoices due in the next 14 days. Owner is Finance Ops. Internal-only sharing."

The command:

  • infers the contract name, description, owner, domain, and policy,
  • checks for near-duplicate existing contracts,
  • writes contracts/<name>.yaml and metrics/<name>.sql,
  • leaves the product in draft status.

Use --force to override duplicate detection.
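
The README does not say how near-duplicates are detected. One plausible approach, sketched here with the standard library's difflib, compares the normalized candidate name against existing contract names. The near_duplicates helper and the 0.85 threshold are illustrative assumptions, not the plugin's actual logic.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and unify separators so feat-x and feat_x compare equal."""
    return name.lower().replace("-", "_").strip("_")


def near_duplicates(candidate: str, existing: list[str],
                    threshold: float = 0.85) -> list[tuple[str, float]]:
    """Return (name, similarity) pairs at or above threshold, best match first."""
    target = normalize(candidate)
    hits = []
    for name in existing:
        score = SequenceMatcher(None, target, normalize(name)).ratio()
        if score >= threshold:
            hits.append((name, round(score, 2)))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical existing contract names.
existing_names = ["invoice_collection_risk_30d", "customer_churn_rate"]
print(near_duplicates("invoice_collection_risk_14d", existing_names))
```

A check like this would flag invoice_collection_risk_30d as a likely duplicate, which is the kind of case --force is meant to override.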

Review deterministically

python3 scripts/dph_workflow.py review-data-product contracts/invoice_collection_risk_14d.yaml

The review checks:

  • schema compliance,
  • missing metadata,
  • SQL presence and read-only shape,
  • potential sensitive identifier exposure,
  • endpoint collisions,
  • publish readiness.
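
As an illustration of what a "read-only shape" check could look like, the sketch below accepts a single SELECT or WITH statement and rejects anything containing write keywords. It is a heuristic under stated assumptions (the keyword list and the is_read_only helper are hypothetical), and it does not parse SQL, so a semicolon or keyword inside a string literal would be rejected conservatively.

```python
import re

# Assumed list of statements that modify data or schema.
WRITE_KEYWORDS = ("insert", "update", "delete", "merge", "drop", "alter",
                  "create", "truncate", "grant", "revoke")


def strip_sql_comments(sql: str) -> str:
    """Remove /* block */ and -- line comments."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    return re.sub(r"--[^\n]*", " ", sql)


def is_read_only(sql: str) -> bool:
    """True if sql looks like one SELECT/WITH statement with no write keywords."""
    body = strip_sql_comments(sql).strip().rstrip(";").strip()
    if not body or ";" in body:
        return False  # empty, or more than one statement
    if not re.match(r"(?i)(select|with)\b", body):
        return False
    pattern = r"(?i)\b(" + "|".join(WRITE_KEYWORDS) + r")\b"
    return re.search(pattern, body) is None
```

Word boundaries keep column names such as update_date or created_at from tripping the keyword check.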

Generate PR metadata and open a PR

Generate a PR body from a contract and an optional stored review summary:

python3 scripts/dph_workflow.py generate-pr-body contracts/invoice_collection_risk_14d.yaml

Create a branch, commit artifacts, push, and open a PR:

python3 scripts/dph_workflow.py create-pr \
  --branch feat/invoice-collection-risk-14d \
  --commit-message "feat: add invoice_collection_risk_14d data product" \
  --title "feat: add invoice_collection_risk_14d data product" \
  --body-file /tmp/pr_body.md \
  --contract contracts/invoice_collection_risk_14d.yaml \
  --paths contracts/invoice_collection_risk_14d.yaml metrics/invoice_collection_risk_14d.sql

The implementation prefers gh for PR operations and falls back to the GitHub REST API when GITHUB_TOKEN is available. Branch, commit, and push steps still require a Git checkout.
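
The gh-first, REST-fallback order described above can be sketched as follows. choose_pr_backend, detect_pr_backend, and rest_pull_request are hypothetical names, and the default base branch of main is an assumption. The endpoint and body fields follow GitHub's documented "create a pull request" REST call.

```python
import os
import shutil


def choose_pr_backend(gh_path, token) -> str:
    """Prefer the gh CLI; fall back to REST only when a token is available."""
    if gh_path:
        return "gh"
    if token:
        return "rest"
    raise RuntimeError("No PR backend: install gh or set GITHUB_TOKEN")


def detect_pr_backend() -> str:
    """Inspect the real environment (PATH lookup and GITHUB_TOKEN)."""
    return choose_pr_backend(shutil.which("gh"), os.environ.get("GITHUB_TOKEN"))


def rest_pull_request(owner: str, repo: str, title: str, head: str,
                      body: str, base: str = "main"):
    """Build the URL and JSON body for POST /repos/{owner}/{repo}/pulls."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls"
    payload = {"title": title, "head": head, "base": base, "body": body}
    return url, payload
```

Separating the decision from the environment lookup makes the fallback order easy to test without gh installed.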

Portable demo mode:

python3 scripts/dph_workflow.py create-pr \
  --offline \
  --branch demo/invoice-collection-risk \
  --commit-message "feat: demo" \
  --title "feat: demo invoice_collection_risk_14d" \
  --body-file /tmp/pr_body.md \
  --contract contracts/invoice_collection_risk_14d.yaml

This returns a PR preview payload and URLs without requiring GitHub access or a local Git repo.

Publish through MCP

Publishing requires:

  • lifecycle.status: approved,
  • a clean review result,
  • an explicit target environment,
  • IBM MCP credentials in the shell environment.
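
The four requirements above amount to a gate that must pass before any MCP call. A minimal sketch of such a gate follows. publish_blockers is a hypothetical helper, and treating DI_SERVICE_URL, DI_APIKEY, and DI_ENV_MODE as the full set of required credentials is an assumption based on the variables exported in this README's publish example.

```python
# Assumed credential variables, taken from the publish example in this README.
REQUIRED_ENV = ("DI_SERVICE_URL", "DI_APIKEY", "DI_ENV_MODE")


def publish_blockers(contract: dict, review_findings: list,
                     environment, env: dict) -> list[str]:
    """Return human-readable reasons a publish should not proceed."""
    blockers = []
    status = contract.get("lifecycle", {}).get("status")
    if status != "approved":
        blockers.append(f"lifecycle.status is {status!r}, expected 'approved'")
    if review_findings:
        blockers.append(f"review reported {len(review_findings)} finding(s)")
    if not environment:
        blockers.append("no target environment given")
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        blockers.append("missing credentials: " + ", ".join(missing))
    return blockers
```

An empty list means the contract is ready. Returning every blocker at once, rather than failing on the first, gives the author a complete picture in one pass.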

Dry run:

python3 scripts/dph_workflow.py publish-data-product \
  contracts/invoice_collection_risk_14d.yaml \
  --environment dev \
  --dry-run

Real publish:

export DI_SERVICE_URL="https://api.dataplatform.cloud.ibm.com"
export DI_APIKEY="..."
export DI_ENV_MODE="SaaS"
python3 scripts/dph_workflow.py publish-data-product \
  contracts/invoice_collection_risk_14d.yaml \
  --environment dev

The publish implementation:

  • maps repo YAML to a publish payload,
  • initializes the configured wxdi MCP server from .mcp.json,
  • discovers a publish-like tool via tools/list,
  • maps payload fields into the tool input schema,
  • executes tools/call,
  • returns the request and result payloads.
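
The discovery and mapping steps can be sketched against the shape of an MCP tools/list result, where each tool carries a name and a JSON Schema inputSchema. The hint list, the helper names, and the sample tool below are all hypothetical. The tools the wxdi server actually exposes are not listed in this README.

```python
# Assumed substrings that mark a tool as "publish-like".
PUBLISH_HINTS = ("publish", "create_data_product", "register")


def find_publish_tool(tools: list[dict]):
    """Return the first tool whose name contains a publish hint, else None."""
    for tool in tools:
        name = tool.get("name", "").lower()
        if any(hint in name for hint in PUBLISH_HINTS):
            return tool
    return None


def map_payload(payload: dict, tool: dict) -> dict:
    """Keep only payload fields the tool's input schema declares."""
    schema = tool.get("inputSchema", {})
    properties = schema.get("properties", {})
    arguments = {k: v for k, v in payload.items() if k in properties}
    missing = [k for k in schema.get("required", []) if k not in arguments]
    if missing:
        raise ValueError(f"tool {tool['name']} requires unmapped fields: {missing}")
    return arguments


def build_tools_call(request_id: int, tool: dict, arguments: dict) -> dict:
    """Build the JSON-RPC tools/call request."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": tool["name"], "arguments": arguments}}


# Hypothetical tools/list result for illustration only.
demo_tools = [
    {"name": "search_assets", "inputSchema": {"type": "object", "properties": {}}},
    {"name": "publish_data_product", "inputSchema": {
        "type": "object",
        "properties": {"name": {}, "owner": {}, "endpoint": {}},
        "required": ["name"]}},
]
```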

--dry-run renders the mapped publish payload and current readiness without calling MCP, so it can be used before approval.

Offline demo mode:

python3 scripts/dph_workflow.py publish-data-product \
  contracts/invoice_collection_risk_14d.yaml \
  --environment dev \
  --offline

This simulates a publish tool call and returns a synthetic "registered" response.

Diagnose MCP connectivity

Use this to isolate MCP handshake and tool discovery issues before attempting a publish:

PYTHONPATH=src python3 scripts/dph_workflow.py mcp-diagnose --timeout 10

The command:

  • resolves the wxdi server from .mcp.json,
  • runs initialize,
  • sends notifications/initialized,
  • runs tools/list,
  • prints the raw initialize and tools responses,
  • fails fast if either request exceeds the timeout.
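
For reference, the three messages in that sequence look roughly like this in MCP's JSON-RPC framing. This is a sketch, not the plugin's code: the client name, the version string, the protocol version value, and the assumption that the wxdi server speaks newline-delimited JSON over stdio are all illustrative.

```python
import json


def handshake_messages(client_name: str = "dph-plugin",
                       protocol_version: str = "2024-11-05") -> list[dict]:
    """Build initialize, notifications/initialized, and tools/list in order."""
    initialize = {
        "jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.0"},
        },
    }
    # Notifications carry no id, so the server sends no response to this one.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    tools_list = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [initialize, initialized, tools_list]


def frame(message: dict) -> str:
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


for message in handshake_messages():
    print(frame(message), end="")
```

Only the two requests with an id expect a reply, which is why the timeout applies to initialize and tools/list but not to the notification.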

Offline diagnose mode:

python3 scripts/dph_workflow.py mcp-diagnose --offline

This returns a synthetic initialize response and a demo tools/list payload for portable demos.

GitHub Actions

The workflows in .github/workflows/ cover PR review and controlled publish.

Local Verification

PYTHONPATH=src python3 -m unittest discover -s tests
PYTHONPATH=src python3 scripts/dph_workflow.py review-data-product contracts/invoice_collection_risk_14d.yaml
PYTHONPATH=src python3 scripts/dph_workflow.py publish-data-product contracts/invoice_collection_risk_14d.yaml --environment dev --dry-run

Security Note

Tracked credentials were removed from .mcp.json. Populate IBM secrets through local environment variables or GitHub Actions secrets instead.
