feat: Add testing harness for dissertation with evidence collection a…#38

Open
stephenjtyrrell wants to merge 1 commit into main from testing-execution

@stephenjtyrrell

Owner

…nd analysis

  • Introduced a comprehensive testing harness in the testing directory to facilitate evidence collection for governance and scalability tests.
  • Created scripts for capturing pull request audits, recording policy events, and summarizing run results.
  • Implemented a common library (common.sh) for shared functions and configurations.
  • Added CSV files for storing evidence data, including approval events, policy events, and run summaries.
  • Developed Terraform and Kubernetes fixtures to simulate negative control scenarios for governance testing.
  • Updated README and configuration files to guide users through the setup and execution of tests.
  • Established a structured write-up template for documenting findings from the testing process.

Copilot AI review requested due to automatic review settings March 23, 2026 21:48

Copilot AI left a comment

Contributor

Pull request overview

Adds a dissertation-focused testing harness under testing/ plus supporting scripts to collect governance/scalability evidence (GitHub Actions, Terraform/OPA, Kubernetes/Conftest, ArgoCD), store it as structured CSVs, and generate a summarized markdown report for analysis/write-up.

Changes:

  • Introduces reusable capture/record scripts (scripts/testing/*) and Makefile targets to initialize, collect, and analyze evidence.
  • Adds negative-control fixtures for Terraform and Kubernetes policy enforcement tests.
  • Adds evidence CSV schemas (and initial datasets), documentation, and a dissertation write-up template.

Reviewed changes

Copilot reviewed 29 out of 31 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `testing/fixtures/terraform/unrestricted-ingress-plan.json` | Terraform negative-control fixture for unrestricted ingress detection |
| `testing/fixtures/terraform/unencrypted-s3-plan.json` | Terraform negative-control fixture for missing S3 encryption detection |
| `testing/fixtures/terraform/public-storage-plan.json` | Terraform negative-control fixture for public storage detection |
| `testing/fixtures/terraform/missing-owner-plan.json` | Terraform negative-control fixture for missing required tags |
| `testing/fixtures/terraform/missing-environment-plan.json` | Terraform negative-control fixture for missing required tags |
| `testing/fixtures/terraform/missing-cost-center-plan.json` | Terraform negative-control fixture for missing required tags |
| `testing/fixtures/kubernetes/privileged-container.yaml` | Kubernetes negative-control fixture for securityContext violations |
| `testing/fixtures/kubernetes/missing-resources.yaml` | Kubernetes negative-control fixture for missing resource requests/limits |
| `testing/fixtures/kubernetes/missing-labels.yaml` | Kubernetes negative-control fixture for required label violations |
| `testing/evidence/run_summary.csv` | Run-level evidence CSV (schema + initial entries) |
| `testing/evidence/policy_events.csv` | Policy event evidence CSV (schema + initial entries) |
| `testing/evidence/ci_job_events.csv` | CI job-level evidence CSV header |
| `testing/evidence/artifact_events.csv` | Artifact evidence CSV header |
| `testing/evidence/argocd_events.csv` | ArgoCD evidence CSV header |
| `testing/evidence/approval_events.csv` | PR approval/audit evidence CSV header |
| `testing/dissertation_writeup_template.md` | Template for Chapters 5–7 write-up structure |
| `testing/config.env.example` | Example environment configuration for running the harness |
| `testing/README.md` | Harness documentation and recommended workflow |
| `scripts/testing/run_policy_negative_controls.sh` | Executes local Terraform/Kubernetes negative controls and records results |
| `scripts/testing/record_run.sh` | Records manual run_summary rows |
| `scripts/testing/record_policy_event.sh` | Records manual policy event rows |
| `scripts/testing/init_evidence.sh` | Initializes evidence directories and CSV headers |
| `scripts/testing/common.sh` | Shared functions/config for evidence capture + CSV writing |
| `scripts/testing/capture_pr_audit.sh` | Captures PR audit/approval evidence via GitHub CLI |
| `scripts/testing/capture_ci_run.sh` | Captures workflow/job timing + artifacts via GitHub CLI |
| `scripts/testing/capture_argocd_snapshot.sh` | Captures ArgoCD + cluster snapshot evidence |
| `scripts/testing/analyze_results.py` | Builds a markdown summary report (median/IQR/p95, rates) from CSV evidence |
| `scripts/testing/__pycache__/analyze_results.cpython-313.pyc` | Compiled Python bytecode artifact added (should not be committed) |
| `policies/kubernetes/required-labels.rego` | Updates first deny rule to match `deny contains msg if` style |
| `Makefile` | Adds testing harness targets and parameters |
| `.gitignore` | Ignores `testing/config.env` and generated evidence bundles/reports |

load_testing_env
init_evidence_layout
require_cmd gh
require_cmd jq

Copilot AI Mar 23, 2026

This script calls duration_between (from common.sh), which uses python3, but python3 is not required/validated here. Add require_cmd python3 (or avoid the Python dependency) so the script fails early with a clear error when Python isn’t available.

Suggested change
-require_cmd jq
+require_cmd jq
+require_cmd python3

Copilot uses AI. Check for mistakes.
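
The early-failure pattern this comment asks for can be sketched as follows. The bodies of `fail` and `require_cmd` below are assumptions for illustration only, since the real helpers live in `common.sh` and are not shown in this view; `require_all` is a hypothetical convenience wrapper that is not part of the PR.

```shell
# Hypothetical stand-ins for the common.sh helpers; their real bodies are not shown here.
fail() {
  printf 'ERROR: %s\n' "$*" >&2
  exit 1
}

require_cmd() {
  # command -v succeeds only when the name resolves to something runnable.
  command -v "$1" >/dev/null 2>&1 || fail "required command not found: $1"
}

# Hypothetical wrapper: check several dependencies, stopping at the first missing one.
require_all() {
  for cmd in "$@"; do
    require_cmd "$cmd"
  done
}

# Example (not executed here): require_all gh jq python3
```

Listing indirect dependencies such as `python3` next to the direct ones means a missing interpreter is reported before any evidence is captured, rather than partway through a run.
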
RUN_SEQ="$2"
GH_RUN_ID="$3"
APP_TARGET="${4:-github-actions}"

Copilot AI Mar 23, 2026

Argument validation only checks the count of parameters, so empty required values (e.g. empty TEST_ID when invoked via make testing-capture-ci) will still be accepted and will generate malformed run IDs like -RUN01-.... Add explicit non-empty checks for required args (TEST_ID, RUN_SEQ, GH_RUN_ID) and fail with a clear usage message when they’re blank.

Suggested change
if [[ -z "${TEST_ID}" || -z "${RUN_SEQ}" || -z "${GH_RUN_ID}" ]]; then
  fail "Usage: capture_ci_run.sh TEST_ID RUN_SEQ GH_RUN_ID [APP_TARGET]"
fi

Copilot uses AI. Check for mistakes.
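
To illustrate why a count check alone lets blank values through, here is a minimal sketch. `validate_required` is a hypothetical helper, not part of the PR, and it returns a status code instead of calling the harness's `fail`.

```shell
# Hypothetical helper: takes NAME=value pairs and rejects any pair with an empty value.
validate_required() {
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    value=${pair#*=}   # text after the first "="
    if [ -z "$value" ]; then
      printf 'missing required argument: %s\n' "$name" >&2
      return 2
    fi
  done
}

# An unset make variable expands to "", so the script still receives three
# arguments and "$#" is 3 even though the first one is blank.
set -- "" "01" "123456"
echo "argument count: $#"
validate_required "TEST_ID=$1" "RUN_SEQ=$2" "GH_RUN_ID=$3" || echo "rejected blank input"
```

The count check passes here because the shell counts empty strings as arguments; only the per-value test catches the blank `TEST_ID`.
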
Comment on lines +2 to +10
"GOV-02-RUN01-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
"GOV-02-RUN02-20260323-1722","terraform-policy","required_tags","module.aws.aws_flow_log.this","1","true"
"GOV-02-RUN03-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
"GOV-03-RUN01-20260323-1722","terraform-policy","public_access","module.aws.aws_s3_bucket.this","1","true"
"GOV-03-RUN02-20260323-1722","terraform-policy","server_side_encryption","module.aws.aws_s3_bucket.this","1","true"
"GOV-03-RUN03-20260323-1722","terraform-policy","unrestricted_ingress","module.gcp.google_compute_firewall.this","1","true"
"GOV-04-RUN01-20260323-1722","kubernetes-policy","required_labels","Deployment/missing-labels-api","1","true"
"GOV-04-RUN02-20260323-1722","kubernetes-policy","resource_requests_limits","Deployment/missing-resources-api","2","true"
"GOV-04-RUN03-20260323-1722","kubernetes-policy","security_context","Deployment/privileged-api","2","true"

Copilot AI Mar 23, 2026

policy_events.csv is committed with concrete run rows. Because these files are intended to be appended to during evidence collection, keeping real event rows under version control will likely create churn and conflicts. Consider committing only the header (or storing sample rows in a separate example fixture) and generating/maintaining the actual evidence data outside of git.

Suggested change: delete the nine data rows quoted above (lines +2 to +10).

Copilot uses AI. Check for mistakes.
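
One way to keep only the schema under version control is to have the harness create each evidence file with its header on first use. The sketch below assumes a hypothetical `ensure_csv_header` helper and placeholder column names; the actual header written by `init_evidence.sh` is not visible in this view.

```shell
# Hypothetical helper: write the header only when the CSV is missing or empty.
ensure_csv_header() {
  csv_file=$1
  csv_header=$2
  mkdir -p "$(dirname "$csv_file")"
  # [ -s FILE ] is true only if FILE exists and has a size greater than zero.
  if [ ! -s "$csv_file" ]; then
    printf '%s\n' "$csv_header" > "$csv_file"
  fi
}

# Demonstration in a throwaway directory; column names and the row are placeholders.
demo_dir=$(mktemp -d)
demo_csv="$demo_dir/evidence/policy_events.csv"
demo_header='"run_id","policy_layer","rule","resource","violations","blocked"'
ensure_csv_header "$demo_csv" "$demo_header"
printf '%s\n' '"DEMO-RUN01","demo-policy","demo_rule","demo.resource","1","true"' >> "$demo_csv"
# A second call must leave the appended row untouched.
ensure_csv_header "$demo_csv" "$demo_header"
cat "$demo_csv"
```

With this approach the working CSVs can be listed in `.gitignore`, and any sample rows can live in a separate example file.
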
load_testing_env
init_evidence_layout
require_cmd kubectl
require_cmd jq

Copilot AI Mar 23, 2026

This script computes durations via duration_between, which shells out to python3 in common.sh, but it doesn’t check that python3 exists. Add require_cmd python3 (or replace the helper with a non-Python implementation) to avoid runtime failures on systems without Python.

Suggested change
-require_cmd jq
+require_cmd jq
+require_cmd python3

Copilot uses AI. Check for mistakes.
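
If dropping the Python dependency is preferred, the duration could be computed with `date` alone. This sketch assumes GNU `date` (the `-d` option; BSD/macOS `date` uses a different syntax) and assumes that `duration_between` takes two ISO 8601 UTC timestamps and prints whole seconds, which may not match the real signature in `common.sh`.

```shell
# Assumed interface: duration_between START END, both ISO 8601 UTC timestamps.
# Requires GNU date; this is not portable to BSD/macOS date.
duration_between() {
  start_epoch=$(date -u -d "$1" +%s) || return 1
  end_epoch=$(date -u -d "$2" +%s) || return 1
  # Difference in whole seconds between the two epoch values.
  echo $(( end_epoch - start_epoch ))
}

# Example call with two timestamps 30 seconds apart.
duration_between "2026-03-23T17:22:45Z" "2026-03-23T17:23:15Z"
```

The trade-off is portability: this removes the `python3` requirement but ties the helper to GNU coreutils, so the script would then need to verify which `date` implementation is available.
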
Comment thread Makefile
	@TESTING_CONFIG=$(TESTING_CONFIG) RUN_ID=$(RUN_ID) scripts/testing/capture_argocd_snapshot.sh "$(TEST_ID)" "$(RUN_SEQ)" "$(APP_NAME)" "$(APP_NAMESPACE)" "$(OUTCOME)"

testing-analyze: ## Build markdown summary statistics from collected evidence
	@python3 scripts/testing/analyze_results.py

Copilot AI Mar 23, 2026

TESTING_CONFIG is plumbed through the other testing targets, but testing-analyze doesn’t pass it to analyze_results.py. That means make testing-analyze TESTING_CONFIG=... won’t be respected. Export/pass TESTING_CONFIG (and optionally EVIDENCE_DIR/REPORT_DIR) for consistency with the rest of the harness.

Suggested change
-	@python3 scripts/testing/analyze_results.py
+	@TESTING_CONFIG=$(TESTING_CONFIG) EVIDENCE_DIR=$(EVIDENCE_DIR) REPORT_DIR=$(REPORT_DIR) python3 scripts/testing/analyze_results.py

DURATION_S="$5"
NOTES="${6:-}"
COMMIT_SHA="${7:-$(repo_head_sha)}"

Copilot AI Mar 23, 2026

Like the other scripts, this only validates argument count and will happily write rows with empty required fields (e.g. empty OUTCOME / DURATION_S) because make passes empty strings as arguments. Consider validating that required args are non-empty before appending to run_summary.csv, so evidence data stays well-formed.

Suggested change
# Validate that required args are non-empty to avoid malformed CSV rows
if [[ -z "${TEST_ID}" || -z "${RUN_SEQ}" || -z "${APP_TARGET}" || -z "${OUTCOME}" || -z "${DURATION_S}" ]]; then
  fail "Usage: record_run.sh TEST_ID RUN_SEQ APP_TARGET OUTCOME DURATION_S [NOTES] [COMMIT_SHA]"
fi

Comment on lines +2 to +10
"GOV-02-RUN01-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-owner-plan.json","blocked","0","expected_rule=required_tags"
"GOV-02-RUN02-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-cost-center-plan.json","blocked","0","expected_rule=required_tags"
"GOV-02-RUN03-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-environment-plan.json","blocked","0","expected_rule=required_tags"
"GOV-03-RUN01-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:public-storage-plan.json","blocked","0","expected_rule=public_access"
"GOV-03-RUN02-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unencrypted-s3-plan.json","blocked","0","expected_rule=server_side_encryption"
"GOV-03-RUN03-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unrestricted-ingress-plan.json","blocked","0","expected_rule=unrestricted_ingress"
"GOV-04-RUN01-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-labels.yaml","blocked","0","expected_rule=required_labels"
"GOV-04-RUN02-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-resources.yaml","blocked","0","expected_rule=resource_requests_limits"
"GOV-04-RUN03-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:privileged-container.yaml","blocked","0","expected_rule=security_context"

Copilot AI Mar 23, 2026

These CSVs are being added with populated run data (timestamps, commit SHA, outcomes). Since the harness appends to these files on every run, committing real run rows will quickly become stale/noisy and cause merge conflicts. Consider committing only headers (or moving sample data to *.example.csv) and leaving the working evidence CSVs to be generated locally.

Suggested change: delete the nine data rows quoted above (lines +2 to +10).
