feat: Add testing harness for dissertation with evidence collection a… by stephenjtyrrell · Pull Request #38 · stephenjtyrrell/dissertation

stephenjtyrrell · 2026-03-23T21:48:08Z

…nd analysis

Introduced a comprehensive testing harness in the testing directory to facilitate evidence collection for governance and scalability tests.
Created scripts for capturing pull request audits, recording policy events, and summarizing run results.
Implemented a common library (common.sh) for shared functions and configurations.
Added CSV files for storing evidence data, including approval events, policy events, and run summaries.
Developed Terraform and Kubernetes fixtures to simulate negative control scenarios for governance testing.
Updated README and configuration files to guide users through the setup and execution of tests.
Established a structured write-up template for documenting findings from the testing process.


Copilot

Pull request overview

Adds a dissertation-focused testing harness under testing/ plus supporting scripts to collect governance/scalability evidence (GitHub Actions, Terraform/OPA, Kubernetes/Conftest, ArgoCD), store it as structured CSVs, and generate a summarized markdown report for analysis/write-up.

Changes:

Introduces reusable capture/record scripts (scripts/testing/*) and Makefile targets to initialize, collect, and analyze evidence.
Adds negative-control fixtures for Terraform and Kubernetes policy enforcement tests.
Adds evidence CSV schemas (and initial datasets), documentation, and a dissertation write-up template.

Reviewed changes

Copilot reviewed 29 out of 31 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
testing/fixtures/terraform/unrestricted-ingress-plan.json	Terraform negative-control fixture for unrestricted ingress detection
testing/fixtures/terraform/unencrypted-s3-plan.json	Terraform negative-control fixture for missing S3 encryption detection
testing/fixtures/terraform/public-storage-plan.json	Terraform negative-control fixture for public storage detection
testing/fixtures/terraform/missing-owner-plan.json	Terraform negative-control fixture for missing required tags
testing/fixtures/terraform/missing-environment-plan.json	Terraform negative-control fixture for missing required tags
testing/fixtures/terraform/missing-cost-center-plan.json	Terraform negative-control fixture for missing required tags
testing/fixtures/kubernetes/privileged-container.yaml	Kubernetes negative-control fixture for securityContext violations
testing/fixtures/kubernetes/missing-resources.yaml	Kubernetes negative-control fixture for missing resource requests/limits
testing/fixtures/kubernetes/missing-labels.yaml	Kubernetes negative-control fixture for required label violations
testing/evidence/run_summary.csv	Run-level evidence CSV (schema + initial entries)
testing/evidence/policy_events.csv	Policy event evidence CSV (schema + initial entries)
testing/evidence/ci_job_events.csv	CI job-level evidence CSV header
testing/evidence/artifact_events.csv	Artifact evidence CSV header
testing/evidence/argocd_events.csv	ArgoCD evidence CSV header
testing/evidence/approval_events.csv	PR approval/audit evidence CSV header
testing/dissertation_writeup_template.md	Template for Chapters 5–7 write-up structure
testing/config.env.example	Example environment configuration for running the harness
testing/README.md	Harness documentation and recommended workflow
scripts/testing/run_policy_negative_controls.sh	Executes local Terraform/Kubernetes negative controls and records results
scripts/testing/record_run.sh	Records manual run_summary rows
scripts/testing/record_policy_event.sh	Records manual policy event rows
scripts/testing/init_evidence.sh	Initializes evidence directories and CSV headers
scripts/testing/common.sh	Shared functions/config for evidence capture + CSV writing
scripts/testing/capture_pr_audit.sh	Captures PR audit/approval evidence via GitHub CLI
scripts/testing/capture_ci_run.sh	Captures workflow/job timing + artifacts via GitHub CLI
scripts/testing/capture_argocd_snapshot.sh	Captures ArgoCD + cluster snapshot evidence
scripts/testing/analyze_results.py	Builds a markdown summary report (median/IQR/p95, rates) from CSV evidence
scripts/testing/__pycache__/analyze_results.cpython-313.pyc	Compiled Python bytecode artifact added (should not be committed)
policies/kubernetes/required-labels.rego	Updates first deny rule to match `deny contains msg if` style
Makefile	Adds testing harness targets and parameters
.gitignore	Ignores `testing/config.env` and generated evidence bundles/reports


Copilot · 2026-03-23T21:52:24Z

+load_testing_env
+init_evidence_layout
+require_cmd gh
+require_cmd jq


This script calls duration_between (from common.sh), which uses python3, but python3 is not required/validated here. Add require_cmd python3 (or avoid the Python dependency) so the script fails early with a clear error when Python isn’t available.

Suggested change

-require_cmd jq
+require_cmd jq
+require_cmd python3
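The fail-early check discussed above can be sketched as follows. The real `require_cmd` lives in `common.sh` and is not shown in this review, so the body below is a hypothetical stand-in, and `require_cmds` is an invented convenience wrapper that reports every missing dependency in one pass.

```shell
# Hypothetical stand-in for the harness's require_cmd helper; the real
# implementation in common.sh is not visible in this review.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: required command not found: $1" >&2
    return 1
  fi
}

# Invented wrapper: check all dependencies and report each missing one
# instead of stopping at the first failure.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    require_cmd "$cmd" || missing=1
  done
  return "$missing"
}
```

A capture script could then call something like `require_cmds gh jq python3 || exit 1` near the top, so a missing interpreter surfaces before any evidence is written.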

Copilot · 2026-03-23T21:52:25Z

+RUN_SEQ="$2"
+GH_RUN_ID="$3"
+APP_TARGET="${4:-github-actions}"
+


Argument validation only checks the count of parameters, so empty required values (e.g. empty TEST_ID when invoked via make testing-capture-ci) will still be accepted and will generate malformed run IDs like -RUN01-.... Add explicit non-empty checks for required args (TEST_ID, RUN_SEQ, GH_RUN_ID) and fail with a clear usage message when they’re blank.

Suggested change

+if [[ -z "${TEST_ID}" || -z "${RUN_SEQ}" || -z "${GH_RUN_ID}" ]]; then
+  fail "Usage: capture_ci_run.sh TEST_ID RUN_SEQ GH_RUN_ID [APP_TARGET]"
+fi
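If several scripts need the same non-empty check, it could be factored into one helper. The sketch below is an assumption, not code from the PR: `require_nonempty` is a hypothetical name, and it takes `NAME=VALUE` pairs so the error message can say which argument was blank.

```shell
# Hypothetical helper: succeed only if every NAME=VALUE pair has a
# non-empty VALUE; otherwise name the offending argument on stderr.
require_nonempty() {
  for pair in "$@"; do
    name=${pair%%=*}    # text before the first '='
    value=${pair#*=}    # text after the first '='
    if [ -z "$value" ]; then
      echo "error: $name must not be empty" >&2
      return 1
    fi
  done
}
```

In `capture_ci_run.sh` this might be invoked as `require_nonempty "TEST_ID=$TEST_ID" "RUN_SEQ=$RUN_SEQ" "GH_RUN_ID=$GH_RUN_ID"`, followed by the usage message on failure.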

Copilot · 2026-03-23T21:52:25Z

+"GOV-02-RUN01-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
+"GOV-02-RUN02-20260323-1722","terraform-policy","required_tags","module.aws.aws_flow_log.this","1","true"
+"GOV-02-RUN03-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
+"GOV-03-RUN01-20260323-1722","terraform-policy","public_access","module.aws.aws_s3_bucket.this","1","true"
+"GOV-03-RUN02-20260323-1722","terraform-policy","server_side_encryption","module.aws.aws_s3_bucket.this","1","true"
+"GOV-03-RUN03-20260323-1722","terraform-policy","unrestricted_ingress","module.gcp.google_compute_firewall.this","1","true"
+"GOV-04-RUN01-20260323-1722","kubernetes-policy","required_labels","Deployment/missing-labels-api","1","true"
+"GOV-04-RUN02-20260323-1722","kubernetes-policy","resource_requests_limits","Deployment/missing-resources-api","2","true"
+"GOV-04-RUN03-20260323-1722","kubernetes-policy","security_context","Deployment/privileged-api","2","true"


policy_events.csv is committed with concrete run rows. Because these files are intended to be appended to during evidence collection, keeping real event rows under version control will likely create churn and conflicts. Consider committing only the header (or storing sample rows in a separate example fixture) and generating/maintaining the actual evidence data outside of git.
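One way to keep only headers under version control is to have the harness create each CSV on demand. The sketch below is illustrative: `ensure_csv_header` is a hypothetical name, and the header string is passed in by the caller because the actual column names are not shown in this review.

```shell
# Hypothetical helper: write HEADER to FILE only when FILE is missing or
# empty, so existing evidence rows are never overwritten.
ensure_csv_header() {
  file=$1
  header=$2
  if [ ! -s "$file" ]; then
    mkdir -p "$(dirname "$file")"
    printf '%s\n' "$header" > "$file"
  fi
}
```

Because the function does nothing when the file already has content, it can be called at the start of every capture script without disturbing locally collected rows.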

Suggested change

-"GOV-02-RUN01-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
-"GOV-02-RUN02-20260323-1722","terraform-policy","required_tags","module.aws.aws_flow_log.this","1","true"
-"GOV-02-RUN03-20260323-1722","terraform-policy","required_tags","module.aws.aws_vpc.this","1","true"
-"GOV-03-RUN01-20260323-1722","terraform-policy","public_access","module.aws.aws_s3_bucket.this","1","true"
-"GOV-03-RUN02-20260323-1722","terraform-policy","server_side_encryption","module.aws.aws_s3_bucket.this","1","true"
-"GOV-03-RUN03-20260323-1722","terraform-policy","unrestricted_ingress","module.gcp.google_compute_firewall.this","1","true"
-"GOV-04-RUN01-20260323-1722","kubernetes-policy","required_labels","Deployment/missing-labels-api","1","true"
-"GOV-04-RUN02-20260323-1722","kubernetes-policy","resource_requests_limits","Deployment/missing-resources-api","2","true"
-"GOV-04-RUN03-20260323-1722","kubernetes-policy","security_context","Deployment/privileged-api","2","true"

Copilot · 2026-03-23T21:52:25Z

+load_testing_env
+init_evidence_layout
+require_cmd kubectl
+require_cmd jq


This script computes durations via duration_between, which shells out to python3 in common.sh, but it doesn’t check that python3 exists. Add require_cmd python3 (or replace the helper with a non-Python implementation) to avoid runtime failures on systems without Python.

Suggested change

-require_cmd jq
+require_cmd jq
+require_cmd python3
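For the alternative of dropping the Python dependency, a pure-shell duration helper might look like the sketch below. It is not the harness's `duration_between`; it is a hypothetical replacement that assumes UTC timestamps with whole seconds in the form `2026-03-23T17:22:45Z` (the shape used in the evidence rows) and years from 1970 onward. `iso_to_epoch` is an invented name.

```shell
# Hypothetical pure-shell conversion of an ISO 8601 UTC timestamp to epoch
# seconds. Assumes the form YYYY-MM-DDTHH:MM:SSZ and year >= 1970.
# The body runs in a subshell so its variables do not leak to the caller.
iso_to_epoch() (
  ts=${1%Z}
  IFS='-T:' read -r y mo d h mi s <<EOF
$ts
EOF
  # Strip one leading zero so values such as 08 are not read as octal.
  mo=${mo#0}; d=${d#0}; h=${h#0}; mi=${mi#0}; s=${s#0}
  # Days-from-civil arithmetic on a year that starts in March.
  if [ "$mo" -le 2 ]; then y=$((y - 1)); mp=$((mo + 9)); else mp=$((mo - 3)); fi
  era=$((y / 400))
  yoe=$((y - era * 400))
  doy=$(( (153 * mp + 2) / 5 + d - 1 ))
  doe=$(( yoe * 365 + yoe / 4 - yoe / 100 + doy ))
  days=$(( era * 146097 + doe - 719468 ))
  echo $(( days * 86400 + h * 3600 + mi * 60 + s ))
)

# Hypothetical stand-in: seconds elapsed between two timestamps.
duration_between() {
  start=$(iso_to_epoch "$1") || return 1
  end=$(iso_to_epoch "$2") || return 1
  echo $(( end - start ))
}
```

The arithmetic avoids `date -d`, whose syntax differs between GNU and BSD userlands, at the cost of supporting only this one timestamp format.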

Copilot · 2026-03-23T21:52:26Z

+	@TESTING_CONFIG=$(TESTING_CONFIG) RUN_ID=$(RUN_ID) scripts/testing/capture_argocd_snapshot.sh "$(TEST_ID)" "$(RUN_SEQ)" "$(APP_NAME)" "$(APP_NAMESPACE)" "$(OUTCOME)"
+
+testing-analyze: ## Build markdown summary statistics from collected evidence
+	@python3 scripts/testing/analyze_results.py


TESTING_CONFIG is plumbed through the other testing targets, but testing-analyze doesn’t pass it to analyze_results.py. That means make testing-analyze TESTING_CONFIG=... won’t be respected. Export/pass TESTING_CONFIG (and optionally EVIDENCE_DIR/REPORT_DIR) for consistency with the rest of the harness.

Suggested change

-	@python3 scripts/testing/analyze_results.py
+	@TESTING_CONFIG=$(TESTING_CONFIG) EVIDENCE_DIR=$(EVIDENCE_DIR) REPORT_DIR=$(REPORT_DIR) python3 scripts/testing/analyze_results.py
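On the receiving side, a script has to decide what to do when one of these variables arrives empty, which happens when the Make variable is unset and the recipe still exports `EVIDENCE_DIR=`. A minimal sketch of that fallback, with hypothetical function names and assumed default paths (`testing/reports` in particular does not appear in this review):

```shell
# Hypothetical resolvers: an explicit, non-empty environment value wins;
# an unset or empty value falls back to an assumed default path.
resolve_evidence_dir() {
  printf '%s\n' "${EVIDENCE_DIR:-testing/evidence}"
}

resolve_report_dir() {
  printf '%s\n' "${REPORT_DIR:-testing/reports}"
}
```

The `:-` form treats an empty value the same as an unset one, so `make testing-analyze` without overrides would still resolve to the defaults.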

Copilot · 2026-03-23T21:52:26Z

+DURATION_S="$5"
+NOTES="${6:-}"
+COMMIT_SHA="${7:-$(repo_head_sha)}"
+


Like the other scripts, this only validates argument count and will happily write rows with empty required fields (e.g. empty OUTCOME / DURATION_S) because make passes empty strings as arguments. Consider validating that required args are non-empty before appending to run_summary.csv, so evidence data stays well-formed.

Suggested change

+# Validate that required args are non-empty to avoid malformed CSV rows
+if [[ -z "${TEST_ID}" || -z "${RUN_SEQ}" || -z "${APP_TARGET}" || -z "${OUTCOME}" || -z "${DURATION_S}" ]]; then
+  fail "Usage: record_run.sh TEST_ID RUN_SEQ APP_TARGET OUTCOME DURATION_S [NOTES] [COMMIT_SHA]"
+fi
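Beyond the non-empty check, a value such as `DURATION_S` could also be checked for shape before it is appended. The sketch below assumes durations are recorded as non-negative whole seconds, as in the committed rows; `is_whole_seconds` is a hypothetical name.

```shell
# Hypothetical check: accept only a non-negative whole number of seconds.
is_whole_seconds() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    *) return 0 ;;
  esac
}
```

If fractional durations are ever recorded, the pattern would need to be widened accordingly.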

Copilot · 2026-03-23T21:52:26Z

+"GOV-02-RUN01-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-owner-plan.json","blocked","0","expected_rule=required_tags"
+"GOV-02-RUN02-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-cost-center-plan.json","blocked","0","expected_rule=required_tags"
+"GOV-02-RUN03-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-environment-plan.json","blocked","0","expected_rule=required_tags"
+"GOV-03-RUN01-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:public-storage-plan.json","blocked","0","expected_rule=public_access"
+"GOV-03-RUN02-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unencrypted-s3-plan.json","blocked","0","expected_rule=server_side_encryption"
+"GOV-03-RUN03-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unrestricted-ingress-plan.json","blocked","0","expected_rule=unrestricted_ingress"
+"GOV-04-RUN01-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-labels.yaml","blocked","0","expected_rule=required_labels"
+"GOV-04-RUN02-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-resources.yaml","blocked","0","expected_rule=resource_requests_limits"
+"GOV-04-RUN03-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:privileged-container.yaml","blocked","0","expected_rule=security_context"


These CSVs are being added with populated run data (timestamps, commit SHA, outcomes). Since the harness appends to these files on every run, committing real run rows will quickly become stale/noisy and cause merge conflicts. Consider committing only headers (or moving sample data to *.example.csv) and leaving the working evidence CSVs to be generated locally.
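Moving the already-committed sample rows into an example file could be done with a small one-off helper. This is a sketch under assumptions: `split_sample_rows` is a hypothetical name, and the `*.example.csv` naming follows the suggestion in the comment above.

```shell
# Hypothetical one-off helper: copy FILE (header plus rows) to a sibling
# *.example.csv, then reduce FILE to its header row only.
split_sample_rows() {
  file=$1
  example=${file%.csv}.example.csv
  cp "$file" "$example"
  header=$(head -n 1 "$file")
  printf '%s\n' "$header" > "$file"
}
```

After running it once per evidence file, the header-only CSV and the example file could both be committed, with working copies left to accumulate rows locally.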

Suggested change

-"GOV-02-RUN01-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-owner-plan.json","blocked","0","expected_rule=required_tags"
-"GOV-02-RUN02-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-cost-center-plan.json","blocked","0","expected_rule=required_tags"
-"GOV-02-RUN03-20260323-1722","GOV-02","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-environment-plan.json","blocked","0","expected_rule=required_tags"
-"GOV-03-RUN01-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:public-storage-plan.json","blocked","0","expected_rule=public_access"
-"GOV-03-RUN02-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unencrypted-s3-plan.json","blocked","0","expected_rule=server_side_encryption"
-"GOV-03-RUN03-20260323-1722","GOV-03","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:unrestricted-ingress-plan.json","blocked","0","expected_rule=unrestricted_ingress"
-"GOV-04-RUN01-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-labels.yaml","blocked","0","expected_rule=required_labels"
-"GOV-04-RUN02-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:missing-resources.yaml","blocked","0","expected_rule=resource_requests_limits"
-"GOV-04-RUN03-20260323-1722","GOV-04","2026-03-23T17:22:45Z","8e4934c859525e59545c763b8df539e7c5ce3e92","policy-fixture:privileged-container.yaml","blocked","0","expected_rule=security_context"

Copilot AI review requested due to automatic review settings March 23, 2026 21:48

Copilot started reviewing on behalf of stephenjtyrrell March 23, 2026 21:48

stephenjtyrrell wants to merge 1 commit into main from testing-execution
