4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -214,4 +214,8 @@ reference-docs/
**/*report*.jsonl.gz
**/*report*.jsonl.gz.part
**/*report*.jsonl.gz.part.1
# Committed demo artifacts (intentional exception)
!docs/demo-report-*.pdf
!docs/demo-report-*.md
!examples/*/expected_report.md
cline*.*
89 changes: 81 additions & 8 deletions README.md
@@ -42,6 +42,18 @@ Regulators are moving faster than your governance docs. The EU AI Act is in forc

It's the missing link between *"we have a responsible-AI policy"* and *"we can prove it."*

**Use it when you need to:**

- turn AI governance policies into executable checks
- produce audit-ready compliance evidence on every release
- evaluate AI interactions against named regulatory frameworks (EU AI Act, NIST AI RMF, FERPA, fair-lending, FAA/EASA aviation, …)
- generate Markdown, JSON, HTML, or PDF reports your auditor can read
- integrate AI compliance checks into CI/CD

AICertify is part of the [Open Policy Agent ecosystem](https://www.openpolicyagent.org/ecosystem/entry/principled-evolution) — built on the same policy engine that powers Kubernetes admission, microservice authorisation, and infrastructure governance at scale.

> ⭐ **If AICertify helps you, please star the repo.** It helps AI governance and policy-as-code practitioners discover the project.

---

## Quick Start
Expand Down Expand Up @@ -183,18 +195,19 @@ See [`examples/quickstart.py`](examples/quickstart.py) for the full Python API.

---

## Sample Reports
## See the output

You don't have to install anything to see what AICertify produces. Pre-generated reports are committed to the repo:

- **[demo-report-eu-ai-act.pdf](docs/demo-report-eu-ai-act.pdf)** — a customer-support agent evaluated against the EU AI Act
- [examples/outputs/eu_ai_act/](examples/outputs/eu_ai_act/) — the canonical full output
- [examples/outputs/loan_evaluation/](examples/outputs/loan_evaluation/) — a credit-scoring model evaluated for fair lending
- [examples/outputs/medical_diagnosis/](examples/outputs/medical_diagnosis/) — a clinical-decision-support model evaluated for patient safety

<p align="center">
<img src="diagrams/diagram5_report_anatomy.png" alt="Anatomy of an audit-ready report: header with framework name, application, model and date; executive summary; policy results table; risk assessment bar chart; remediation guidance; footer attributing AICertify v0.7.0" width="85%" />
</p>

The `examples/outputs/` directory contains generated reports from real evaluations you can inspect before running anything:

- `eu_ai_act/` — A customer-support agent evaluated against the EU AI Act
- `loan_evaluation/` — A credit-scoring model evaluated for fair lending
- `medical_diagnosis/` — A clinical-decision-support model evaluated for patient safety

Open the PDFs. That's what your auditor wants.

---
@@ -214,6 +227,56 @@ Track progress in the [policy library roadmap](https://github.com/Principled-Evo

---

## For OPA / Rego users

If you already use OPA for Kubernetes admission, microservice authorisation, or infrastructure governance, AICertify is the AI-system slot in your existing policy strategy.

- **Bring your own Rego policies.** Drop a `.rego` file into the policy folder and it evaluates alongside the bundled set.
- **Evaluate AI interactions through OPA.** Captured inputs, outputs, and metrics flow into your policies via the standard OPA `input` document.
- **Generate audit-ready evidence.** PDF / Markdown / JSON / HTML, one command.
- **Use [gopal](https://github.com/Principled-Evolution/gopal) as the policy library underneath.** 94 production Rego policies covering EU AI Act, NIST AI RMF, aviation safety, FERPA, fair lending, and more.

AICertify is listed in the [Open Policy Agent ecosystem](https://www.openpolicyagent.org/ecosystem/entry/principled-evolution) as the AI-governance entry alongside gopal.
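
To make the `input` document point concrete, here is a minimal Python sketch of wrapping captured data for OPA's Data API (`POST /v1/data/<path>` with a JSON body whose top-level key is `input`). The field names `application`, `interactions`, and `metrics`, the helper names, and the sample values are assumptions for illustration only; they are not AICertify's actual schema or API.

```python
import json
import urllib.request


def build_opa_payload(app_name, interactions, metrics):
    """Wrap captured data in the top-level "input" key that OPA's Data API expects."""
    # Hypothetical field names; a Rego policy would read them as input.interactions, etc.
    return {
        "input": {
            "application": app_name,
            "interactions": interactions,
            "metrics": metrics,
        }
    }


def query_opa(base_url, policy_path, payload):
    """POST the payload to a running OPA server and return the "result" value, if any."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8")).get("result")


payload = build_opa_payload(
    "support-bot",
    [{"input_text": "Are you a human?", "output_text": "No, I am an AI assistant."}],
    {"toxicity": 0.01},
)
print(json.dumps(payload, indent=2))
```

`query_opa` is defined but not called here because it needs a reachable OPA server; the base URL and policy path you pass to it depend on your own deployment.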

---

## Why AICertify?

Most AI governance programs live in PDFs, spreadsheets, and policy documents. They describe what *should* happen but do not prove what *did*.

AICertify turns governance rules into executable policy checks.

Instead of saying:

> "Our chatbot follows our responsible AI policy."

You can produce:

> "Here is the captured interaction, the policy version, the OPA evaluation result, and the generated audit report."

AICertify is for AI teams, governance teams, auditors, and platform engineers who need AI compliance evidence that can be **read, run, reviewed, and repeated**.

See the full positioning in [docs/why-aicertify.md](docs/why-aicertify.md).

---

## Who should contribute?

Contributions are especially welcome from:

- **AI engineers** building regulated AI systems
- **Governance, risk, and compliance (GRC) teams** producing audit evidence
- **Auditors and model risk professionals** evaluating third-party AI
- **OPA / Rego users** interested in AI-specific policy authoring
- **Responsible AI researchers** wanting reproducible benchmarks
- **Python developers** interested in compliance automation

**Non-code contributions are welcome:** examples, policy mappings, docs, tests, report templates, and regulatory notes.

A good place to start is the [`good first issue`](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) and [`help wanted`](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.

---

## Contributing

We welcome:
@@ -222,8 +285,11 @@ We welcome:
- Industry-specific policies you've battle-tested
- New evaluators (fairness, safety, robustness — see `aicertify/evaluators/`)
- Bug reports with a minimal reproducing contract
- Documentation, examples, and tutorials

Start with [CONTRIBUTING.md](CONTRIBUTING.md) and the [Code of Conduct](CODE_OF_CONDUCT.md).
Start with [CONTRIBUTING.md](CONTRIBUTING.md), the [Code of Conduct](CODE_OF_CONDUCT.md), and the open [contributor issues](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22).

For security issues, please follow the [Security Policy](SECURITY.md) — report privately to [security@principledevolution.ai](mailto:security@principledevolution.ai), not via public issue.

---

@@ -239,4 +305,11 @@ Start with [CONTRIBUTING.md](CONTRIBUTING.md) and the [Code of Conduct](CODE_OF_

Apache License 2.0 — see [LICENSE](LICENSE).

---

<p align="center">
<strong>⭐ If AICertify is useful to you, please star the repo and share it with one colleague.</strong><br>
<sub>Every star helps AI governance and policy-as-code practitioners discover the project.</sub>
</p>

<p align="center"><sub>Built by <a href="https://github.com/Principled-Evolution">Principled Evolution</a> · Policies you can read, run, and prove.</sub></p>
59 changes: 59 additions & 0 deletions SECURITY.md
@@ -0,0 +1,59 @@
# Security Policy

## Reporting a Vulnerability

Please report security issues **privately** by emailing
[security@principledevolution.ai](mailto:security@principledevolution.ai).

**Do not open public GitHub issues** for suspected vulnerabilities. Public disclosure
before a fix is shipped puts every AICertify user at risk.

We aim to acknowledge reports within **5 business days** and to publish a fix or a
written mitigation plan within 30 days of confirming a valid report. Severity is
assessed against the [CVSS 3.1](https://www.first.org/cvss/v3.1/specification-document)
framework.

If you would like to encrypt your report, request our PGP key in the initial email and
we will share it before you send the technical detail.

## Scope

This policy covers:

- the `aicertify` Python package and its public API,
- the AICertify CLI (`python -m aicertify.cli`),
- the bundled examples under `examples/`,
- the policy evaluation logic against the vendored [gopal](https://github.com/Principled-Evolution/gopal) Rego policies,
- the report generation pipeline (PDF, Markdown, JSON, HTML).

Out of scope:

- vulnerabilities in upstream dependencies that already have a published CVE — please
report those upstream (we track them via GitHub Dependabot),
- attacks that require physical access to a machine running AICertify,
- denial-of-service via legitimate but expensive evaluation workloads.

## Coordinated Disclosure

We follow a coordinated disclosure model. Once a fix is available, we will:

1. Publish a patched release on PyPI (or the equivalent install path),
2. Publish a [GitHub Security Advisory](https://github.com/Principled-Evolution/aicertify/security/advisories) with credit to the reporter (only with their explicit permission),
3. Reference the advisory in [CHANGELOG.md](CHANGELOG.md),
4. Update affected examples and documentation.

We are happy to publicly credit reporters who request it; we will never publish your
identity without explicit permission.

## Hardening notes

For users running AICertify in regulated or audit-sensitive environments:

- pin AICertify to a specific patched version in your dependency manifest,
- run evaluations in an isolated environment (container or virtual machine) when the AI
application under test handles sensitive data,
- review the captured contract JSON before sharing it — by design, contracts include
the AI application's input and output text.
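
As one way to act on the last point, the sketch below replaces interaction text with a placeholder before a contract is shared. It assumes a contract shaped as a dict with an `interactions` list whose items carry `input_text` and `output_text`; those field names and the `redact_contract` helper are hypothetical, so check them against your real contract JSON first.

```python
import copy
import json

# Hypothetical field names; verify them against your actual contract before relying on this.
TEXT_FIELDS = ("input_text", "output_text")


def redact_contract(contract, placeholder="[REDACTED]"):
    """Return a deep copy of the contract with interaction text replaced by a placeholder."""
    redacted = copy.deepcopy(contract)
    for interaction in redacted.get("interactions", []):
        for field in TEXT_FIELDS:
            if field in interaction:
                interaction[field] = placeholder
    return redacted


contract = {
    "application_name": "support-bot",
    "interactions": [
        {"input_text": "My account number is 12345", "output_text": "Thanks, checking now."}
    ],
}
shareable = redact_contract(contract)
print(json.dumps(shareable, indent=2))
```

The original `contract` is left untouched, so the unredacted copy can stay inside the isolated environment while only `shareable` leaves it.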

For policy authors: see the [gopal](https://github.com/Principled-Evolution/gopal)
SECURITY policy for upstream policy library reporting.
Binary file added docs/demo-report-eu-ai-act.pdf
Binary file not shown.
66 changes: 66 additions & 0 deletions docs/why-aicertify.md
@@ -0,0 +1,66 @@
# Why AICertify?

## The gap

Most AI governance programs live in PDFs, spreadsheets, and policy documents. They describe what *should* happen — but do not prove what *did*.

Auditors don't accept "we have a policy." They accept evidence: a dated record of the AI system under test, the rule it was evaluated against, the result, and the document signed off by the responsible owner. Producing that evidence by hand, every release, for every regulation, for every AI system in your portfolio, is not a sustainable program.

## The shift

The DevOps and platform engineering communities solved a similar problem ten years ago by moving infrastructure from documents into code: Terraform replaced cloud-architecture diagrams, Helm replaced runbooks, [Open Policy Agent](https://www.openpolicyagent.org/) replaced security-policy memos. The pattern in every case was the same — *take the rule out of the document and put it into a thing that runs.*

AICertify applies that shift to AI governance.

## The artifact AICertify produces

Instead of saying:

> "Our customer-support chatbot follows our responsible AI policy."

You produce:

> "Here is the contract that captured the chatbot's model version, the captured user-AI interactions, the EU AI Act v1 transparency policy (commit `a52d605`), the OPA evaluation result, the per-rule deny messages where applicable, and the dated PDF report sent to the audit committee."

Every artifact is reproducible: same input, same policy, same result. Every claim is traceable: the policy is code in git, the evaluation is deterministic, the report is generated, not handwritten.
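
A rough way to picture "same input, same policy, same result" is a content fingerprint over the evidence inputs. The sketch below is illustrative only and is not how AICertify itself records evidence: the `evidence_fingerprint` helper, the contract fields, and the tiny policy string are assumptions, and only the commit `a52d605` comes from the example above.

```python
import hashlib
import json


def canonical_bytes(obj):
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def evidence_fingerprint(contract, policy_source, policy_commit):
    """Hash the contract, the policy text, and the policy commit into one hex digest."""
    digest = hashlib.sha256()
    for part in (canonical_bytes(contract), policy_source.encode("utf-8"), policy_commit.encode("utf-8")):
        digest.update(part)
        digest.update(b"\x00")  # separator so adjacent parts cannot blur together
    return digest.hexdigest()


# Placeholder policy text, used only as bytes to hash.
policy = "package example\n\ndefault allow := false\n"
first = evidence_fingerprint({"model": "m1", "interactions": []}, policy, "a52d605")
second = evidence_fingerprint({"interactions": [], "model": "m1"}, policy, "a52d605")
changed = evidence_fingerprint({"model": "m1", "interactions": []}, policy, "0000000")
print(first == second, first == changed)
```

Key order in the contract does not affect the digest, while a different policy commit does, which is the property an auditor would lean on when comparing two runs.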

## Who is this for?

AICertify exists for teams that need to **read, run, review, and repeat** their AI compliance evidence:

- **AI engineers** building under the EU AI Act, NIST AI RMF, India DPDP, Brazil AI Bill, FERPA/COPPA, FAA UAS rules, or any other named framework.
- **Governance, risk, and compliance (GRC) teams** who want their controls to *execute*, not just describe.
- **Auditors and model risk professionals** evaluating third-party AI systems.
- **Platform engineers** integrating AI compliance checks into CI/CD next to their linting, type-checking, and dependency scanning.
- **OPA / Rego users** who already trust policy-as-code for infrastructure and want the same discipline for AI.
- **Responsible AI researchers** who need reproducible bias, content-safety, and risk-management benchmarks.

## How AICertify is different

| | AICertify | Vendor SaaS (Credo AI, Holistic AI) | Research toolkit (Fairlearn, AIF360, MS RAI Toolbox) |
|---|---|---|---|
| Open source | ✅ Apache 2.0 | ❌ Closed | ✅ MIT or Apache 2.0 |
| Air-gapped / on-prem deployable | ✅ | ❌ | ✅ |
| Policy-as-code (versioned, diff-able, reviewable) | ✅ OPA / Rego | ❌ | ❌ |
| Named regulatory frameworks (EU AI Act, NIST AI RMF, +13 more) | ✅ via [gopal](https://github.com/Principled-Evolution/gopal) | ✅ | ❌ (fairness/explainability only) |
| Industry verticals out of the box (aviation, banking, healthcare, education, automotive) | ✅ | Partial | ❌ |
| Audit-ready report output (PDF / Markdown / JSON / HTML) | ✅ | ✅ | Partial |
| Custom policies | ✅ Drop a `.rego` file | ✅ (paid tier) | N/A |
| Reproducible from a git checkout | ✅ | ❌ | ✅ |

## The honest scope

AICertify is **infrastructure**, not magic.

- It does not interpret regulations for you. Encoding "EU AI Act Article 13 transparency" as a Rego policy is a deliberate, reviewable act, and the policy is a human's interpretation, not a legal opinion. Read [SECURITY.md](../SECURITY.md), the per-framework READMEs, and the disclaimer on every policy directory before claiming compliance.
- It does not certify your AI system. It produces the evidence a human or organisation needs in order to assert compliance, internally or to a regulator. The certification authority remains your auditor, your legal counsel, or the relevant supervisor.
- It does not replace your governance program. It replaces the *paperwork* in your governance program.

What it *does* give you is the missing link between *"we have a responsible-AI policy"* and *"we can prove it."*

## Next steps

- **See the output without installing:** open [demo-report-eu-ai-act.pdf](demo-report-eu-ai-act.pdf).
- **Run the quickstart:** [`examples/quickstart.py`](../examples/quickstart.py).
- **Explore the policy library:** [gopal](https://github.com/Principled-Evolution/gopal) — 94 production Rego policies across 15+ frameworks.
- **Open a [good first issue](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22).**
75 changes: 56 additions & 19 deletions examples/README.md
@@ -1,31 +1,68 @@
# AICertify Examples

This directory contains examples demonstrating how to use the AICertify library.
Forkable references for evaluating real AI applications with AICertify.

## Quickstart Example
## Quickstart

The `quickstart.py` example demonstrates the basic functionality of AICertify:
A minimal end-to-end demo of the AICertify API.

1. Creating a regulations set
2. Selecting target regulations
3. Creating AI applications
4. Adding interactions to applications
5. Evaluating applications against regulations
6. Generating and viewing reports
```bash
python examples/quickstart.py
```

### Running the Quickstart Example
Creates a sample app, adds a few interactions, evaluates against the EU AI Act, and writes a report into `reports/`. Read [`quickstart.py`](quickstart.py) before adapting it.

To run the quickstart example:
## Forkable application examples

```bash
python examples/quickstart.py
Each folder is a self-contained reference you can copy as the starting point for evaluating your own AI application. The shape is the same in every example so the pattern is easy to follow:

```
example-name/
├── README.md How to run + how to adapt
├── input_contract.json AI application contract (model + interactions + metadata)
├── sample_interactions.json Standalone interaction set you can splice into a contract
├── policy_config.yaml Which gopal policies + evaluators to run against
├── run.py Runnable script using the Python API
└── expected_report.md What a successful run looks like
```

This will:
- Create a sample AI application with example interactions
- Evaluate it against the EU AI Act regulations
- Generate an HTML report in the `reports` directory
### Available examples

| Example | Risk class | Primary frameworks |
|---|---|---|
| [`customer-support-bot/`](customer-support-bot/) | Limited risk | EU AI Act transparency obligations + global baselines |
| [`healthcare-triage-bot/`](healthcare-triage-bot/) | **High risk** (Annex III) | EU AI Act high-risk + gopal healthcare patient-safety |
| [`hiring-screening-bot/`](hiring-screening-bot/) | **High risk** (Annex III) | EU AI Act high-risk + fair-lending proxy + global fairness |

### Wanted: more examples

The community is welcome to contribute additional examples following the same shape. Open issues track current asks:

- FastAPI integration example
- LangChain integration example
- LlamaIndex integration example
- Financial-advice bot
- EdTech tutor
- Docker quickstart

See the [`good first issue`](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) and [`help wanted`](https://github.com/Principled-Evolution/aicertify/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) labels.

## Pre-generated sample reports

If you want to see the AICertify deliverable before installing anything, the `outputs/` directory has historical reports from real runs:

- [`outputs/eu_ai_act/`](outputs/eu_ai_act/) — EU AI Act evaluations
- [`outputs/loan_evaluation/`](outputs/loan_evaluation/) — fair-lending evaluations
- [`outputs/medical_diagnosis/`](outputs/medical_diagnosis/) — patient-safety evaluations

A clean one is also bundled as [`docs/demo-report-eu-ai-act.pdf`](../docs/demo-report-eu-ai-act.pdf).

## Authoring conventions

## Additional Resources
When you add an example:

For more information about AICertify, please refer to the main documentation.
1. Match the directory layout above. The shape matters more than the content; it's what makes the examples forkable.
2. The `metadata` block in `input_contract.json` must declare jurisdiction, risk class, and (if Annex III) the relevant subpoint.
3. `policy_config.yaml` must include a `rationale:` for each framework explaining *why* that framework applies.
4. `expected_report.md` should describe both the pass case **and** the common failure modes a fork might hit.
5. Be honest about scope. A green AICertify report is necessary but not sufficient for production deployment — say so explicitly.
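
The first three conventions can be checked mechanically before opening a pull request. The sketch below is a hypothetical helper, not part of AICertify: the file names come from the layout listing in this README, while the metadata keys (`jurisdiction`, `risk_class`, `annex_iii_subpoint`), the `frameworks` list, and the rule that a `high` risk class implies Annex III are assumptions about how a contract and a parsed `policy_config.yaml` might look.

```python
import tempfile
from pathlib import Path

# File names taken from the directory layout shown in this README.
EXPECTED_FILES = (
    "README.md",
    "input_contract.json",
    "sample_interactions.json",
    "policy_config.yaml",
    "run.py",
    "expected_report.md",
)


def missing_files(example_dir):
    """List the expected files that are absent from an example directory."""
    root = Path(example_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def convention_problems(contract, policy_config):
    """Return human-readable problems; an empty list means these checks passed."""
    problems = []
    metadata = contract.get("metadata", {})
    for key in ("jurisdiction", "risk_class"):  # hypothetical key names
        if not metadata.get(key):
            problems.append(f"metadata.{key} is missing")
    # Assumption: high-risk examples are the Annex III ones and must name a subpoint.
    if metadata.get("risk_class") == "high" and not metadata.get("annex_iii_subpoint"):
        problems.append("metadata.annex_iii_subpoint is missing for a high-risk example")
    for framework in policy_config.get("frameworks", []):
        if not framework.get("rationale"):
            problems.append(f"framework {framework.get('name', '?')} has no rationale")
    return problems


# Demo on a throwaway directory that contains only two of the six files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("# demo\n", encoding="utf-8")
    (Path(tmp) / "run.py").write_text("print('demo')\n", encoding="utf-8")
    absent = missing_files(tmp)

problems = convention_problems(
    {"metadata": {"jurisdiction": "EU", "risk_class": "high"}},
    {"frameworks": [{"name": "eu_ai_act", "rationale": "Annex III use case"}, {"name": "global"}]},
)
print(absent)
print(problems)
```

Conventions 4 and 5 are about judgment and honesty in the write-up, so they still need a human reviewer.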