diff --git a/.gitignore b/.gitignore
index f58214b..7da9fc1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,21 @@
 /Gemfile.lock
 /vendor/
 /spec/fixtures/modules/
+
+## AI coding assistant-specific configuration
+## The standard is AGENTS.md
+# Claude Code
+/CLAUDE.md
+/.claude/
+# GitHub Copilot
+/.github/copilot-instructions.md
+# Cursor
+/.cursor/
+/.cursorrules
+# Windsurf
+/.windsurf/
+/.windsurfrules
+# Gemini CLI
+/GEMINI.md
+# Cline
+/.clinerules/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..2712c60
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,134 @@
+# AGENTS.md
+
+This file provides guidance to AI agents when working with code in this repository.
+
+## Overview
+
+This is a Ruby gem (`compliance_engine`) that parses and works with [Sicura/SIMP Compliance Engine (SCE)](https://simp-project.com/docs/sce/) data. It also ships as a Puppet module providing a Hiera backend (`compliance_engine::enforcement`) for enforcing compliance profiles in Puppet environments.
+
+## Commands
+
+### Testing
+```bash
+# Run all tests and rubocop (default task)
+bundle exec rake
+
+# Run just spec tests (with fixture prep/cleanup)
+bundle exec rake spec
+
+# Run spec tests standalone (no fixture prep)
+bundle exec rake spec:standalone
+
+# Run rubocop linting
+bundle exec rake rubocop
+
+# Run a single spec file
+bundle exec rspec spec/classes/compliance_engine/data_spec.rb
+
+# Run tests in parallel (used in CI for Ruby < 4.0)
+bundle exec rake parallel_spec
+```
+
+### Development
+```bash
+# Install dependencies
+bundle install
+
+# Open interactive shell with compliance data loaded
+bundle exec compliance_engine inspect --module /path/to/module
+
+# CLI usage examples
+bundle exec compliance_engine profiles --modulepath /path/to/modules
+bundle exec compliance_engine hiera --profile my_profile --modulepath /path/to/modules
+bundle exec compliance_engine lookup some::class::param --profile my_profile --module /path/to/module
+```
+
+## Architecture
+
+### Data Model
+
+Compliance data lives in YAML/JSON files at `/SIMP/compliance_profiles/*.yaml` or `/simp/compliance_profiles/*.yaml`. Files are structured with four top-level keys: `profiles`, `ce` (Compliance Elements), `checks`, and `controls`.
+
+The library models this data with a two-layer class hierarchy:
+
+**Collections** (`ComplianceEngine::Collection` subclass) hold named groups of components:
+- `ComplianceEngine::Profiles` — keyed by `'profiles'` in source data
+- `ComplianceEngine::Ces` — keyed by `'ce'` in source data
+- `ComplianceEngine::Checks` — keyed by `'checks'` in source data
+- `ComplianceEngine::Controls` — keyed by `'controls'` in source data
+
+**Components** (`ComplianceEngine::Component` subclass) represent individual named entries within those collections:
+- `ComplianceEngine::Profile` — a named compliance profile
+- `ComplianceEngine::Ce` — a Compliance Element (CE)
+- `ComplianceEngine::Check` — a single compliance check; only `type: puppet-class-parameter` checks produce Hiera data via `Check#hiera`
+- `ComplianceEngine::Control` — a compliance control
+
+A component can have multiple **fragments** (one per source file), which are deep-merged together via `deep_merge`. Confinement logic in `Component` filters fragments based on Puppet facts, module presence/version, and remediation risk level.
+
+### Central Data Object
+
+`ComplianceEngine::Data` is the primary entry point. It:
+1. Loads files via `open(*paths)` which delegates to `ModuleLoader` → `DataLoader::Yaml/Json`
+2. Uses Ruby's `Observable` pattern — `DataLoader` objects notify `Data` of changes
+3. Lazily constructs and caches the four collection objects; invalidates all caches when facts, enforcement_tolerance, modulepath, or environment_data change
+4. Exposes `Data#hiera(profiles)` which walks the check_mapping of requested profiles to produce a flat Hiera-compatible hash
+
+### Business Logic: From Profiles to Hiera
+
+**`Data#hiera(profile_names)`** is the primary output method. It:
+1. Resolves each name to a `Profile` object (logs and skips unknown names).
+2. Calls `Data#check_mapping(profile)` for each profile to find all associated checks.
+3. Filters to checks with `type: 'puppet-class-parameter'`.
+4. Calls `Check#hiera` on each, which returns `{ settings['parameter'] => settings['value'] }`.
+5. Deep-merges all results into a single flat hash and caches it.
+
+**`Data#check_mapping(profile_or_ce)`** is the correlation engine that links profiles (or CEs) to checks. A check is included if **any** of the following hold (evaluated via `Data#mapping?`):
+
+| Condition | What it checks |
+|-----------|---------------|
+| Shared **control** | `check.controls` and `profile.controls` share a key set to `true` |
+| Shared **CE** | `check.ces` and `profile.ces` share a key set to `true` |
+| CE→Control overlap | Any of `check.ces`' CEs has a control that also appears in `profile.controls` |
+| Direct reference | `profile.checks[check_key]` is truthy |
+
+`check_mapping` can also be called with CE objects (in addition to profiles). Results are cached by `"#{object.class}:#{object.key}"`.
+
+### Loading Pipeline
+
+```
+paths → EnvironmentLoader → ModuleLoader (one per module dir)
+          → DataLoader::Yaml / DataLoader::Json
+               ↓ (Observable notify)
+          ComplianceEngine::Data#update
+```
+
+- `EnvironmentLoader` scans a Puppet modulepath for module directories
+- `EnvironmentLoader::Zip` handles zip-archived environments
+- `ModuleLoader` reads a module's `metadata.json` and discovers compliance data files
+- `DataLoader` (and its subclasses) read and parse individual files; they use the Observable pattern to push updates to `Data`
+
+### Puppet Hiera Backend
+
+`lib/puppet/functions/compliance_engine/enforcement.rb` implements the Hiera `lookup_key` function. It:
+- Resolves profiles from `compliance_engine::enforcement` and optionally `compliance_markup::enforcement` Hiera keys
+- Creates and caches a `ComplianceEngine::Data` object on the Puppet lookup context
+- Calls `data.hiera(profiles)` and bulk-caches results for subsequent lookups
+- Supports `compliance_markup` backwards compatibility via `compliance_markup_compatibility` option
+
+### Confinement and Enforcement Tolerance
+
+`Component#fragments` filters source fragments based on:
+- **Fact confinement** (`confine` key): dot-notation Puppet facts (e.g. `os.release.major`). Values may be a string (exact match), a string prefixed with `!` (negation), or an array (any match). Implemented in `Component#fact_match?`. Fact confinement is skipped when `facts` is `nil`.
+- **Module confinement** (`confine.module_name` + `confine.module_version`): checks against `environment_data` (a `{module_name => version}` hash) using semantic versioning. Module confinement only runs when `environment_data` is set (e.g. by `ComplianceEngine::Data#open`).
+- **Remediation risk** (`remediation.risk`): when `enforcement_tolerance` is a positive `Integer`, drops fragments where risk level ≥ `enforcement_tolerance` and drops disabled remediations. Only applies to `Check` components.
+
+In practice, only fact confinement is bypassed when `facts` is `nil`; module confinement still applies whenever `environment_data` is available. All confinement and risk/disabled-remediation filtering are effectively bypassed only when both `facts` and `environment_data` are unset and `enforcement_tolerance` is not a positive `Integer` (every fragment is then included). This is useful for offline analysis where system context and enforcement settings are unavailable.
+
+### Code Style
+
+Rubocop is configured via `.rubocop.yml` inheriting from `voxpupuli-test`. Key style choices:
+- `compact` class/module nesting style (e.g. `class ComplianceEngine::Data` not nested modules)
+- Trailing commas on multiline args/arrays
+- Leading dot position for method chaining
+- `braces_for_chaining` block delimiters
+- Max line length: 200
diff --git a/README.md b/README.md
index f79dcb7..70c1f6f 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,54 @@ Options:
 
 See the [`ComplianceEngine::Data`](https://rubydoc.info/gems/compliance_engine/ComplianceEngine/Data) class for details.
 
+## Concepts
+
+### Data Model
+
+Compliance data is expressed across four entity types that live in YAML/JSON files inside Puppet modules (`/SIMP/compliance_profiles/*.yaml`):
+
+| Entity | Key | Purpose |
+|--------|-----|---------|
+| **Profile** | `profiles` | A named compliance standard (e.g. `nist_800_53_rev4`). References CEs, checks, and/or controls that together constitute that standard. |
+| **CE** (Compliance Element) | `ce` | A single, named compliance capability (e.g. "enable audit logging"). Bridges profiles to checks via a shared vocabulary. |
+| **Check** | `checks` | A verifiable assertion about a system setting. Checks of `type: puppet-class-parameter` carry a `parameter` and `value` that become Hiera data. |
+| **Control** | `controls` | A cross-reference label from an external framework (e.g. `nist_800_53:rev4:AU-2`). Profiles and checks both annotate themselves with controls to express alignment. |
+
+### From Profiles to Hiera Data
+
+The central operation of the library is `Data#hiera(profiles)`, which converts a list of profile names into a flat hash of Puppet class parameters and their enforced values:
+
+```
+profile names
+    ↓ check_mapping: find all checks that belong to each profile
+checks (type: puppet-class-parameter only)
+    ↓ Check#hiera: extract { 'class::param' => value }
+deep-merged hash → { 'widget_spinner::audit_logging' => true, ... }
+```
+
+**How check_mapping works** — a check is considered part of a profile if any of the following are true:
+
+1. The check and profile share a **control** label (`nist_800_53:rev4:AU-2`).
+2. The check and profile share a **CE** reference.
+3. The check's CE and the profile share a **control** label.
+4. The profile explicitly lists the check by key under its `checks:` map.
+
+This layered matching lets compliance authors express mappings at different levels of abstraction and have the engine resolve them automatically.
+
+### Confinement
+
+A component (profile, CE, check, or control) may be defined across multiple source files. Each file contributes a **fragment**. Before fragments are merged, they are filtered by:
+
+- **Facts** (`confine:` key): dot-notation Puppet facts, optionally negated with a `!` prefix. A fragment is dropped if its confinement does not match the current system's facts.
+- **Module presence/version** (`confine.module_name` / `confine.module_version`): fragment is dropped if the required module is absent or the wrong version.
+- **Remediation risk/status** (`remediation.risk` / `remediation.disabled`): when `enforcement_tolerance` is set to a positive Integer, a fragment is dropped if remediation is explicitly `disabled` or if its risk level is ≥ `enforcement_tolerance`.
+
+If `facts` is `nil`, fact confinement is skipped; module confinement still applies whenever `environment_data` is set, and fragments remain subject to remediation-based filtering when `enforcement_tolerance` is a positive Integer.
+
+### Enforcement Tolerance
+
+`enforcement_tolerance` is an optional integer threshold that controls how cautiously the engine applies remediations. When it is set to a positive Integer, fragments whose `remediation.risk.level` meets or exceeds the threshold, or whose remediation is explicitly `disabled`, are silently excluded from the merged result, allowing operators to tune aggressiveness (e.g. apply only low-risk remediations in production, all remediations in a test environment). When `enforcement_tolerance` is `nil` or not a positive Integer, no remediation-based filtering occurs and `remediation.risk` / `remediation.disabled` do not affect fragment inclusion.
+
 ## Using as a Puppet Module
 
 The Compliance Engine can be used as a Puppet module to provide a Hiera backend for compliance data. This allows you to enforce compliance profiles through Hiera lookups within your Puppet manifests.
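
The four matching rules listed under "How check_mapping works" in the README hunk can be sketched in plain Ruby. This is an illustrative sketch under stated assumptions, not the gem's implementation: plain hashes stand in for the library's objects, and the helpers `shares_true_key?`, `mapped?`, and `hiera_for`, along with the sample keys `audit_check`, `ce_audit`, `unrelated_check`, and `widget_spinner::other`, are hypothetical names introduced for the example. Only the rule descriptions, the `settings['parameter']` / `settings['value']` shape, and the `nist_800_53:rev4:AU-2` and `widget_spinner::audit_logging` labels come from the patch.

```ruby
# Illustrative sketch only: plain hashes stand in for the gem's Profile,
# Ce, and Check objects, and every helper below is hypothetical.

# True when both hashes contain the same key with a value of `true`.
def shares_true_key?(left, right)
  (left || {}).any? { |key, value| value == true && (right || {})[key] == true }
end

# Applies the four rules in order: shared control, shared CE,
# CE-to-control overlap, and direct reference in the profile's `checks` map.
def mapped?(profile, check_key, check, ces)
  return true if shares_true_key?(check['controls'], profile['controls'])
  return true if shares_true_key?(check['ces'], profile['ces'])

  ce_overlap = (check['ces'] || {}).any? do |ce_key, enabled|
    enabled == true && shares_true_key?(ces.dig(ce_key, 'controls'), profile['controls'])
  end
  return true if ce_overlap

  (profile['checks'] || {})[check_key] ? true : false
end

# Collects { parameter => value } from mapped puppet-class-parameter checks.
def hiera_for(profile, checks, ces)
  checks.each_with_object({}) do |(key, check), result|
    next unless check['type'] == 'puppet-class-parameter'
    next unless mapped?(profile, key, check, ces)

    result[check['settings']['parameter']] = check['settings']['value']
  end
end

PROFILE = { 'controls' => { 'nist_800_53:rev4:AU-2' => true } }.freeze
CES = { 'ce_audit' => { 'controls' => { 'nist_800_53:rev4:AU-2' => true } } }.freeze
CHECKS = {
  'audit_check' => {
    'type' => 'puppet-class-parameter',
    'settings' => { 'parameter' => 'widget_spinner::audit_logging', 'value' => true },
    'ces' => { 'ce_audit' => true },
  },
  'unrelated_check' => {
    'type' => 'puppet-class-parameter',
    'settings' => { 'parameter' => 'widget_spinner::other', 'value' => false },
  },
}.freeze

HIERA = hiera_for(PROFILE, CHECKS, CES)
puts HIERA.inspect
```

In this sample, `audit_check` is picked up through the third rule (its CE carries a control that the profile also lists), while `unrelated_check` matches none of the rules and contributes nothing to the resulting hash.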