From d9e868de5a567380a17c62c07f6c2d7b3b0ad318 Mon Sep 17 00:00:00 2001
From: Byron Pullutasig <115118857+bpulluta@users.noreply.github.com>
Date: Tue, 17 Mar 2026 12:11:12 -0600
Subject: [PATCH 1/7] Add COMPASS workflow skills

---
 .github/skills/extraction-run/SKILL.md  | 253 ++++++++++++++++++++++++
 .github/skills/schema-creation/SKILL.md | 173 ++++++++++++++++
 .github/skills/web-scraper/SKILL.md     | 134 +++++++++++++
 .github/skills/yaml-setup/SKILL.md      | 199 +++++++++++++++++++
 4 files changed, 759 insertions(+)
 create mode 100644 .github/skills/extraction-run/SKILL.md
 create mode 100644 .github/skills/schema-creation/SKILL.md
 create mode 100644 .github/skills/web-scraper/SKILL.md
 create mode 100644 .github/skills/yaml-setup/SKILL.md
diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
new file mode 100644
index 00000000..23321b46
--- /dev/null
+++ b/.github/skills/extraction-run/SKILL.md
@@ -0,0 +1,253 @@
+---
+name: extraction-run
+description: Execute one-shot extraction with COMPASS, evaluate outputs, and iterate schema/config changes with minimal cost.
+---
+
+# Extraction Run Skill
+
+Use this skill to run one-shot extraction in a repeatable, low-risk way,
+then iterate quickly until you have stable structured outputs.
+
+## When to use
+
+- Schema exists and plugin config points to it.
+- You are onboarding a new technology (for example geothermal, CHP, hydrogen).
+- You need a reliable smoke-test workflow before scaling.
+
+## Two-pipeline modes
+
+COMPASS supports two distinct extraction pipelines. Choose one and do not mix
+them for the same technology:
+
+| Mode | Where code lives | Good for |
+|---|---|---|
+| **One-shot (schema-based)** | `examples/` → `compass/extraction/<tech>/` | New techs, no Python changes |
+| **Legacy decision-tree** | Python code in `compass/extraction/<tech>/` | Existing solar, wind, small wind |
+
+One-shot is the correct path for all new technology onboarding. It requires
+only a schema JSON, a plugin YAML, and a run config — no Python source changes.
+
+## Tech promotion lifecycle
+
+New technology assets start in `examples/` and finish in `compass/extraction/`:
+
+1. **Develop** — place all assets in `examples/one_shot_schema_extraction_<tech>/`
+2. **Stabilize** — iterate schema/plugin until smoke and robustness gates pass
+3. **Promote** — copy the three finalized files into `compass/extraction/<tech>/`:
+   - `<tech>_schema.json`
+   - `<tech>_plugin_config.yaml`
+   - `<tech>_config.json5` (optional; useful as a reference run config)
+
+The promoted extraction folder contains only config files — no Python code is
+needed for one-shot techs.
+
+## Required inputs
+
+- Run config for `compass process`.
+- Plugin config containing `schema`.
+- API keys in environment (never hardcode in configs).
+- A jurisdiction set sized to the current phase.
+
+## Naming convention
+
+Use tech-first names for all one-shot assets:
+
+- `<tech>_config*.json5`
+- `<tech>_plugin_config.yaml`
+- `<tech>_schema.json`
+- `<tech>_jurisdictions*.csv`
+
+The `tech` value in the run config must be a string that becomes the plugin
+registry identifier. It must be unique, lowercase, and underscore-separated
+(for example `concentrating_solar`, `geothermal_electricity`). COMPASS will
+raise `Unknown tech input` if this key does not match any registered plugin.
+
+## Canonical development pattern
+
+For early development, start with live (dynamic) search as the baseline, then fall
+back to deterministic mode only when search infrastructure is unstable:
+
+1. Use one small jurisdiction file (1-3 rows).
+2. Use your preferred configured search engine.
+3. Load `.env` into shell (`set -a && source .env && set +a`).
+4. Run with verbose logs:
+	 - `pixi run compass process -c config.json5 -p plugin.yaml -v`
+5. Confirm output artifacts exist before tuning schema semantics.
+
+Fallback mode when needed:
+
+- Add `known_doc_urls` (or `known_local_docs`) in run config.
+- Set `perform_se_search: false` and `perform_website_search: false`.
+
+## Adaptation rule
+
+When adapting this workflow for a new technology, keep the run structure
+unchanged and swap only technology-specific inputs:
+
+- `tech` in run config,
+- schema file,
+- plugin descriptor (`data_type_short_desc`),
+- retrieval query/keyword vocabulary,
+- known document URL set.
+
+Change one axis per run unless debugging infrastructure failures.
+
+## Example references (optional)
+
+- `examples/one_shot_schema_extraction/README.rst`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_config.json5`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_schema.json`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_jurisdictions_one.csv`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
+- `examples/one_shot_schema_extraction_cst/cst_config.json5` (CST reference)
+- `examples/one_shot_schema_extraction_cst/cst_plugin_config.yaml` (CST reference)
+- `examples/one_shot_schema_extraction_cst/cst_schema.json` (CST reference)
+- `compass/extraction/geothermal_electricity/` (finalized one-shot tech example)
+- `docs/source/examples/one_shot_schema_extraction/plugin_config_minimal.json`
+- `docs/source/examples/one_shot_schema_extraction/plugin_config.yaml`
+- `examples/compass_tech_pipeline/README.md`
+
+## Environment setup reminder
+
+Before running, load secrets from `.env` (for example `SERPAPI_KEY`,
+`AZURE_OPENAI_API_KEY`) into the current shell. Do not commit secret values
+inside config files.
+
+Common `.env` gotcha: avoid spaces around `=` in variable assignments.
+
+## Core command
+
+```bash
+pixi run compass process -c config.json5 -p path/to/plugin_config.yaml -v
+```
+
+## Phase-gated workflow
+
+1. **Smoke test (3 jurisdictions)**
+	 - Goal: verify wiring and output contract.
+2. **Robustness (10-25 jurisdictions)**
+	 - Goal: verify feature stability and edge-case handling.
+3. **Scale (full set)**
+	 - Goal: only after earlier phases pass acceptance gates.
+
+## Validation checklist
+
+Evaluate each run on:
+
+- document relevance (exclude off-domain content),
+- feature coverage vs expected ordinance topics,
+- section/summary traceability,
+- unit consistency,
+- null discipline,
+- **scope bleed** — check that no features appear in the output CSVs that
+  fall outside the schema enum; generic land-use-code documents can cause
+  unrelated provisions to leak through. Tighten `extraction_system_prompt`
+  in plugin YAML to fix this.
+
+## Expected output artifacts
+
+A successful run produces these files under `out_dir`:
+
+| Artifact | Meaning |
+|---|---|
+| `ordinance_files/*.pdf` | Downloaded source documents |
+| `cleaned_text/*.txt` | Heuristic-filtered extracted text |
+| `jurisdiction_dbs/*.csv` | Per-jurisdiction raw extraction rows |
+| `quantitative_ordinances.csv` | Final compiled numeric features |
+| `qualitative_ordinances.csv` | Final compiled qualitative features |
+| `usage.json` | Per-jurisdiction LLM token and request counts |
+| `meta.json` | Run metadata (cost, timing, version) |
+
+Final CSV columns: `county`, `state`, `subdivision`, `jurisdiction_type`,
+`FIPS`, `feature`, `value`, `units`, `adder`, `min_dist`, `max_dist`,
+`summary`, `year`, `section`, `source`.
+
+## Interpreting output status correctly
+
+`cleaned_text` files can exist while `Number of documents found` is `0`.
+
+This means acquisition/text collection worked, but no final structured ordinance
+rows were emitted into consolidated DB outputs.
+
+Check in order:
+
+1. `outputs/*/cleaned_text/*.txt` (text extraction present)
+2. `outputs/*/jurisdiction_dbs/*.csv` (per-jurisdiction parsed rows)
+3. `outputs/*/quantitative_ordinances.csv` and
+	 `outputs/*/qualitative_ordinances.csv` (final compiled results)
+
+## Root-cause triage
+
+- **Wrong or noisy documents**
+	- Tune query templates, URL keywords, and exclusions.
+	- Prefer `known_doc_urls` while stabilizing.
+- **Right documents, wrong fields**
+	- Tune schema descriptions/examples and ambiguity rules.
+	- Check `extraction_system_prompt` in plugin YAML — it is the primary
+	  guard against scope bleed from generic legal documents.
+- **Correct values, unstable formatting**
+	- Tighten enums, unit vocabulary, and null behavior.
+- **Nothing downloaded / unstable search**
+	- Disable live search and use deterministic known URLs/local docs.
+- **0 documents found for a jurisdiction during website crawl**
+	- Expected for jurisdictions with few online ordinances. The website
+	  crawl is a second acquisition pass after search-engine retrieval;
+	  0 results there is not a pipeline failure.
+
+## Acceptance gates
+
+Do not advance phases until all are true:
+
+- Output rows conform to required contract.
+- High share of rows include useful `section` and `summary`.
+- Feature names are stable and machine-consistent.
+- Repeated runs on same sample show minimal drift.
+
+## Cost and speed controls
+
+- Keep sample size minimal while tuning.
+- Change one variable per run.
+- Archive run command, input set, and output path for each iteration.
+
+## Workspace hygiene (important)
+
+Keep one canonical working set per technology in `examples/`:
+
+- one run config,
+- one plugin config,
+- one schema,
+- one jurisdiction file,
+- one known docs file.
+
+Delete stale `_migrated`, `_smoke`, and duplicate output folders to avoid
+configuration drift and debugging confusion.
+
+## Known infrastructure issues
+
+### Playwright timeouts
+
+Web search via `rebrowser_playwright` may fail with 60s timeouts on
+`Page.wait_for_selector`. Symptoms:
+- `TimeoutError: Page.wait_for_selector: Timeout 60000ms exceeded`
+- All search queries fail consistently
+- Browser session crashes with `ProtocolError: Internal server error, session closed`
+
+These errors during the **website crawl phase** (second acquisition pass) are
+**non-fatal**. COMPASS logs them and continues. They do not block the
+search-engine phase or extraction.
+
+If search itself is failing, verify provider credentials are loaded and fall
+back to deterministic mode.
+
+**Workaround**: Use `known_local_docs` or `known_doc_urls` and disable
+search/website steps while validating extraction logic.
+
+### known_local_docs loading failures
+
+`known_local_docs` may fail without aborting the run; the only symptom is an
+`ERROR: Failed to read file` entry in jurisdiction logs (external loader behavior).
+
+**Workaround**: Prefer `known_doc_urls` for deterministic smoke tests and
+pre-validate local docs before pipeline runs.
+
diff --git a/.github/skills/schema-creation/SKILL.md b/.github/skills/schema-creation/SKILL.md
new file mode 100644
index 00000000..951593a3
--- /dev/null
+++ b/.github/skills/schema-creation/SKILL.md
@@ -0,0 +1,173 @@
+---
+name: schema-creation
+description: Author and iterate one-shot extraction schemas that replace legacy decision-tree extraction logic in native COMPASS.
+---
+
+# Schema Creation Skill
+
+Use this skill to encode extraction logic in schema so behavior is repeatable
+across jurisdictions and technologies.
+
+## When to use
+
+- Creating a new one-shot technology plugin.
+- Migrating from decision-tree logic to schema-driven extraction.
+- Stabilizing inconsistent model outputs.
+
+## Example references (optional)
+
+- `examples/one_shot_schema_extraction_geothermal/geothermal_schema.json`
+- `examples/one_shot_schema_extraction_geothermal/README.rst`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
+- `docs/source/examples/one_shot_schema_extraction/wind_schema.json`
+
+## Required output contract
+
+Top-level object must define `outputs` and each item must require:
+
+- `feature`
+- `value`
+- `units`
+- `section`
+- `summary`
+
+```json
+{
+  "type": "object",
+  "required": ["outputs"],
+  "properties": {
+    "outputs": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "required": ["feature", "value", "units", "section", "summary"],
+        "additionalProperties": false
+      }
+    }
+  }
+}
+```
+
+## Build sequence
+
+1. Copy baseline schema and rename for target tech.
+2. Replace `feature` enum with target-tech IDs.
+3. Define `value`/`units` rules per feature family.
+4. Add `$definitions` for reusable decision logic.
+5. Add `$examples` for top failure modes.
+6. Add `$instructions` for global extraction policy.
+
+For new technologies (for example CHP or CST), clone a working schema and
+perform a strict vocabulary swap (features, units, exclusions) before adding
+new logic.
+
+## Output column mapping
+
+Schema field names map directly to the final output CSV columns:
+
+| Schema field | CSV column |
+|---|---|
+| `feature` | `feature` |
+| `value` | `value` |
+| `units` | `units` |
+| `section` | `section` |
+| `summary` | `summary` |
+
+Additional columns added by COMPASS finalization: `county`, `state`,
+`subdivision`, `jurisdiction_type`, `FIPS`, `adder`, `min_dist`, `max_dist`,
+`year`, `source`. These do not need to appear in the schema.
+
+## Scope bleed from generic legal documents
+
+When COMPASS retrieves a large generic land-use code rather than a
+technology-specific ordinance, the LLM may extract provisions that are
+outside the schema enum. This is most visible when unfamiliar feature names
+appear in the output CSV.
+
+Primary controls:
+- `extraction_system_prompt` in plugin YAML — this is the strongest signal.
+  State explicitly what is in scope and what is out.
+- `$instructions.scope` in schema — reinforce exclusion language here.
+- `heuristic_keywords.not_tech_words` — filter documents upstream.
+
+Do not widen the feature enum to accommodate scope bleed; narrow the prompt
+and upstream filters instead.
+
+## Technology adaptation guidance
+
+When adapting a baseline schema to any new technology:
+
+- Separate core utility-scale requirements from adjacent/non-target systems.
+- Keep district/permit features distinct from numerical constraints.
+- Encode jurisdiction/governance handling where relevant in summaries.
+- Require explicit nulls when a feature is not enacted.
+
+## Cross-technology adaptation checklist
+
+Apply this for any new domain:
+
+1. Define technology-specific `feature` enum with stable IDs.
+2. Define allowed unit vocabulary for each feature family.
+3. Add explicit exclusion language for adjacent-but-out-of-scope systems.
+4. Ensure summaries preserve legal traceability (section + source-faithful text).
+5. Validate on deterministic docs before tuning retrieval.
+6. Consider including `enactment date` in the enum — COMPASS naturally surfaces it
+   from documents and it provides important temporal context in outputs.
+
+## Example specialization patterns (optional)
+
+Use examples only to shape exclusion strategy:
+
+- separate core utility-scale requirements from adjacent technologies,
+- add explicit exclusion terms in `not_tech_words`,
+- preserve legal traceability via `section` and `summary`.
+
+## Reuse safeguards
+
+- Keep tech-first file names consistent across assets:
+  `<tech>_config*.json5`, `<tech>_plugin_config.yaml`,
+  `<tech>_schema.json`, `<tech>_jurisdictions*.csv`.
+- Keep credentials out of schema content and examples.
+- Validate schema behavior with a small smoke run before scaling.
+
+## High-value authoring patterns
+
+- Put restrictive-value selection rules directly in descriptions.
+- Explicitly define accepted unit vocabulary.
+- Clarify near-miss terms that should not be treated as equivalent.
+- State whether qualitative features should keep `value`/`units` null.
+
+## Anti-patterns
+
+- Retrieval instructions embedded in schema semantics.
+- Feature IDs that change names across iterations.
+- Implicit unit assumptions not declared in text.
+- Examples that contradict field descriptions.
+- Feature enums that include placeholders with no extraction logic.
+
+## Quality checklist
+
+- `feature` enum matches the feature values expected in the output CSVs.
+- Every feature has deterministic extraction rules.
+- `section` and `summary` preserve legal traceability.
+- Repeated sample runs produce stable feature typing.
+
+## Iteration loop
+
+1. Run 3-jurisdiction smoke sample.
+2. Catalog failure modes by feature.
+3. Patch only affected descriptions/examples.
+4. Re-run same sample before expanding scope.
+
+Save iterated schema versions as `<tech>_schemav2.json`, `<tech>_schemav3.json`
+etc. to preserve a diff history. The active version is what `schema:` in the
+plugin YAML points to.
+
+## Practical quality signal
+
+Treat a schema as "working" when all are true on the smoke sample:
+
+- final ordinance CSV outputs are non-empty,
+- extracted rows include stable feature IDs,
+- most non-null rows have useful `section` and `summary`,
+- repeated runs do not shift feature semantics materially.
diff --git a/.github/skills/web-scraper/SKILL.md b/.github/skills/web-scraper/SKILL.md
new file mode 100644
index 00000000..f021bdb8
--- /dev/null
+++ b/.github/skills/web-scraper/SKILL.md
@@ -0,0 +1,134 @@
+---
+name: web-scraper
+description: Build and tune one-shot plugin configs that search, rank, and collect ordinance documents with native COMPASS pipeline settings.
+---
+
+# Web Scraper Skill
+
+Use this skill to improve retrieval precision/recall before extraction tuning.
+
+## When to use
+
+- Download step returns noisy sources.
+- Ordinance recall is weak across jurisdictions.
+- LLM filtering is compensating for poor search quality.
+
+## Scope
+
+- Query-template strategy.
+- URL ranking and filtering patterns.
+- Heuristic phrase controls before LLM validation.
+
+## Example references (optional)
+
+- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_config.json5`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_jurisdictions_one.csv`
+- `examples/one_shot_schema_extraction_cst/cst_plugin_config.yaml`
+- `examples/compass_tech_pipeline/README.md`
+
+## Two retrieval phases
+
+COMPASS runs two sequential acquisition passes per jurisdiction:
+
+1. **Search-engine phase** — queries `SerpAPIGoogleSearch` (or configured
+   engine) using `query_templates`. This phase is the primary source of
+   ordinance documents.
+2. **Website crawl phase** — crawls the jurisdiction's official website,
+   ranking pages using `website_keywords`. This phase is a secondary pass
+   and runs even if the SE phase found documents.
+
+Key behaviors:
+- Playwright browser errors during the website crawl phase are **non-fatal**.
+  COMPASS logs the error and continues.
+- `Found 0 potential documents` at the end of the crawl phase is **expected**
+  for jurisdictions without relevant online ordinances.
+- Disable the crawl phase with `perform_website_search: false` in run config
+  when you want faster smoke tests or Playwright is unavailable.
+
+## Key management
+
+For SerpAPI-backed search, keep `api_key` out of committed config and provide
+`SERPAPI_KEY` via environment (for example through `.env` loaded in shell).
+
+Recommended shell setup:
+
+```bash
+set -a
+source .env
+set +a
+```
+
+Avoid spaces around `=` in `.env` assignments.
+
+## Retrieval design pattern
+
+1. Create 3-7 query templates that contain the `{jurisdiction}` placeholder.
+2. Weight legal document indicators in URL keywords.
+3. Apply exclusions for templates/reports/slides.
+4. Add focused negative tech terms to reduce false positives.
+5. Start with dynamic search, then switch to deterministic known URLs when
+  search infrastructure is unstable.
+
+If live search is unreliable on the first pass, validate the rest of the
+pipeline with deterministic known URLs, then return to live web search.
+
+## Technology-specific retrieval controls (template)
+
+- Include target-technology facility/deployment terms.
+- Exclude adjacent and non-target terms (residential, HVAC, PV, etc., as needed).
+- Favor jurisdictional legal-code signals like `land use code`,
+  `code of ordinances`, `use table`, and `special use permit`.
+
+## Deterministic smoke-test mode
+
+Use run-config controls to bypass flaky search while tuning:
+
+- supply `known_doc_urls` or `known_local_docs`,
+- set `perform_se_search: false`,
+- set `perform_website_search: false`.
+
+Then validate:
+
+- download artifacts exist,
+- cleaned text exists,
+- ordinance DB rows are non-empty.
+
+## Tuning loop
+
+1. Run SE-search phase on small sample.
+2. Inspect kept vs discarded PDFs (`ordinance_files/`).
+3. Run heuristic filter and review false rejects/accepts (`cleaned_text/`).
+4. Check website crawl phase independently if needed (enable, run, inspect logs).
+5. Update one axis only:
+	- query templates (affects SE phase),
+	- URL weights (affects both phases),
+	- include/exclude heuristic patterns (pre-LLM filter),
+	- `not_tech_words` (upstream document rejection).
+6. Re-run same sample and compare.
+
+## Cross-tech onboarding
+
+When reusing this workflow for any technology:
+
+- keep legal retrieval tokens (`ordinance`, `zoning`, `code`),
+- replace all technology terms in `query_templates`, `website_keywords`,
+  and `heuristic_keywords`,
+- seed `known_doc_urls` with authoritative regulatory documents for smoke
+  testing,
+- avoid copying negatives from previous technologies into the new tech config,
+- verify `not_tech_words` excludes adjacent technologies for your domain.
+
+## Phase gates
+
+- **3 jurisdictions**: ensure major source classes are found.
+- **10-25 jurisdictions**: verify stability across regions.
+- **Full scale**: only once false positive/negative rates stabilize.
+
+## Guardrails
+
+- Keep feature extraction logic out of retrieval config.
+- Do not overfit to one county's document style.
+- Preserve auditable rationale for each retrieval change.
+- Keep one canonical retrieval config per active technology.
+- Ensure each run uses a unique `out_dir` to avoid COMPASS aborting early.
diff --git a/.github/skills/yaml-setup/SKILL.md b/.github/skills/yaml-setup/SKILL.md
new file mode 100644
index 00000000..a9f93d17
--- /dev/null
+++ b/.github/skills/yaml-setup/SKILL.md
@@ -0,0 +1,199 @@
+---
+name: yaml-setup
+description: Author and tune one-shot plugin YAML configs for COMPASS-native document discovery, filtering, and text collection.
+---
+
+# YAML Setup Skill
+
+Use this skill to create or tune one-shot plugin YAML that controls retrieval,
+filtering, and text collection behavior.
+
+## When to use
+
+- New technology onboarding in one-shot extraction.
+- Schema exists but source relevance is weak.
+- You need reproducible config handoff across teams.
+
+## Example references (optional)
+
+- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
+- `examples/one_shot_schema_extraction_geothermal/README.rst`
+- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
+- `docs/source/examples/one_shot_schema_extraction/plugin_config_minimal.json`
+- `docs/source/examples/one_shot_schema_extraction/plugin_config_simple.json5`
+- `docs/source/examples/one_shot_schema_extraction/plugin_config.yaml`
+
+## Naming convention
+
+Use tech-first file names when creating new one-shot assets:
+`<tech>_config*.json5`, `<tech>_plugin_config.yaml`,
+`<tech>_schema.json`, `<tech>_jurisdictions*.csv`.
+
+## Secret handling
+
+Keep API keys in environment variables (for example `SERPAPI_KEY`,
+`AZURE_OPENAI_API_KEY`) rather than in plugin or run config files.
+Load them per shell session with `set -a && source .env && set +a`.
+Avoid spaces around `=` in `.env` assignments.
+
+## Required minimum
+
+```yaml
+schema: ./my_schema.json
+```
+
+## Key plugin YAML fields
+
+| Field | Type | Behavior |
+|---|---|---|
+| `schema` | string (path) | **Required.** Path to JSON schema file, relative to plugin YAML location. |
+| `data_type_short_desc` | string | Short description used in LLM prompts (e.g. `utility-scale <tech> ordinance`). |
+| `query_templates` | list | Search query templates; `{jurisdiction}` is replaced at runtime. |
+| `website_keywords` | dict | Keyword → score map for URL ranking during website crawl. |
+| `heuristic_keywords` | dict or `true` | Pre-LLM text filter. If `true`, LLM generates lists from schema. |
+| `collection_prompts` | list or `true` | Text collection prompt(s). If **`true`**, LLM auto-generates from schema. |
+| `text_extraction_prompts` | list or `true` | Text consolidation prompt(s). If **`true`**, LLM auto-generates from schema. |
+| `extraction_system_prompt` | string | Overrides default LLM system prompt for the extraction step. Use this to scope extraction tightly to the target technology. |
+| `cache_llm_generated_content` | bool | Cache LLM-generated `query_templates` and `website_keywords`. Set to `false` when iterating schema to see live changes. |
+
+### `collection_prompts: true` and `text_extraction_prompts: true`
+
+Setting either flag to `true` (not a list) instructs COMPASS to use the LLM
+to auto-generate the prompts from the schema content. This is the recommended
+shortcut during development — do not write manual prompt lists until
+auto-generated ones prove insufficient.
+
+### `extraction_system_prompt`
+
+This is the primary control for preventing scope bleed from generic land-use
+code documents. Write it as a multi-line YAML literal block:
+
+```yaml
+extraction_system_prompt: |-
+  You are a legal scholar extracting structured data from
+  utility-scale <tech> ordinances.
+
+  Extract only enacted requirements for utility-scale <tech> facilities.
+  Exclude adjacent technologies and non-target use cases.
+  Prefer explicit values. Use null for qualitative obligations.
+```
+
+See `compass/extraction/geothermal_electricity/geothermal_plugin_config.yaml`
+for a complete example.
+
+## Progressive config path
+
+1. **Minimal**
+   - Confirm schema path and extraction invocation work.
+2. **Simple**
+   - Add `query_templates`, `heuristic_keywords`, and `cache_llm_generated_content`.
+   - Set `collection_prompts: true` and `text_extraction_prompts: true` to
+     let the LLM auto-generate prompts from the schema.
+3. **Full**
+   - Add `extraction_system_prompt` if scope bleed or off-domain extraction
+     is observed.
+   - Replace `heuristic_keywords: true` with an explicit list if precision
+     is insufficient.
+
+Use the same progression for any technology.
+
+## Baseline YAML pattern
+
+```yaml
+schema: ./my_schema.json
+data_type_short_desc: utility-scale <tech> ordinance
+cache_llm_generated_content: true
+query_templates:
+  - "filetype:pdf {jurisdiction} <tech> ordinance"
+  - "{jurisdiction} <tech> zoning ordinance"
+  - "{jurisdiction} <tech> permitting requirements"
+website_keywords:
+  pdf: 92160
+  <tech>: 46080
+  ordinance: 23040
+  zoning: 2880
+  permit: 1440
+heuristic_keywords:
+  good_tech_keywords:
+    - "<tech keyword 1>"
+    - "<tech keyword 2>"
+  good_tech_acronyms:
+    - "<tech acronym>"
+  good_tech_phrases:
+    - "<tech phrase 1>"
+    - "<tech phrase 2>"
+  not_tech_words:
+    - "<adjacent technology term 1>"
+    - "<adjacent technology term 2>"
+collection_prompts: true
+text_extraction_prompts: true
+extraction_system_prompt: |-
+  You are a legal scholar extracting structured data from
+  utility-scale <tech> ordinances.
+
+  Extract only requirements for utility-scale <tech> facilities.
+  Exclude adjacent technologies and non-target use cases.
+```
+
+Swap vocabulary for any technology while keeping the same structure.
+
+## Stable development mode
+
+Plugin YAML controls retrieval behavior, but deterministic acquisition for
+smoke tests belongs in run config:
+
+- `known_doc_urls` or `known_local_docs`
+- `perform_se_search: false`
+- `perform_website_search: false` (disables the website crawl second phase)
+
+Use this mode while search is unstable, then re-enable search once schema
+extraction quality is stable.
+
+Recommended baseline: use dynamic search first, then use deterministic mode
+if search infrastructure fails.
+
+## Acquisition phases
+
+COMPASS acquisition runs in two sequential phases per jurisdiction:
+
+1. **Search-engine phase** — uses `SerpAPIGoogleSearch` or similar; driven by
+   `query_templates`.
+2. **Website crawl phase** — crawls the jurisdiction's main website using
+   `website_keywords` for ranking. Playwright browser errors during this
+   phase are **non-fatal**; COMPASS logs them and moves on.
+
+`perform_website_search: false` skips phase 2. Use it during smoke tests to
+keep run time short and avoid Playwright dependency issues.
+
+## Validation checklist
+
+- Schema path resolves relative to the plugin YAML location.
+- Query templates include `{jurisdiction}` consistently.
+- URL weights favor legal and government documents.
+- Heuristic exclusions are precise and not over-broad.
+- Prompt overrides are only added when default behavior fails.
+
+## Cross-tech adaptation checklist
+
+When adapting to another technology:
+
+- replace vocabulary in `query_templates` and `website_keywords`,
+- keep legal-code terms (`ordinance`, `zoning`, `code of ordinances`),
+- keep non-target exclusions explicit in `not_tech_words`,
+- do not carry terms from a previous technology into new tech configs,
+- write a technology-specific `extraction_system_prompt`.
+
+## Run command
+
+```bash
+pixi run compass process -c config.json5 -p path/to/plugin_config.yaml -v
+```
+
+If running outside the tech folder, use absolute paths for `-c` and `-p`.
+
+## Guardrails
+
+- Retrieval behavior belongs in plugin YAML.
+- Feature logic belongs in schema.
+- Adjust one tuning axis per run for clean attribution.
+- Keep one canonical plugin file per technology in the active example folder.

From 54b8d290a082bbf18310c84efa3c97df61cadc18 Mon Sep 17 00:00:00 2001
From: Byron Pullutasig <115118857+bpulluta@users.noreply.github.com>
Date: Tue, 17 Mar 2026 13:13:14 -0600
Subject: [PATCH 2/7] Added one-shot skills

---
 .github/skills/extraction-run/SKILL.md  |  57 +++---
 .github/skills/schema-creation/SKILL.md | 228 ++++++++++++------------
 .github/skills/web-scraper/SKILL.md     |  29 +--
 .github/skills/yaml-setup/SKILL.md      | 107 +++++++----
 4 files changed, 241 insertions(+), 180 deletions(-)

diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
index 23321b46..e77a10cc 100644
--- a/.github/skills/extraction-run/SKILL.md
+++ b/.github/skills/extraction-run/SKILL.md
@@ -5,14 +5,19 @@ description: Execute one-shot extraction with COMPASS, evaluate outputs, and ite
 
 # Extraction Run Skill
 
+**ONE-SHOT EXTRACTION ONLY.** This skill applies only to schema-driven extraction.
+For legacy decision-tree extraction (solar, wind, small wind), consult COMPASS
+architecture docs.
+
 Use this skill to run one-shot extraction in a repeatable, low-risk way,
 then iterate quickly until you have stable structured outputs.
 
 ## When to use
 
 - Schema exists and plugin config points to it.
-- You are onboarding a new technology (for example geothermal, CHP, hydrogen).
+- You are onboarding a new technology (for example diesel generator, geothermal, CHP, hydrogen).
 - You need a reliable smoke-test workflow before scaling.
+- You are NOT using legacy decision-tree extraction.
 
 ## Two-pipeline modes
 
@@ -48,6 +53,16 @@ needed for one-shot techs.
 - API keys in environment (never hardcode in configs).
 - A jurisdiction set sized to the current phase.
 
+## Preflight checks (must pass before run)
+
+- Jurisdiction CSV has headers `County,State`.
+- `out_dir` is unique for this run.
+- At least one acquisition step is enabled:
+	`perform_se_search: true`, `perform_website_search: true`,
+	`known_doc_urls`, or `known_local_docs`.
+- If `heuristic_keywords` exists, all four required lists are present and
+	non-empty.
+
 ## Naming convention
 
 Use tech-first names for all one-shot assets:
@@ -92,29 +107,19 @@ unchanged and swap only technology-specific inputs:
 
 Change one axis per run unless debugging infrastructure failures.
 
-## Example references (optional)
+## Canonical reference
 
-- `examples/one_shot_schema_extraction/README.rst`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_config.json5`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_schema.json`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_jurisdictions_one.csv`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
-- `examples/one_shot_schema_extraction_cst/cst_config.json5` (CST reference)
-- `examples/one_shot_schema_extraction_cst/cst_plugin_config.yaml` (CST reference)
-- `examples/one_shot_schema_extraction_cst/cst_schema.json` (CST reference)
-- `compass/extraction/geothermal_electricity/` (finalized one-shot tech example)
-- `docs/source/examples/one_shot_schema_extraction/plugin_config_minimal.json`
-- `docs/source/examples/one_shot_schema_extraction/plugin_config.yaml`
-- `examples/compass_tech_pipeline/README.md`
+- `examples/one_shot_schema_extraction/` — complete working examples
+- `examples/one_shot_schema_extraction/README.rst` — general one-shot overview
+- `examples/water_rights_demo/one-shot/` — multi-doc extraction example
 
-## Environment setup reminder
+## Environment setup
 
-Before running, load secrets from `.env` (for example `SERPAPI_KEY`,
-`AZURE_OPENAI_API_KEY`) into the current shell. Do not commit secret values
-inside config files.
+Load secrets from `.env` before running. Never commit key values in config files.
 
-Common `.env` gotcha: avoid spaces around `=` in variable assignments.
+```bash
+set -a && source .env && set +a   # no spaces around = in .env assignments
+```
 
 ## Core command
 
@@ -124,9 +129,9 @@ pixi run compass process -c config.json5 -p path/to/plugin_config.yaml -v
 
 ## Phase-gated workflow
 
-1. **Smoke test (3 jurisdictions)**
+1. **Smoke test (1 jurisdiction)**
 	 - Goal: verify wiring and output contract.
-2. **Robustness (10-25 jurisdictions)**
+2. **Robustness (5 jurisdictions)**
 	 - Goal: verify feature stability and edge-case handling.
 3. **Scale (full set)**
 	 - Goal: only after earlier phases pass acceptance gates.
@@ -177,6 +182,14 @@ Check in order:
 3. `outputs/*/quantitative_ordinances.csv` and
 	 `outputs/*/qualitative_ordinances.csv` (final compiled results)
 
+Treat the run as **failed for extraction quality** when either is true:
+- `Number of jurisdictions with extracted data: 0`
+- any configuration exception appears in the logs (even if the process exits with code 0)
+
+Only treat a run as passing when both are true:
+- at least one jurisdiction has extracted data
+- at least one jurisdiction CSV in `jurisdiction_dbs/` has more than just the header row
+
 ## Root-cause triage
 
 - **Wrong or noisy documents**
diff --git a/.github/skills/schema-creation/SKILL.md b/.github/skills/schema-creation/SKILL.md
index 951593a3..805d8dc9 100644
--- a/.github/skills/schema-creation/SKILL.md
+++ b/.github/skills/schema-creation/SKILL.md
@@ -5,169 +5,161 @@ description: Author and iterate one-shot extraction schemas that replace legacy
 
 # Schema Creation Skill
 
-Use this skill to encode extraction logic in schema so behavior is repeatable
-across jurisdictions and technologies.
+**ONE-SHOT EXTRACTION ONLY.** This skill applies only to schema-driven extraction
+(new technology onboarding with JSON schema + plugin YAML). For legacy decision-tree
+extraction (existing solar/wind/small-wind in `compass/extraction/<tech>/`),
+consult COMPASS architecture docs.
+
+Use this skill to define what the LLM extracts and how it formats results.
+The schema is the single most important config file for output quality.
 
 ## When to use
 
-- Creating a new one-shot technology plugin.
-- Migrating from decision-tree logic to schema-driven extraction.
-- Stabilizing inconsistent model outputs.
+- Starting a new one-shot technology extraction (NOT decision-tree legacy extraction).
+- Fixing inconsistent or incorrect extracted values in one-shot extraction.
+- Adding new features to an existing one-shot extraction.
 
-## Example references (optional)
+## Canonical reference
 
-- `examples/one_shot_schema_extraction_geothermal/geothermal_schema.json`
-- `examples/one_shot_schema_extraction_geothermal/README.rst`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
-- `docs/source/examples/one_shot_schema_extraction/wind_schema.json`
+For complete examples, see the `examples/` directory:
+- `examples/one_shot_schema_extraction/wind_schema.json`
+- `examples/water_rights_demo/one-shot/water_rights_schema.json5`
 
-## Required output contract
+Each follows the pattern: `<tech>_schema.json` or `<tech>_schema.json5`.
 
-Top-level object must define `outputs` and each item must require:
+## Required output contract
 
-- `feature`
-- `value`
-- `units`
-- `section`
-- `summary`
+Every schema must define `outputs` as an array. Each item must require exactly five fields
+(`feature`, `value`, `units`, `section`, `summary`) and set `additionalProperties: false`:
 
 ```json
 {
   "type": "object",
   "required": ["outputs"],
+  "additionalProperties": false,
   "properties": {
     "outputs": {
       "type": "array",
       "items": {
         "type": "object",
         "required": ["feature", "value", "units", "section", "summary"],
-        "additionalProperties": false
+        "additionalProperties": false,
+        "properties": {
+          "feature": { "type": "string", "enum": ["..."] },
+          "value":   { "anyOf": [{"type": "number"}, {"type": "string"}, {"type": "boolean"}, {"type": "array", "items": {"type": "string"}}, {"type": "null"}] },
+          "units":   { "type": ["string", "null"] },
+          "section": { "type": ["string", "null"] },
+          "summary": { "type": ["string", "null"] }
+        }
       }
     }
   }
 }
 ```
 
-## Build sequence
-
-1. Copy baseline schema and rename for target tech.
-2. Replace `feature` enum with target-tech IDs.
-3. Define `value`/`units` rules per feature family.
-4. Add `$definitions` for reusable decision logic.
-5. Add `$examples` for top failure modes.
-6. Add `$instructions` for global extraction policy.
-
-For new technologies (for example CHP or CST), clone a working schema and
-perform a strict vocabulary swap (features, units, exclusions) before adding
-new logic.
-
-## Output column mapping
+These five fields map directly to the output CSV columns. COMPASS adds
+`county`, `state`, `FIPS`, and other metadata columns automatically.
 
-Schema field names map directly to the final output CSV columns:
-
-| Schema field | CSV column |
-|---|---|
-| `feature` | `feature` |
-| `value` | `value` |
-| `units` | `units` |
-| `section` | `section` |
-| `summary` | `summary` |
-
-Additional columns added by COMPASS finalization: `county`, `state`,
-`subdivision`, `jurisdiction_type`, `FIPS`, `adder`, `min_dist`, `max_dist`,
-`year`, `source`. These do not need to appear in the schema.
+## Build sequence
 
-## Scope bleed from generic legal documents
+1. **Define the feature enum** — one stable lowercase ID per siting-relevant
+   requirement. Group IDs by family (setbacks, noise, zoning, permitting).
+2. **Define `value` and `units` rules per feature family** — in each
+   feature's `description`, state the expected value type and accepted unit
+   vocabulary explicitly.
+3. **Add `$definitions`** — group related feature descriptions here to keep
+   the `feature` enum block clean.
+4. **Add `$instructions`** — encode global extraction policy (scope, null
+   handling, one-row-per-feature contract, verbatim quote preference).
+5. **Smoke-test on one jurisdiction** — validate all enum items appear in
+   output and null rows are correctly populated for missing features.
 
-When COMPASS retrieves a large generic land-use code rather than a
-technology-specific ordinance, the LLM may extract provisions that are
-outside the schema enum. This is most visible when unfamiliar feature names
-appear in the output CSV.
+## Feature definition template
 
-Primary controls:
-- `extraction_system_prompt` in plugin YAML — this is the strongest signal.
-  State explicitly what is in scope and what is out.
-- `$instructions.scope` in schema — reinforce exclusion language here.
-- `heuristic_keywords.not_tech_words` — filter documents upstream.
+Every feature description must answer four questions:
 
-Do not widen the feature enum to accommodate scope bleed; narrow the prompt
-and upstream filters instead.
+1. **What is this?** One sentence identifying the regulatory concept.
+2. **VALUE rule:** What type is the value and what specific values/ranges are
+   valid?
+3. **UNITS rule:** What unit string is accepted, or `null` if not applicable?
+4. **IGNORE / CLARIFICATION:** What near-miss concepts must NOT match this
+   feature?
 
-## Technology adaptation guidance
+Example (abbreviated):
 
-When adapting a baseline schema to any new technology:
+```json
+"structure setback": {
+  "description": "Minimum distance from the generator to an occupied building. VALUE: numerical distance. UNITS: 'feet' or 'meters'. IGNORE: setbacks from property lines or roads — those are separate features."
+}
+```
 
-- Separate core utility-scale requirements from adjacent/non-target systems.
-- Keep district/permit features distinct from numerical constraints.
-- Encode jurisdiction/governance handling where relevant in summaries.
-- Require explicit nulls when a feature is not enacted.
+## Feature family taxonomy
 
-## Cross-technology adaptation checklist
+Organize `$definitions` by these families:
 
-Apply this for any new domain:
+| Family | Example features |
+|---|---|
+| Setbacks | `structure setback`, `property line setback`, `road setback` |
+| Noise/Emissions | `noise limit`, `emissions standard`, `vibration limit` |
+| Operational | `hours of operation` |
+| Physical design | `screening requirement`, `enclosure requirement`, `exhaust stack height` |
+| Zoning | `primary use districts`, `conditional use districts`, `prohibited use districts` |
+| Permitting | `permit requirement`, `capacity threshold` |
+| Compliance | `decommissioning`, `enactment date` |
 
-1. Define technology-specific `feature` enum with stable IDs.
-2. Define allowed unit vocabulary for each feature family.
-3. Add explicit exclusion language for adjacent-but-out-of-scope systems.
-4. Ensure summaries preserve legal traceability (section + source-faithful text).
-5. Validate on deterministic docs before tuning retrieval.
-6. Consider including `enactment date` in the enum — COMPASS naturally surfaces it
-   from documents and it provides important temporal context in outputs.
+## `$instructions` block
 
-## Example specialization patterns (optional)
+Always include a `$instructions` object at the top level with these keys:
 
-Use examples only to shape exclusion strategy:
+```json
+"$instructions": {
+  "scope": "Describe exactly what to extract and what to ignore.",
+  "null_handling": "Output every enum feature. Use null value and null summary when a feature is not found in the document. Do not omit features.",
+  "one_row_per_feature": "Output exactly one row per feature. If multiple values apply, use the most restrictive and describe variants in summary.",
+  "verbatim_quotes": "In summary fields, prefer verbatim quotes from the source. Enclose in double quotation marks.",
+  "units_discipline": "Do not convert units. Record them exactly as they appear in the document."
+}
+```
 
-- separate core utility-scale requirements from adjacent technologies,
-- add explicit exclusion terms in `not_tech_words`,
-- preserve legal traceability via `section` and `summary`.
+## Scope bleed control
 
-## Reuse safeguards
+When COMPASS retrieves a large land-use code instead of a tech-specific
+ordinance, the LLM may extract off-domain provisions.
 
-- Keep tech-first file names consistent across assets:
-  `<tech>_config*.json5`, `<tech>_plugin_config.yaml`,
-  `<tech>_schema.json`, `<tech>_jurisdictions*.csv`.
-- Keep credentials out of schema content and examples.
-- Validate schema behavior with a small smoke run before scaling.
+Fix order (most powerful first):
+1. `extraction_system_prompt` in plugin YAML — state explicitly what is in
+   scope and what is excluded.
+2. `$instructions.scope` in schema — reinforce with exclusion language.
+3. `heuristic_keywords.NOT_TECH_WORDS` — reject documents upstream.
 
-## High-value authoring patterns
+Do not expand the feature enum to absorb scope bleed. Narrow the prompt.
 
-- Put restrictive-value selection rules directly in descriptions.
-- Explicitly define accepted unit vocabulary.
-- Clarify near-miss terms that should not be treated as equivalent.
-- State whether qualitative features should keep `value`/`units` null.
+## Cross-technology adaptation checklist
 
-## Anti-patterns
+When cloning this schema for a new technology:
 
-- Retrieval instructions embedded in schema semantics.
-- Feature IDs that change names across iterations.
-- Implicit unit assumptions not declared in text.
-- Examples that contradict field descriptions.
-- Feature enums that include placeholders with no extraction logic.
+- [ ] Replace all feature IDs with technology-specific names.
+- [ ] Replace value/units rules in every feature description.
+- [ ] Replace exclusion terms in `$instructions.scope` and feature IGNORE
+      clauses.
+- [ ] Replace `$definitions` group names to match new feature families.
+- [ ] Smoke-test before widening to 10+ jurisdictions.
 
 ## Quality checklist
 
-- Enum matches target output columns.
-- Every feature has deterministic extraction rules.
-- `section` and `summary` preserve legal traceability.
-- Repeated sample runs produce stable feature typing.
-
-## Iteration loop
-
-1. Run 3-jurisdiction smoke sample.
-2. Catalog failure modes by feature.
-3. Patch only affected descriptions/examples.
-4. Re-run same sample before expanding scope.
-
-Save iterated schema versions as `<tech>_schemav2.json`, `<tech>_schemav3.json`
-etc. to preserve a diff history. The active version is what `schema:` in the
-plugin YAML points to.
-
-## Practical quality signal
-
-Treat a schema as "working" when all are true on the smoke sample:
-
-- final ordinance CSV outputs are non-empty,
-- extracted rows include stable feature IDs,
-- most non-null rows have useful `section` and `summary`,
-- repeated runs do not shift feature semantics materially.
+- [ ] Feature enum uses stable, lowercase, underscore-separated IDs.
+- [ ] Every feature description contains VALUE, UNITS, and IGNORE clauses.
+- [ ] `$instructions` block is present with all five keys.
+- [ ] `additionalProperties: false` is set on the top-level object and on
+      each item in the `outputs` array.
+- [ ] Schema validates cleanly against a JSON Schema validator.
+- [ ] A smoke run using this schema produces extracted rows (not just
+      logs showing a successful process exit).
+
+## Anti-patterns to avoid
+
+- Feature IDs that change names between iterations.
+- Implicit unit assumptions not stated in description text.
+- Missing IGNORE clauses for common near-miss features.
+- Examples in descriptions that contradict field rules.
+- Widening the enum to absorb scope bleed instead of tightening the prompt.
diff --git a/.github/skills/web-scraper/SKILL.md b/.github/skills/web-scraper/SKILL.md
index f021bdb8..f5149364 100644
--- a/.github/skills/web-scraper/SKILL.md
+++ b/.github/skills/web-scraper/SKILL.md
@@ -6,11 +6,13 @@ description: Build and tune one-shot plugin configs that search, rank, and colle
 # Web Scraper Skill
 
 Use this skill to improve retrieval precision/recall before extraction tuning.
+Applies to both one-shot (schema-driven) and legacy decision-tree extraction
+pipelines.
 
 ## When to use
 
-- Download step returns noisy sources.
-- Ordinance recall is weak across jurisdictions.
+- Download step returns noisy sources (one-shot extraction).
+- Ordinance recall is weak across jurisdictions (one-shot extraction).
 - LLM filtering is compensating for poor search quality.
 
 ## Scope
@@ -19,13 +21,11 @@ Use this skill to improve retrieval precision/recall before extraction tuning.
 - URL ranking and filtering patterns.
 - Heuristic phrase controls before LLM validation.
 
-## Example references (optional)
+## Canonical reference
 
-- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_config.json5`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_jurisdictions_one.csv`
-- `examples/one_shot_schema_extraction_cst/cst_plugin_config.yaml`
-- `examples/compass_tech_pipeline/README.md`
+Consult example plugin configurations in `examples/` following the tech-first naming pattern:
+- `<tech>_plugin_config.yaml` — standard one-shot config
+- See `examples/water_rights_demo/one-shot/plugin_config.yaml` for multi-document edge cases
 
 ## Two retrieval phases
 
@@ -70,6 +70,15 @@ Avoid spaces around `=` in `.env` assignments.
 5. Start with dynamic search, then switch to deterministic known URLs when
   search infrastructure is unstable.
 
+When using `heuristic_keywords`, include all required lists:
+- `GOOD_TECH_KEYWORDS`
+- `GOOD_TECH_PHRASES`
+- `GOOD_TECH_ACRONYMS`
+- `NOT_TECH_WORDS`
+
+If any required list is missing or empty, COMPASS raises a plugin
+configuration error; treat that run as failed for extraction quality.
+
 For first-pass reliability, test retrieval with deterministic known URLs
 before using live web search.
 
@@ -104,7 +113,7 @@ Then validate:
 	- query templates (affects SE phase),
 	- URL weights (affects both phases),
 	- include/exclude heuristic patterns (pre-LLM filter),
-	- `not_tech_words` (upstream document rejection).
+	- `NOT_TECH_WORDS` (upstream document rejection).
 6. Re-run same sample and compare.
 
 ## Cross-tech onboarding
@@ -117,7 +126,7 @@ When reusing this workflow for any technology:
 - seed `known_doc_urls` with authoritative regulatory documents for smoke
   testing,
 - avoid copying negatives from previous technologies into the new tech config,
-- verify `not_tech_words` excludes adjacent technologies for your domain.
+- verify `NOT_TECH_WORDS` excludes adjacent technologies for your domain.
 
 ## Phase gates
 
diff --git a/.github/skills/yaml-setup/SKILL.md b/.github/skills/yaml-setup/SKILL.md
index a9f93d17..11a360af 100644
--- a/.github/skills/yaml-setup/SKILL.md
+++ b/.github/skills/yaml-setup/SKILL.md
@@ -5,23 +5,25 @@ description: Author and tune one-shot plugin YAML configs for COMPASS-native doc
 
 # YAML Setup Skill
 
+**ONE-SHOT EXTRACTION ONLY.** This skill applies only to schema-driven extraction.
+For legacy decision-tree extraction, consult COMPASS architecture docs.
+
 Use this skill to create or tune one-shot plugin YAML that controls retrieval,
 filtering, and text collection behavior.
 
 ## When to use
 
-- New technology onboarding in one-shot extraction.
+- New technology onboarding in one-shot extraction (NOT decision-tree extraction).
 - Schema exists but source relevance is weak.
 - You need reproducible config handoff across teams.
 
-## Example references (optional)
+## Canonical reference
+
+With tech-first naming, configuration examples follow this pattern:
+- `examples/one_shot_schema_extraction/<tech>_plugin_config.yaml` — standard working example
+- `examples/water_rights_demo/one-shot/plugin_config.yaml` — multi-doc edge case
 
-- `examples/one_shot_schema_extraction_geothermal/geothermal_plugin_config.yaml`
-- `examples/one_shot_schema_extraction_geothermal/README.rst`
-- `examples/one_shot_schema_extraction_geothermal/geothermal_one_shot_guide.md`
-- `docs/source/examples/one_shot_schema_extraction/plugin_config_minimal.json`
-- `docs/source/examples/one_shot_schema_extraction/plugin_config_simple.json5`
-- `docs/source/examples/one_shot_schema_extraction/plugin_config.yaml`
+Refer to any complete example in `examples/` that matches your retrieval goals.
 
 ## Naming convention
 
@@ -42,6 +44,14 @@ Avoid spaces around `=` in `.env` assignments.
 schema: ./my_schema.json
 ```
 
+## Non-negotiable runtime constraints
+
+- Jurisdiction CSV headers are case-sensitive: use `County,State`.
+- If `heuristic_keywords` is present, it must include all four lists and
+  none may be empty.
+- A run does not pass if the logs show config errors or if the
+  extracted jurisdiction count is zero.
+
 ## Key plugin YAML fields
 
 | Field | Type | Behavior |
@@ -56,6 +66,26 @@ schema: ./my_schema.json
 | `extraction_system_prompt` | string | Overrides default LLM system prompt for the extraction step. Use this to scope extraction tightly to the target technology. |
 | `cache_llm_generated_content` | bool | Cache LLM-generated `query_templates` and `website_keywords`. Set to `false` when iterating schema to see live changes. |
 
+## Required `heuristic_keywords` shape
+
+Use this exact structure when defining `heuristic_keywords`:
+
+```yaml
+heuristic_keywords:
+  GOOD_TECH_KEYWORDS:
+    - "<required single-word term>"
+  GOOD_TECH_PHRASES:
+    - "<required multi-word phrase>"
+  GOOD_TECH_ACRONYMS:
+    - "<required acronym or short token>"
+  NOT_TECH_WORDS:
+    - "<required exclusion term>"
+```
+
+Notes:
+- Keys are normalized, but using canonical key names reduces mistakes.
+- All four lists are required and must be non-empty.
+
 ### `collection_prompts: true` and `text_extraction_prompts: true`
 
 Setting either flag to `true` (not a list) instructs COMPASS to use the LLM
@@ -87,11 +117,11 @@ for a complete example.
    - Confirm schema path and extraction invocation work.
 2. **Simple**
    - Add `query_templates`, `heuristic_keywords`, and `cache_llm_generated_content`.
-   - Set `collection_prompts: true` and `text_extraction_prompts: true` to
-     let the LLM auto-generate prompts from the schema.
 3. **Full**
    - Add `extraction_system_prompt` if scope bleed or off-domain extraction
      is observed.
+   - Set `collection_prompts: true` and `text_extraction_prompts: true` to
+     let the LLM auto-generate prompts from the schema.
    - Replace `heuristic_keywords: true` with an explicit list if precision
      is insufficient.
 
@@ -114,44 +144,61 @@ website_keywords:
   zoning: 2880
   permit: 1440
 heuristic_keywords:
-  good_tech_keywords:
+  GOOD_TECH_KEYWORDS:
     - "<tech keyword 1>"
     - "<tech keyword 2>"
-  good_tech_acronyms:
+  GOOD_TECH_ACRONYMS:
     - "<tech acronym>"
-  good_tech_phrases:
+  GOOD_TECH_PHRASES:
     - "<tech phrase 1>"
     - "<tech phrase 2>"
-  not_tech_words:
+  NOT_TECH_WORDS:
     - "<adjacent technology term 1>"
     - "<adjacent technology term 2>"
-collection_prompts: true
-text_extraction_prompts: true
-extraction_system_prompt: |-
-  You are a legal scholar extracting structured data from
-  utility-scale <tech> ordinances.
-
-  Extract only requirements for utility-scale <tech> facilities.
-  Exclude adjacent technologies and non-target use cases.
 ```
 
 Swap vocabulary for any technology while keeping the same structure.
 
 ## Stable development mode
 
-Plugin YAML controls retrieval behavior, but deterministic acquisition for
-smoke tests belongs in run config:
+Use run-config controls for deterministic smoke tests while iterating schema:
 
-- `known_doc_urls` or `known_local_docs`
-- `perform_se_search: false`
-- `perform_website_search: false` (disables the website crawl second phase)
+- `known_doc_urls` or `known_local_docs` — bypass live search
+- `perform_se_search: false` — disable search-engine phase
+- `perform_website_search: false` — disable website crawl phase
 
-Use this mode first, then re-enable search once schema extraction quality is
-stable.
+Re-enable search only after extraction quality is stable on known documents.
 
 Recommended baseline: use dynamic search first, then use deterministic mode
 if search infrastructure fails.
 
+## Minimal run-config contract (to pair with plugin YAML)
+
+Use this pattern and require users to provide their own model and client
+values:
+
+```json5
+{
+  out_dir: "./outputs_<tech>_<run_id>",
+  tech: "<tech>",
+  jurisdiction_fp: "./<tech>_jurisdictions.csv",
+  perform_se_search: true,
+  perform_website_search: false,
+  model: [
+    {
+      name: "<PROVIDE-YOUR-MODEL-NAME>",
+      llm_call_kwargs: { temperature: 0, timeout: 600 },
+      client_kwargs: {
+        api_version: "<PROVIDE-YOUR-API-VERSION>",
+        azure_endpoint: "<PROVIDE-YOUR-AZURE-ENDPOINT>"
+      }
+    }
+  ]
+}
+```
+
+Do not hardcode model names in skills. Prompt the user to supply `name`.
+
 ## Acquisition phases
 
 COMPASS acquisition runs in two sequential phases per jurisdiction:
@@ -179,7 +226,7 @@ When adapting to another technology:
 
 - replace vocabulary in `query_templates` and `website_keywords`,
 - keep legal-code terms (`ordinance`, `zoning`, `code of ordinances`),
-- keep non-target exclusions explicit in `not_tech_words`,
+- keep non-target exclusions explicit in `NOT_TECH_WORDS`,
 - do not carry terms from a previous technology into new tech configs,
 - write a technology-specific `extraction_system_prompt`.
 

From a71447f1dbf7197cf651538d720f88faaf73a7e8 Mon Sep 17 00:00:00 2001
From: Byron Pullutasig <115118857+bpulluta@users.noreply.github.com>
Date: Tue, 17 Mar 2026 13:32:28 -0600
Subject: [PATCH 3/7] update one-shot SKILL.md structure and trigger contracts

---
 .github/skills/extraction-run/SKILL.md  | 28 ++++++++++++++++++-------
 .github/skills/schema-creation/SKILL.md | 20 +++++++++++++++---
 .github/skills/web-scraper/SKILL.md     | 24 ++++++++++++++++-----
 .github/skills/yaml-setup/SKILL.md      | 17 ++++++++++++++-
 4 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
index e77a10cc..00be8f92 100644
--- a/.github/skills/extraction-run/SKILL.md
+++ b/.github/skills/extraction-run/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: extraction-run
-description: Execute one-shot extraction with COMPASS, evaluate outputs, and iterate schema/config changes with minimal cost.
+description: Execute one-shot extraction with COMPASS and iterate quickly with low cost. Use whenever a user asks to run, smoke-test, validate, debug, or scale one-shot schema extraction for any technology.
 ---
 
 # Extraction Run Skill
@@ -19,6 +19,26 @@ then iterate quickly until you have stable structured outputs.
 - You need a reliable smoke-test workflow before scaling.
 - You are NOT using legacy decision-tree extraction.
 
+## Do not use
+
+- Legacy decision-tree extraction feature engineering.
+- Python parser implementation in `compass/extraction/<tech>/parse.py`.
+- Non-extraction tasks (for example docs-only updates).
+
+## Expected assistant output
+
+When using this skill, return:
+
+1. The exact `pixi run compass process ...` command used.
+2. A pass/fail decision against extraction-quality gates.
+3. The smallest next config/schema change and why.
+
+## Canonical reference
+
+- `examples/one_shot_schema_extraction/` — complete working examples
+- `examples/one_shot_schema_extraction/README.rst` — general one-shot overview
+- `examples/water_rights_demo/one-shot/` — multi-doc extraction example
+
 ## Two-pipeline modes
 
 COMPASS supports two distinct extraction pipelines. Choose one and do not mix
@@ -107,12 +127,6 @@ unchanged and swap only technology-specific inputs:
 
 Change one axis per run unless debugging infrastructure failures.
 
-## Canonical reference
-
-- `examples/one_shot_schema_extraction/` — complete working examples
-- `examples/one_shot_schema_extraction/README.rst` — general one-shot overview
-- `examples/water_rights_demo/one-shot/` — multi-doc extraction example
-
 ## Environment setup
 
 Load secrets from `.env` before running. Never commit key values in config files.
diff --git a/.github/skills/schema-creation/SKILL.md b/.github/skills/schema-creation/SKILL.md
index 805d8dc9..c4941bc1 100644
--- a/.github/skills/schema-creation/SKILL.md
+++ b/.github/skills/schema-creation/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: schema-creation
-description: Author and iterate one-shot extraction schemas that replace legacy decision-tree extraction logic in native COMPASS.
+description: Author and iterate one-shot extraction schemas for native COMPASS. Use whenever a user asks to create, expand, or debug schema feature definitions, value/unit rules, or extraction instructions.
 ---
 
 # Schema Creation Skill
@@ -19,6 +19,19 @@ The schema is the single most important config file for output quality.
 - Fixing inconsistent or incorrect extracted values in one-shot extraction.
 - Adding new features to an existing one-shot extraction.
 
+## Do not use
+
+- Retrieval tuning tasks that belong in plugin YAML.
+- Legacy decision-tree extraction parser implementation.
+
+## Expected assistant output
+
+When using this skill, return:
+
+1. The proposed schema diff (or full schema block) for the targeted features.
+2. The rationale for VALUE, UNITS, and IGNORE wording.
+3. A smoke-test check plan for validating the schema change.
+
 ## Canonical reference
 
 For complete examples, see the `examples/` directory:
@@ -63,7 +76,8 @@ These five fields map directly to the output CSV columns. COMPASS adds
 ## Build sequence
 
 1. **Define the feature enum** — one stable lowercase ID per siting-relevant
-   requirement. Group IDs by family (setbacks, noise, zoning, permitting).
+   requirement. Keep naming consistent across iterations and group IDs by
+   family (setbacks, noise, zoning, permitting).
 2. **Define `value` and `units` rules per feature family** — in each
    feature's `description`, state the expected value type and accepted unit
    vocabulary explicitly.
@@ -147,7 +161,7 @@ When cloning this schema for a new technology:
 
 ## Quality checklist
 
-- [ ] Feature enum uses stable, lowercase, underscore-separated IDs.
+- [ ] Feature enum uses stable, consistent IDs across all runs.
 - [ ] Every feature description contains VALUE, UNITS, and IGNORE clauses.
 - [ ] `$instructions` block is present with all five keys.
 - [ ] `additionalProperties: false` is set on the top-level object and on
diff --git a/.github/skills/web-scraper/SKILL.md b/.github/skills/web-scraper/SKILL.md
index f5149364..27a3fa37 100644
--- a/.github/skills/web-scraper/SKILL.md
+++ b/.github/skills/web-scraper/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: web-scraper
-description: Build and tune one-shot plugin configs that search, rank, and collect ordinance documents with native COMPASS pipeline settings.
+description: Build and tune retrieval configs that search, rank, and collect ordinance documents in COMPASS. Use whenever a user asks to improve retrieval precision/recall, tune search queries/keywords, or debug acquisition quality before extraction tuning.
 ---
 
 # Web Scraper Skill
@@ -15,11 +15,18 @@ pipelines.
 - Ordinance recall is weak across jurisdictions (one-shot extraction).
 - LLM filtering is compensating for poor search quality.
 
-## Scope
+## Do not use
 
-- Query-template strategy.
-- URL ranking and filtering patterns.
-- Heuristic phrase controls before LLM validation.
+- Schema feature definition or value extraction logic design.
+- Post-extraction feature/value debugging when retrieval is already correct.
+
+## Expected assistant output
+
+When using this skill, return:
+
+1. The retrieval axis changed (queries, keyword weights, or heuristics).
+2. Evidence from artifacts/logs showing why the change was needed.
+3. The next run command against the same jurisdiction sample.
 
 ## Canonical reference
 
@@ -27,6 +34,12 @@ Consult example plugin configurations in `examples/` following the tech-first na
 - `<tech>_plugin_config.yaml` — standard one-shot config
 - See `examples/water_rights_demo/one-shot/plugin_config.yaml` for multi-document edge cases
 
+## Scope
+
+- Query-template strategy.
+- URL ranking and filtering patterns.
+- Heuristic phrase controls before LLM validation.
+
 ## Two retrieval phases
 
 COMPASS runs two sequential acquisition passes per jurisdiction:
@@ -141,3 +154,4 @@ When reusing this workflow for any technology:
 - Preserve auditable rationale for each retrieval change.
 - Keep one canonical retrieval config per active technology.
 - Ensure each run uses a unique `out_dir` to avoid COMPASS aborting early.
+
diff --git a/.github/skills/yaml-setup/SKILL.md b/.github/skills/yaml-setup/SKILL.md
index 11a360af..af2a82e5 100644
--- a/.github/skills/yaml-setup/SKILL.md
+++ b/.github/skills/yaml-setup/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: yaml-setup
-description: Author and tune one-shot plugin YAML configs for COMPASS-native document discovery, filtering, and text collection.
+description: Author and tune one-shot plugin YAML for COMPASS document discovery, filtering, and text collection. Use whenever a user asks to create, clean up, standardize, or troubleshoot one-shot plugin YAML for technology onboarding.
 ---
 
 # YAML Setup Skill
@@ -17,6 +17,20 @@ filtering, and text collection behavior.
 - Schema exists but source relevance is weak.
 - You need reproducible config handoff across teams.
 
+## Do not use
+
+- Legacy decision-tree parser implementation changes.
+- Schema feature semantics work that belongs in `<tech>_schema.json`.
+- Run-result diagnosis after outputs are generated (use iteration loop skill).
+
+## Expected assistant output
+
+When using this skill, return:
+
+1. The finalized plugin YAML content or exact diff.
+2. Any required paired run-config changes.
+3. A validation command and pass/fail checks for the edited YAML.
+
 ## Canonical reference
 
 With tech-first naming, configuration examples follow this pattern:
@@ -244,3 +258,4 @@ If running outside the tech folder, use absolute paths for `-c` and `-p`.
 - Feature logic belongs in schema.
 - Adjust one tuning axis per run for clean attribution.
 - Keep one canonical plugin file per technology in the active example folder.
+

From 74495a6648822194a966b0e48c80a4b37413c5b8 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Tue, 17 Mar 2026 15:21:58 -0600
Subject: [PATCH 4/7] Initial plan (#398)

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

From 81fcbff8b786d1167527d0e0b5a22841ab05055c Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Tue, 17 Mar 2026 16:26:22 -0600
Subject: [PATCH 5/7] Fix skills documentation: correct paths, caching
 behavior, and tab formatting (#399)

* Initial plan

* Fix all review comments in skills documentation

Co-authored-by: bpulluta <115118857+bpulluta@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bpulluta <115118857+bpulluta@users.noreply.github.com>
---
 .github/skills/extraction-run/SKILL.md | 44 ++++++++++++++------------
 .github/skills/web-scraper/SKILL.md    | 20 +++++++-----
 .github/skills/yaml-setup/SKILL.md     | 14 +++++---
 3 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
index 00be8f92..a356eea2 100644
--- a/.github/skills/extraction-run/SKILL.md
+++ b/.github/skills/extraction-run/SKILL.md
@@ -56,15 +56,17 @@ only a schema JSON, a plugin YAML, and a run config — no Python source changes
 
 New technology assets start in `examples/` and finish in `compass/extraction/`:
 
-1. **Develop** — place all assets in `examples/one_shot_schema_extraction_<tech>/`
+1. **Develop** — place all assets in `examples/one_shot_schema_extraction/`
 2. **Stabilize** — iterate schema/plugin until smoke and robustness gates pass
 3. **Promote** — copy the three finalized files into `compass/extraction/<tech>/`:
    - `<tech>_schema.json`
    - `<tech>_plugin_config.yaml`
    - `<tech>_config.json5` (optional; useful as a reference run config)
+   - `__init__.py` — registers the plugin via `create_schema_based_one_shot_extraction_plugin`
 
-The promoted extraction folder contains only config files — no Python code is
-needed for one-shot techs.
+   After creating the package, add an import in `compass/extraction/__init__.py`
+   to register the plugin at startup. See `compass/extraction/ghp/__init__.py`
+   for a reference implementation.
 
 ## Required inputs
 
@@ -78,10 +80,10 @@ needed for one-shot techs.
 - Jurisdiction CSV has headers `County,State`.
 - `out_dir` is unique for this run.
 - At least one acquisition step is enabled:
-	`perform_se_search: true`, `perform_website_search: true`,
-	`known_doc_urls`, or `known_local_docs`.
+  `perform_se_search: true`, `perform_website_search: true`,
+  `known_doc_urls`, or `known_local_docs`.
 - If `heuristic_keywords` exists, all four required lists are present and
-	non-empty.
+  non-empty.
 
 ## Naming convention
 
@@ -106,7 +108,7 @@ to deterministic mode only when search infrastructure is unstable:
 2. Use your preferred configured search engine.
 3. Load `.env` into shell (`set -a && source .env && set +a`).
 4. Run with verbose logs:
-	 - `pixi run compass process -c config.json5 -p plugin.yaml -v`
+   - `pixi run compass process -c config.json5 -p plugin.yaml -v`
 5. Confirm output artifacts exist before tuning schema semantics.
 
 Fallback mode when needed:
@@ -144,11 +146,11 @@ pixi run compass process -c config.json5 -p path/to/plugin_config.yaml -v
 ## Phase-gated workflow
 
 1. **Smoke test (1 jurisdiction)**
-	 - Goal: verify wiring and output contract.
+   - Goal: verify wiring and output contract.
 2. **Robustness (5 jurisdictions)**
-	 - Goal: verify feature stability and edge-case handling.
+   - Goal: verify feature stability and edge-case handling.
 3. **Scale (full set)**
-	 - Goal: only after earlier phases pass acceptance gates.
+   - Goal: only after earlier phases pass acceptance gates.
 
 ## Validation checklist
 
@@ -194,7 +196,7 @@ Check in order:
 1. `outputs/*/cleaned_text/*.txt` (text extraction present)
 2. `outputs/*/jurisdiction_dbs/*.csv` (per-jurisdiction parsed rows)
 3. `outputs/*/quantitative_ordinances.csv` and
-	 `outputs/*/qualitative_ordinances.csv` (final compiled results)
+   `outputs/*/qualitative_ordinances.csv` (final compiled results)
 
 Treat the run as **failed for extraction quality** when either is true:
 - `Number of jurisdictions with extracted data: 0`
@@ -207,20 +209,20 @@ Only treat a run as passing when both are true:
 ## Root-cause triage
 
 - **Wrong or noisy documents**
-	- Tune query templates, URL keywords, and exclusions.
-	- Prefer `known_doc_urls` while stabilizing.
+  - Tune query templates, URL keywords, and exclusions.
+  - Prefer `known_doc_urls` while stabilizing.
 - **Right documents, wrong fields**
-	- Tune schema descriptions/examples and ambiguity rules.
-	- Check `extraction_system_prompt` in plugin YAML — it is the primary
-	  guard against scope bleed from generic legal documents.
+  - Tune schema descriptions/examples and ambiguity rules.
+  - Check `extraction_system_prompt` in plugin YAML — it is the primary
+    guard against scope bleed from generic legal documents.
 - **Correct values, unstable formatting**
-	- Tighten enums, unit vocabulary, and null behavior.
+  - Tighten enums, unit vocabulary, and null behavior.
 - **Nothing downloaded / unstable search**
-	- Disable live search and use deterministic known URLs/local docs.
+  - Disable live search and use deterministic known URLs/local docs.
 - **0 documents found for a jurisdiction during website crawl**
-	- Expected for jurisdictions with few online ordinances. The website
-	  crawl is a second acquisition pass after search-engine retrieval;
-	  0 results there is not a pipeline failure.
+  - Expected for jurisdictions with few online ordinances. The website
+    crawl is a second acquisition pass after search-engine retrieval;
+    0 results there is not a pipeline failure.
 
 ## Acceptance gates
 
diff --git a/.github/skills/web-scraper/SKILL.md b/.github/skills/web-scraper/SKILL.md
index 27a3fa37..05a078f0 100644
--- a/.github/skills/web-scraper/SKILL.md
+++ b/.github/skills/web-scraper/SKILL.md
@@ -30,9 +30,12 @@ When using this skill, return:
 
 ## Canonical reference
 
-Consult example plugin configurations in `examples/` following the tech-first naming pattern:
-- `<tech>_plugin_config.yaml` — standard one-shot config
-- See `examples/water_rights_demo/one-shot/plugin_config.yaml` for multi-document edge cases
+Consult example plugin configurations in `examples/`:
+- `examples/one_shot_schema_extraction/plugin_config.yaml` — standard one-shot config
+- `examples/water_rights_demo/one-shot/plugin_config.yaml` — multi-document edge cases
+
+When creating new tech configs, use `<tech>_plugin_config.yaml` as a recommended
+naming convention (e.g. `geothermal_plugin_config.yaml`).
 
 ## Scope
 
@@ -49,7 +52,8 @@ COMPASS runs two sequential acquisition passes per jurisdiction:
    ordinance documents.
 2. **Website crawl phase** — crawls the jurisdiction's official website,
    ranking pages using `website_keywords`. This phase is a secondary pass
-   and runs even if the SE phase found documents.
+   and runs only if the search-engine phase did not yield an ordinance
+   context.
 
 Key behaviors:
 - Playwright browser errors during the website crawl phase are **non-fatal**.
@@ -123,10 +127,10 @@ Then validate:
 3. Run heuristic filter and review false rejects/accepts (`cleaned_text/`).
 4. Check website crawl phase independently if needed (enable, run, inspect logs).
 5. Update one axis only:
-	- query templates (affects SE phase),
-	- URL weights (affects both phases),
-	- include/exclude heuristic patterns (pre-LLM filter),
-  - `NOT_TECH_WORDS` (upstream document rejection).
+   - query templates (affects SE phase),
+   - URL weights (affects both phases),
+   - include/exclude heuristic patterns (pre-LLM filter),
+   - `NOT_TECH_WORDS` (upstream document rejection).
 6. Re-run same sample and compare.
 
 ## Cross-tech onboarding
diff --git a/.github/skills/yaml-setup/SKILL.md b/.github/skills/yaml-setup/SKILL.md
index af2a82e5..1502085c 100644
--- a/.github/skills/yaml-setup/SKILL.md
+++ b/.github/skills/yaml-setup/SKILL.md
@@ -33,10 +33,15 @@ When using this skill, return:
 
 ## Canonical reference
 
-With tech-first naming, configuration examples follow this pattern:
-- `examples/one_shot_schema_extraction/<tech>_plugin_config.yaml` — standard working example
+Consult the working examples in `examples/`:
+- `examples/one_shot_schema_extraction/plugin_config.yaml` — standard working example
 - `examples/water_rights_demo/one-shot/plugin_config.yaml` — multi-doc edge case
 
+When creating new tech configs, `<tech>_plugin_config.yaml` is the recommended
+naming convention (e.g. `geothermal_plugin_config.yaml`). The existing
+`plugin_config.yaml` examples use a generic name; new tech-specific assets
+should use the tech-first naming pattern.
+
 Refer to any complete example in `examples/` that matches your retrieval goals.
 
 ## Naming convention
@@ -78,7 +83,7 @@ schema: ./my_schema.json
 | `collection_prompts` | list or `true` | Text collection prompt(s). If **`true`**, LLM auto-generates from schema. |
 | `text_extraction_prompts` | list or `true` | Text consolidation prompt(s). If **`true`**, LLM auto-generates from schema. |
 | `extraction_system_prompt` | string | Overrides default LLM system prompt for the extraction step. Use this to scope extraction tightly to the target technology. |
-| `cache_llm_generated_content` | bool | Cache LLM-generated `query_templates` and `website_keywords`. Set to `false` when iterating schema to see live changes. |
+| `cache_llm_generated_content` | bool | Cache LLM-generated `query_templates`, `website_keywords`, and `heuristic_keywords`. Set to `false` when iterating schema to see live changes. |
 
 ## Required `heuristic_keywords` shape
 
@@ -122,8 +127,7 @@ extraction_system_prompt: |-
   Prefer explicit values. Use null for qualitative obligations.
 ```
 
-See `compass/extraction/geothermal_electricity/geothermal_plugin_config.yaml`
-for a complete example.
+See `compass/extraction/ghp/plugin_config.yaml` for a complete example.
 
 ## Progressive config path
 

From 1b8571f283056d906ee91ccbb8f933966edd6cc8 Mon Sep 17 00:00:00 2001
From: Byron Pullutasig <115118857+bpulluta@users.noreply.github.com>
Date: Thu, 19 Mar 2026 17:07:05 -0600
Subject: [PATCH 6/7] renamed skills and fixed minor comments

---
 .../SKILL.md                                  |  28 ++-
 .github/skills/extraction-run/SKILL.md        |  18 +-
 .github/skills/iteration-development/SKILL.md | 224 ++++++++++++++++++
 .../SKILL.md                                  |  18 +-
 .github/skills/schema-creation/SKILL.md       |   5 +-
 5 files changed, 266 insertions(+), 27 deletions(-)
 rename .github/skills/{web-scraper => document-retrieval}/SKILL.md (81%)
 create mode 100644 .github/skills/iteration-development/SKILL.md
 rename .github/skills/{yaml-setup => plugin-config-setup}/SKILL.md (91%)

diff --git a/.github/skills/web-scraper/SKILL.md b/.github/skills/document-retrieval/SKILL.md
similarity index 81%
rename from .github/skills/web-scraper/SKILL.md
rename to .github/skills/document-retrieval/SKILL.md
index 05a078f0..9c077424 100644
--- a/.github/skills/web-scraper/SKILL.md
+++ b/.github/skills/document-retrieval/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: web-scraper
+name: document-retrieval
 description: Build and tune retrieval configs that search, rank, and collect ordinance documents in COMPASS. Use whenever a user asks to improve retrieval precision/recall, tune search queries/keywords, or debug acquisition quality before extraction tuning.
 ---
 
@@ -87,11 +87,19 @@ Avoid spaces around `=` in `.env` assignments.
 5. Start with dynamic search, then switch to deterministic known URLs when
   search infrastructure is unstable.
 
-When using `heuristic_keywords`, include all required lists:
-- `GOOD_TECH_KEYWORDS`
-- `GOOD_TECH_PHRASES`
-- `GOOD_TECH_ACRONYMS`
-- `NOT_TECH_WORDS`
+When using `heuristic_keywords`, use these four lists to guide pre-LLM filtering:
+- `GOOD_TECH_KEYWORDS` — strong indicators of the target technology
+  (e.g., facility types, deployment modes). Documents matching even a
+  few keywords are marked as candidates.
+- `GOOD_TECH_PHRASES` — multi-word phrases that signal relevant
+  ordinance content. Keep specific to avoid false positives.
+- `GOOD_TECH_ACRONYMS` — industry-standard abbreviations for the
+  technology. Narrow list; include only widely recognized acronyms.
+- `NOT_TECH_WORDS` — pre-heuristic filter that rejects documents
+  before keyword matching. Use to exclude adjacent technologies and
+  irrelevant domains (e.g., residential HVAC, unrelated industries).
+  Runs first; prevents wasted keyword evaluation on clearly-wrong
+  documents.
 
 If any required list is missing or empty, COMPASS raises a plugin
 configuration error and extraction quality should be treated as failed.
@@ -107,6 +115,10 @@ before using live web search.
   `code of ordinances`, `use table`, and `special use permit`.
 
 ## Deterministic smoke-test mode
+For this smoke test, provide at least one of the following document sources in the run config:
+
+- **`known_doc_urls`**: URLs of known ordinance documents that COMPASS can download and parse.
+- **`known_local_docs`**: Ordinance document files already available on the local filesystem.
 
 Use run-config controls to bypass flaky search while tuning:
 
@@ -148,8 +160,8 @@ When reusing this workflow for any technology:
 ## Phase gates
 
 - **3 jurisdictions**: ensure major source classes are found.
-- **10-25 jurisdictions**: verify stability across regions.
-- **Full scale**: only once false positive/negative rates stabilize.
+- **10 jurisdictions**: verify stability across regions.
+
 
 ## Guardrails
 
diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
index a356eea2..ed41439d 100644
--- a/.github/skills/extraction-run/SKILL.md
+++ b/.github/skills/extraction-run/SKILL.md
@@ -6,7 +6,7 @@ description: Execute one-shot extraction with COMPASS and iterate quickly with l
 # Extraction Run Skill
 
 **ONE-SHOT EXTRACTION ONLY.** This skill applies only to schema-driven extraction.
-For legacy decision-tree extraction (solar, wind, small wind), consult COMPASS 
+For decision-tree extraction (solar, wind, small wind), consult COMPASS
 architecture docs.
 
 Use this skill to run one-shot extraction in a repeatable, low-risk way,
@@ -17,11 +17,11 @@ then iterate quickly until you have stable structured outputs.
 - Schema exists and plugin config points to it.
 - You are onboarding a new technology (diesel generator, geothermal, CHP, hydrogen).
 - You need a reliable smoke-test workflow before scaling.
-- You are NOT using legacy decision-tree extraction.
+- You are NOT using decision-tree extraction.
 
 ## Do not use
 
-- Legacy decision-tree extraction feature engineering.
+- Decision-tree extraction feature engineering.
 - Python parser implementation in `compass/extraction/<tech>/parse.py`.
 - Non-extraction tasks (for example docs-only updates).
 
@@ -47,7 +47,7 @@ them for the same technology:
 | Mode | Where code lives | Good for |
 |---|---|---|
 | **One-shot (schema-based)** | `examples/` → `compass/extraction/<tech>/` | New techs, no Python changes |
-| **Legacy decision-tree** | Python code in `compass/extraction/<tech>/` | Existing solar, wind, small wind |
+| **Decision-tree** | Python code in `compass/extraction/<tech>/` | Existing solar, wind, small wind |
 
 One-shot is the correct path for all new technology onboarding. It requires
 only a schema JSON, a plugin YAML, and a run config — no Python source changes.
@@ -61,7 +61,6 @@ New technology assets start in `examples/` and finish in `compass/extraction/`:
 3. **Promote** — copy the three finalized files into `compass/extraction/<tech>/`:
    - `<tech>_schema.json`
    - `<tech>_plugin_config.yaml`
-   - `<tech>_config.json5` (optional; useful as a reference run config)
    - `__init__.py` — registers the plugin via `create_schema_based_one_shot_extraction_plugin`
 
    After creating the package, add an import in `compass/extraction/__init__.py`
@@ -77,7 +76,7 @@ New technology assets start in `examples/` and finish in `compass/extraction/`:
 
 ## Preflight checks (must pass before run)
 
-- Jurisdiction CSV has headers `County,State`.
+- Jurisdiction CSV has headers `County,State` or `County,State,Subdivision,Jurisdiction Type`.
 - `out_dir` is unique for this run.
 - At least one acquisition step is enabled:
   `perform_se_search: true`, `perform_website_search: true`,
@@ -89,7 +88,6 @@ New technology assets start in `examples/` and finish in `compass/extraction/`:
 
 Use tech-first names for all one-shot assets:
 
-- `<tech>_config*.json5`
 - `<tech>_plugin_config.yaml`
 - `<tech>_schema.json`
 - `<tech>_jurisdictions*.csv`
@@ -149,8 +147,6 @@ pixi run compass process -c config.json5 -p path/to/plugin_config.yaml -v
    - Goal: verify wiring and output contract.
 2. **Robustness (5 jurisdictions)**
    - Goal: verify feature stability and edge-case handling.
-3. **Scale (full set)**
-   - Goal: only after earlier phases pass acceptance gates.
 
 ## Validation checklist
 
@@ -161,10 +157,6 @@ Evaluate each run on:
 - section/summary traceability,
 - unit consistency,
 - null discipline,
-- **scope bleed** — check that no features appear in the output CSVs that
-  fall outside the schema enum; generic land-use-code documents can cause
-  unrelated provisions to leak through. Tighten `extraction_system_prompt`
-  in plugin YAML to fix this.
 
 ## Expected output artifacts
 
diff --git a/.github/skills/iteration-development/SKILL.md b/.github/skills/iteration-development/SKILL.md
new file mode 100644
index 00000000..120bbaef
--- /dev/null
+++ b/.github/skills/iteration-development/SKILL.md
@@ -0,0 +1,224 @@
+---
+name: iteration-development
+description: Run → inspect → fix cycle for one-shot extraction after initial setup. Use whenever a user asks to diagnose poor output, reduce scope bleed, improve precision/recall, or scale from smoke tests.
+---
+
+# Iteration Development Skill
+
+Use this skill after you have a working schema, plugin YAML, and run config
+and want to improve extraction quality through systematic iteration.
+
+## When to use
+
+- First smoke run produced output that needs diagnosis or improvement.
+- Feature values or units are wrong, missing, or inconsistent.
+- Retrieval is returning off-target documents.
+- Scaling from 3 jurisdictions to 10–25 or full production.
+
+## Do not use
+
+- First-time setup before any successful smoke run.
+- Legacy decision-tree extraction development.
+
+## Expected assistant output
+
+When using this skill, return:
+
+1. The observed failure class (retrieval, extraction scope, value/units, or null handling).
+2. One concrete fix on a single axis.
+3. The re-run command and pass/fail gate check.
+
+## Canonical reference
+
+- `examples/one_shot_schema_extraction/` — working examples
+  to use as a baseline for comparing output quality.
+
+
+## The run → inspect → fix loop
+
+**Three Phases:** This skill guides you through three phases, all built into
+example plugin configurations in the `examples/` directory.
+
+Repeat this cycle once per iteration. Change exactly one axis per cycle.
+
+```
+Run → Inspect outputs → Identify failure → Fix one axis → Re-run same sample
+```
+
+**Never change multiple axes in the same iteration.** You will not know
+which change caused the result.
+
+**Phases encoded in plugin YAML comments:**
+
+- **Phase 1 (Initial):** Includes query templates, website keywords, and
+  basic heuristic filters to avoid obvious off-domain results.
+  **This is ready to run immediately.**
+- **Phase 2 (Optional Refinement):** Uncomment advanced heuristic tuning
+  if Phase 1 retrieval produces off-target documents.
+- **Phase 3 (Optional Refinement):** Uncomment extraction_system_prompt
+  if Phase 1-2 retrieval works but extracted features are wrong (scope bleed).
+
+Start with Phase 1. Only add Phase 2 / 3 if Phase 1 results need improvement.
+See README.rst for the progression path.
+
+
+## Step 1: Inspect output artifacts
+
+After each run, check these locations inside `out_dir`:
+
+| Artifact | What to look for |
+|---|---|
+| `ordinance_files/*.pdf` | Are these on-target documents? |
+| `cleaned_text/*.txt` | Does page text contain target technology language? |
+| `jurisdiction_dbs/*.csv` | Are feature rows present? Are values correct? |
+| `quantitative_ordinances.csv` and `qualitative_ordinances.csv` | Final compiled output — check feature coverage and null rate |
+| `logs/<jurisdiction>/*.log` | Error messages, 0-document warnings |
+
+Minimum passing state for a smoke run:
+- At least one `ordinance_files/` PDF per jurisdiction.
+- At least one `cleaned_text/` file per jurisdiction.
+- Compiled ordinance CSV outputs contain rows for most jurisdictions.
+
+Immediate fail conditions (fix before any tuning):
+- Jurisdiction CSV header mismatch (must include at least `County,State`).
+- Plugin configuration exceptions in logs (for example missing required
+  `heuristic_keywords` lists).
+- `Number of jurisdictions with extracted data: 0`.
+
+
+## Step 2: Classify the failure
+
+Use this decision tree for any defect:
+
+```
+Is the right document being retrieved?
+  └─ No → retrieval problem → fix query templates / heuristic_keywords
+  └─ Yes
+       Is the document text present in cleaned_text/?
+         └─ No → text extraction problem → check PDF quality / OCR
+         └─ Yes
+              Are the right features being extracted?
+                └─ No, wrong feature names → schema enum or description problem
+                └─ No, off-domain features → scope bleed → fix extraction_system_prompt
+                └─ Yes, but wrong values/units → schema description or units problem
+                └─ Yes, but nulls where values should be → schema IGNORE clause too broad
+```
+
+
+## Step 3: Fix the right axis
+
+### Retrieval problems (wrong or missing documents)
+
+Fix in plugin YAML:
+- Add more specific `query_templates` with legal code terms
+  (e.g., `"filetype:pdf {jurisdiction} generator zoning code"`).
+- Add target technology terms to `GOOD_TECH_KEYWORDS` and
+  `GOOD_TECH_PHRASES`.
+- Add adjacent-technology terms being confounded to `NOT_TECH_WORDS`.
+- Increase `website_keywords` score for the most discriminating terms.
+
+Required `heuristic_keywords` keys when present:
+- `GOOD_TECH_KEYWORDS`
+- `GOOD_TECH_PHRASES`
+- `GOOD_TECH_ACRONYMS`
+- `NOT_TECH_WORDS`
+
+### Scope bleed (off-domain features extracted)
+
+Fix in plugin YAML `extraction_system_prompt`:
+- State explicitly what is excluded (e.g., "Do not extract requirements for
+  residential portable generators").
+- Add the same language to `$instructions.scope` in the schema for
+  reinforcement.
+
+### Wrong values or units
+
+Fix in schema JSON, in the affected feature's `description`:
+- Add or sharpen the `VALUE` rule.
+- Expand the `UNITS` vocabulary list.
+- Add a `IGNORE` clause for the near-miss case.
+
+### Missing values (nulls where data exists)
+
+Fix in schema JSON:
+- Broaden the feature description to cover the phrasing used in source
+  documents.
+- Remove overly restrictive IGNORE clauses.
+- Check that the feature ID is spelled exactly as it appears in the enum.
+
+### Text extraction failures (blank cleaned_text)
+
+- Verify the PDF is readable (not scanned without OCR).
+- Add `from_ocr: true` to the doc entry in `known_local_docs`.
+- Set `pytesseract_exe_fp` in run config if OCR is needed.
+
+
+## Iteration hygiene
+
+- Use a **unique `out_dir`** per iteration run. COMPASS aborts early if the
+  output directory already contains results.
+- Keep the same small jurisdiction sample across all iterations until
+  quality gates pass.
+- Record what changed and why in a short comment in the config file or
+  a separate `CHANGELOG.md` in the example folder.
+- Save schema versions as `<tech>_schema_v2.json` etc. to
+  preserve a diff history. Point `schema:` in plugin YAML to the active
+  version.
+
+
+## Scale-up protocol
+
+Only advance to the next phase when the current phase passes all gates.
+
+| Phase | Jurisdictions | Gates |
+|---|---|---|
+| Smoke | 1–3 | Output rows exist; feature names match schema enum; section/summary present for most rows |
+| Robustness | 10–25 | Feature value types are stable; null rate is explainable; no scope bleed |
+| Production | Full national set | False positive/negative rates acceptable; repeated runs show minimal drift |
+
+When advancing, keep the same config files. Only change the jurisdictions CSV.
+
+
+## Diagnostic commands
+
+```bash
+# Check if cleaned text was produced
+ls outputs/*/cleaned_text/
+
+# Count output rows per jurisdiction
+wc -l outputs/*/jurisdiction_dbs/*.csv
+
+# Check for scope bleed — feature values that are off-domain
+grep -v "diesel\|generator\|backup\|emergency" outputs/ordinances.csv | head -20
+
+# View logs for a specific jurisdiction
+cat outputs/logs/San\ Diego*/run.log | grep -i "error\|warning\|found 0"
+```
+
+
+## Common failure modes
+
+| Symptom | Most likely cause | Fix axis |
+|---|---|---|
+| 0 documents for all jurisdictions | Credentials not loaded / search API down | Load `.env`; use `known_doc_urls` |
+| Downloaded PDFs are from wrong domain | `query_templates` too generic | Narrow queries with `filetype:pdf` and legal code terms |
+| `cleaned_text` present but no output CSV rows | Schema enum mismatch or extraction prompt failing | Check schema path in plugin YAML; verify `tech` value in run config |
+| Off-domain feature names in output | Scope bleed from large land-use code | Add exclusion language to `extraction_system_prompt` |
+| Correct features but wrong values | Feature description lacks VALUE rule | Add explicit VALUE rule to affected descriptions |
+| Setback in wrong units | UNITS rule missing or implicit | Add explicit UNITS vocabulary to description |
+| Null rows for features that are in the document | IGNORE clause too broad, or feature description doesn't match source phrasing | Broaden description; remove over-strict IGNORE clause |
+| Playwright timeout errors in logs | Website crawl phase browser failure | Non-fatal; COMPASS continues. Use `known_doc_urls` while iterating |
+
+
+## Acceptance criteria before promotion
+
+A technology is ready to promote from `examples/` to
+`compass/extraction/<tech>/` when all of the following are true on the
+robustness run (10–25 jurisdictions):
+
+- [ ] Output CSV rows conform to required schema contract.
+- [ ] Feature IDs are stable and match the schema enum exactly.
+- [ ] Most non-null rows include a useful `section` and `summary`.
+- [ ] Repeated runs on the same sample show minimal drift.
+- [ ] No scope bleed (off-domain features) is observed.
+- [ ] Null rate for common features is explainable (jurisdiction has no rule).
diff --git a/.github/skills/yaml-setup/SKILL.md b/.github/skills/plugin-config-setup/SKILL.md
similarity index 91%
rename from .github/skills/yaml-setup/SKILL.md
rename to .github/skills/plugin-config-setup/SKILL.md
index 1502085c..57ec663d 100644
--- a/.github/skills/yaml-setup/SKILL.md
+++ b/.github/skills/plugin-config-setup/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: yaml-setup
+name: plugin-config-setup
 description: Author and tune one-shot plugin YAML for COMPASS document discovery, filtering, and text collection. Use whenever a user asks to create, clean up, standardize, or troubleshoot one-shot plugin YAML for technology onboarding.
 ---
 
@@ -87,6 +87,20 @@ schema: ./my_schema.json
 
 ## Required `heuristic_keywords` shape
 
+When using `heuristic_keywords`, use these four lists to guide pre-LLM filtering:
+- `GOOD_TECH_KEYWORDS` — strong indicators of the target technology
+  (e.g., facility types, deployment modes). Documents matching even a
+  few keywords are marked as candidates.
+- `GOOD_TECH_PHRASES` — multi-word phrases that signal relevant
+  ordinance content. Keep specific to avoid false positives.
+- `GOOD_TECH_ACRONYMS` — industry-standard abbreviations for the
+  technology. Narrow list; include only widely recognized acronyms.
+- `NOT_TECH_WORDS` — pre-heuristic filter that rejects documents
+  before keyword matching. Use to exclude adjacent technologies and
+  irrelevant domains (e.g., residential HVAC, unrelated industries).
+  Runs first; prevents wasted keyword evaluation on clearly-wrong
+  documents.
+
 Use this exact structure when defining `heuristic_keywords`:
 
 ```yaml
@@ -215,8 +229,6 @@ values:
 }
 ```
 
-Do not hardcode model names in skills. Prompt the user to supply `name`.
-
 ## Acquisition phases
 
 COMPASS acquisition runs in two sequential phases per jurisdiction:
diff --git a/.github/skills/schema-creation/SKILL.md b/.github/skills/schema-creation/SKILL.md
index c4941bc1..08981132 100644
--- a/.github/skills/schema-creation/SKILL.md
+++ b/.github/skills/schema-creation/SKILL.md
@@ -119,7 +119,7 @@ Organize `$definitions` by these families:
 | Physical design | `screening requirement`, `enclosure requirement`, `exhaust stack height` |
 | Zoning | `primary use districts`, `conditional use districts`, `prohibited use districts` |
 | Permitting | `permit requirement`, `capacity threshold` |
-| Compliance | `decommissioning`, `enactment date` |
+| Compliance | `decommissioning` |
 
 ## `$instructions` block
 
@@ -129,7 +129,6 @@ Always include a `$instructions` object at the top level with these keys:
 "$instructions": {
   "scope": "Describe exactly what to extract and what to ignore.",
   "null_handling": "Output every enum feature. Use null value and null summary when a feature is not found in the document. Do not omit features.",
-  "one_row_per_feature": "Output exactly one row per feature. If multiple values apply, use the most restrictive and describe variants in summary.",
   "verbatim_quotes": "In summary fields, prefer verbatim quotes from the source. Enclose in double quotation marks.",
   "units_discipline": "Do not convert units. Record them exactly as they appear in the document."
 }
@@ -150,7 +149,7 @@ Do not expand the feature enum to absorb scope bleed. Narrow the prompt.
 
 ## Cross-technology adaptation checklist
 
-When cloning this schema for a new technology:
+When cloning a schema for a new technology:
 
 - [ ] Replace all feature IDs with technology-specific names.
 - [ ] Replace value/units rules in every feature description.

From 3288e2663d174cc4f15cf8f4d1c2d777ea509c3f Mon Sep 17 00:00:00 2001
From: Byron Pullutasig <115118857+bpulluta@users.noreply.github.com>
Date: Thu, 26 Mar 2026 16:35:41 -0600
Subject: [PATCH 7/7] updated skills Paul review march 26

---
 .github/skills/extraction-run/SKILL.md        |   1 -
 .github/skills/iteration-development/SKILL.md | 224 ------------------
 .github/skills/plugin-config-setup/SKILL.md   |  22 +-
 3 files changed, 12 insertions(+), 235 deletions(-)
 delete mode 100644 .github/skills/iteration-development/SKILL.md

diff --git a/.github/skills/extraction-run/SKILL.md b/.github/skills/extraction-run/SKILL.md
index ed41439d..c2fafa8b 100644
--- a/.github/skills/extraction-run/SKILL.md
+++ b/.github/skills/extraction-run/SKILL.md
@@ -15,7 +15,6 @@ then iterate quickly until you have stable structured outputs.
 ## When to use
 
 - Schema exists and plugin config points to it.
-- You are onboarding a new technology (diesel generator, geothermal, CHP, hydrogen).
 - You need a reliable smoke-test workflow before scaling.
 - You are NOT using decision-tree extraction.
 
diff --git a/.github/skills/iteration-development/SKILL.md b/.github/skills/iteration-development/SKILL.md
deleted file mode 100644
index 120bbaef..00000000
--- a/.github/skills/iteration-development/SKILL.md
+++ /dev/null
@@ -1,224 +0,0 @@
----
-name: iteration-development
-description: Run → inspect → fix cycle for one-shot extraction after initial setup. Use whenever a user asks to diagnose poor output, reduce scope bleed, improve precision/recall, or scale from smoke tests.
----
-
-# Iteration Development Skill
-
-Use this skill after you have a working schema, plugin YAML, and run config
-and want to improve extraction quality through systematic iteration.
-
-## When to use
-
-- First smoke run produced output that needs diagnosis or improvement.
-- Feature values or units are wrong, missing, or inconsistent.
-- Retrieval is returning off-target documents.
-- Scaling from 3 jurisdictions to 10–25 or full production.
-
-## Do not use
-
-- First-time setup before any successful smoke run.
-- Legacy decision-tree extraction development.
-
-## Expected assistant output
-
-When using this skill, return:
-
-1. The observed failure class (retrieval, extraction scope, value/units, or null handling).
-2. One concrete fix on a single axis.
-3. The re-run command and pass/fail gate check.
-
-## Canonical reference
-
-- `examples/one_shot_schema_extraction/` — working examples
-  to use as a baseline for comparing output quality.
-
-
-## The run → inspect → fix loop
-
-**Three Phases:** This skill guides you through three phases, all built into
-example plugin configurations in the `examples/` directory.
-
-Repeat this cycle once per iteration. Change exactly one axis per cycle.
-
-```
-Run → Inspect outputs → Identify failure → Fix one axis → Re-run same sample
-```
-
-**Never change multiple axes in the same iteration.** You will not know
-which change caused the result.
-
-**Phases encoded in plugin YAML comments:**
-
-- **Phase 1 (Initial):** Includes query templates, website keywords, and
-  basic heuristic filters to avoid obvious off-domain results.
-  **This is ready to run immediately.**
-- **Phase 2 (Optional Refinement):** Uncomment advanced heuristic tuning
-  if Phase 1 retrieval produces off-target documents.
-- **Phase 3 (Optional Refinement):** Uncomment extraction_system_prompt
-  if Phase 1-2 retrieval works but extracted features are wrong (scope bleed).
-
-Start with Phase 1. Only add Phase 2 / 3 if Phase 1 results need improvement.
-See README.rst for the progression path.
-
-
-## Step 1: Inspect output artifacts
-
-After each run, check these locations inside `out_dir`:
-
-| Artifact | What to look for |
-|---|---|
-| `ordinance_files/*.pdf` | Are these on-target documents? |
-| `cleaned_text/*.txt` | Does page text contain target technology language? |
-| `jurisdiction_dbs/*.csv` | Are feature rows present? Are values correct? |
-| `quantitative_ordinances.csv` and `qualitative_ordinances.csv` | Final compiled output — check feature coverage and null rate |
-| `logs/<jurisdiction>/*.log` | Error messages, 0-document warnings |
-
-Minimum passing state for a smoke run:
-- At least one `ordinance_files/` PDF per jurisdiction.
-- At least one `cleaned_text/` file per jurisdiction.
-- Compiled ordinance CSV outputs contain rows for most jurisdictions.
-
-Immediate fail conditions (fix before any tuning):
-- Jurisdiction CSV header mismatch (must include at least `County,State`).
-- Plugin configuration exceptions in logs (for example missing required
-  `heuristic_keywords` lists).
-- `Number of jurisdictions with extracted data: 0`.
-
-
-## Step 2: Classify the failure
-
-Use this decision tree for any defect:
-
-```
-Is the right document being retrieved?
-  └─ No → retrieval problem → fix query templates / heuristic_keywords
-  └─ Yes
-       Is the document text present in cleaned_text/?
-         └─ No → text extraction problem → check PDF quality / OCR
-         └─ Yes
-              Are the right features being extracted?
-                └─ No, wrong feature names → schema enum or description problem
-                └─ No, off-domain features → scope bleed → fix extraction_system_prompt
-                └─ Yes, but wrong values/units → schema description or units problem
-                └─ Yes, but nulls where values should be → schema IGNORE clause too broad
-```
-
-
-## Step 3: Fix the right axis
-
-### Retrieval problems (wrong or missing documents)
-
-Fix in plugin YAML:
-- Add more specific `query_templates` with legal code terms
-  (e.g., `"filetype:pdf {jurisdiction} generator zoning code"`).
-- Add target technology terms to `GOOD_TECH_KEYWORDS` and
-  `GOOD_TECH_PHRASES`.
-- Add adjacent-technology terms being confounded to `NOT_TECH_WORDS`.
-- Increase `website_keywords` score for the most discriminating terms.
-
-Required `heuristic_keywords` keys when present:
-- `GOOD_TECH_KEYWORDS`
-- `GOOD_TECH_PHRASES`
-- `GOOD_TECH_ACRONYMS`
-- `NOT_TECH_WORDS`
-
-### Scope bleed (off-domain features extracted)
-
-Fix in plugin YAML `extraction_system_prompt`:
-- State explicitly what is excluded (e.g., "Do not extract requirements for
-  residential portable generators").
-- Add the same language to `$instructions.scope` in the schema for
-  reinforcement.
-
-### Wrong values or units
-
-Fix in schema JSON, in the affected feature's `description`:
-- Add or sharpen the `VALUE` rule.
-- Expand the `UNITS` vocabulary list.
-- Add a `IGNORE` clause for the near-miss case.
-
-### Missing values (nulls where data exists)
-
-Fix in schema JSON:
-- Broaden the feature description to cover the phrasing used in source
-  documents.
-- Remove overly restrictive IGNORE clauses.
-- Check that the feature ID is spelled exactly as it appears in the enum.
-
-### Text extraction failures (blank cleaned_text)
-
-- Verify the PDF is readable (not scanned without OCR).
-- Add `from_ocr: true` to the doc entry in `known_local_docs`.
-- Set `pytesseract_exe_fp` in run config if OCR is needed.
-
-
-## Iteration hygiene
-
-- Use a **unique `out_dir`** per iteration run. COMPASS aborts early if the
-  output directory already contains results.
-- Keep the same small jurisdiction sample across all iterations until
-  quality gates pass.
-- Record what changed and why in a short comment in the config file or
-  a separate `CHANGELOG.md` in the example folder.
-- Save schema versions as `<tech>_schema_v2.json` etc. to
-  preserve a diff history. Point `schema:` in plugin YAML to the active
-  version.
-
-
-## Scale-up protocol
-
-Only advance to the next phase when the current phase passes all gates.
-
-| Phase | Jurisdictions | Gates |
-|---|---|---|
-| Smoke | 1–3 | Output rows exist; feature names match schema enum; section/summary present for most rows |
-| Robustness | 10–25 | Feature value types are stable; null rate is explainable; no scope bleed |
-| Production | Full national set | False positive/negative rates acceptable; repeated runs show minimal drift |
-
-When advancing, keep the same config files. Only change the jurisdictions CSV.
-
-
-## Diagnostic commands
-
-```bash
-# Check if cleaned text was produced
-ls outputs/*/cleaned_text/
-
-# Count output rows per jurisdiction
-wc -l outputs/*/jurisdiction_dbs/*.csv
-
-# Check for scope bleed — feature values that are off-domain
-grep -v "diesel\|generator\|backup\|emergency" outputs/ordinances.csv | head -20
-
-# View logs for a specific jurisdiction
-cat outputs/logs/San\ Diego*/run.log | grep -i "error\|warning\|found 0"
-```
-
-
-## Common failure modes
-
-| Symptom | Most likely cause | Fix axis |
-|---|---|---|
-| 0 documents for all jurisdictions | Credentials not loaded / search API down | Load `.env`; use `known_doc_urls` |
-| Downloaded PDFs are from wrong domain | `query_templates` too generic | Narrow queries with `filetype:pdf` and legal code terms |
-| `cleaned_text` present but no output CSV rows | Schema enum mismatch or extraction prompt failing | Check schema path in plugin YAML; verify `tech` value in run config |
-| Off-domain feature names in output | Scope bleed from large land-use code | Add exclusion language to `extraction_system_prompt` |
-| Correct features but wrong values | Feature description lacks VALUE rule | Add explicit VALUE rule to affected descriptions |
-| Setback in wrong units | UNITS rule missing or implicit | Add explicit UNITS vocabulary to description |
-| Null rows for features that are in the document | IGNORE clause too broad, or feature description doesn't match source phrasing | Broaden description; remove over-strict IGNORE clause |
-| Playwright timeout errors in logs | Website crawl phase browser failure | Non-fatal; COMPASS continues. Use `known_doc_urls` while iterating |
-
-
-## Acceptance criteria before promotion
-
-A technology is ready to promote from `examples/` to
-`compass/extraction/<tech>/` when all of the following are true on the
-robustness run (10–25 jurisdictions):
-
-- [ ] Output CSV rows conform to required schema contract.
-- [ ] Feature IDs are stable and match the schema enum exactly.
-- [ ] Most non-null rows include a useful `section` and `summary`.
-- [ ] Repeated runs on the same sample show minimal drift.
-- [ ] No scope bleed (off-domain features) is observed.
-- [ ] Null rate for common features is explainable (jurisdiction has no rule).
diff --git a/.github/skills/plugin-config-setup/SKILL.md b/.github/skills/plugin-config-setup/SKILL.md
index 57ec663d..0c83b5f9 100644
--- a/.github/skills/plugin-config-setup/SKILL.md
+++ b/.github/skills/plugin-config-setup/SKILL.md
@@ -73,17 +73,19 @@ schema: ./my_schema.json
 
 ## Key plugin YAML fields
 
-| Field | Type | Behavior |
+| Field | Type | Code Reference |
 |---|---|---|
-| `schema` | string (path) | **Required.** Path to JSON schema file, relative to plugin YAML location. |
-| `data_type_short_desc` | string | Short description used in LLM prompts (e.g. `utility-scale <tech> ordinance`). |
-| `query_templates` | list | Search query templates; `{jurisdiction}` is replaced at runtime. |
-| `website_keywords` | dict | Keyword → score map for URL ranking during website crawl. |
-| `heuristic_keywords` | dict or `true` | Pre-LLM text filter. If `true`, LLM generates lists from schema. |
-| `collection_prompts` | list or `true` | Text collection prompt(s). If **`true`**, LLM auto-generates from schema. |
-| `text_extraction_prompts` | list or `true` | Text consolidation prompt(s). If **`true`**, LLM auto-generates from schema. |
-| `extraction_system_prompt` | string | Overrides default LLM system prompt for the extraction step. Use this to scope extraction tightly to the target technology. |
-| `cache_llm_generated_content` | bool | Cache LLM-generated `query_templates`, `website_keywords`, and `heuristic_keywords`. Set to `false` when iterating schema to see live changes. |
+| `schema` | string (path) | [base.py#L124–L131](../../../compass/plugin/one_shot/base.py#L124) |
+| `data_type_short_desc` | string | [base.py#L483](../../../compass/plugin/one_shot/base.py#L483) |
+| `query_templates` | list | [base.py#L217–L240](../../../compass/plugin/one_shot/base.py#L217) |
+| `website_keywords` | dict | [base.py#L281–L338](../../../compass/plugin/one_shot/base.py#L281) |
+| `heuristic_keywords` | dict or `true` | [base.py#L340–L390](../../../compass/plugin/one_shot/base.py#L340); [base.py#L512](../../../compass/plugin/one_shot/base.py#L512) |
+| `collection_prompts` | list or `true` | [base.py#L413–L436](../../../compass/plugin/one_shot/base.py#L413) |
+| `text_extraction_prompts` | list or `true` | [base.py#L438–L468](../../../compass/plugin/one_shot/base.py#L438) |
+| `extraction_system_prompt` | string | [base.py#L476–L488](../../../compass/plugin/one_shot/base.py#L476) |
+| `cache_llm_generated_content` | bool | [base.py#L107–L117](../../../compass/plugin/one_shot/base.py#L107) |
+
+**For the complete list of configuration options (including `allow_multi_doc_extraction` and any future additions), consult the docstring of [`create_schema_based_one_shot_extraction_plugin()`](../../../compass/plugin/one_shot/base.py#L51).**
 
 ## Required `heuristic_keywords` shape