
v0.2 scope: canonical schema + audit loop

Status: proposed, not implemented. This is a reviewable spec, not a roadmap. Audience: the Claude team (for feedback on the schema shape and the audit contract) and contributors (as the target for implementation).

0. Positioning

This v0.2 is not an attempt to ship a competitor to Claude for PowerPoint or Claude for Excel. It is an attempt to propose a concrete canonical schema for IB slide primitives — the shape a reference implementation should have if any tool in the ecosystem wanted to adopt it.

The 5 templates in the canonical core cover roughly 80% of the day-to-day rebuild-frequency slides in a banker's workflow. The 4 handshake files (spec.json, model.provenance.json, deck.exceptions.json, audit.report.json) define the integration contract any tool could speak. Everything else in the repo is a reference extension — useful, but not part of the proposed canonical core.

The ask to the Claude team is not "please integrate with this plugin." The ask is: are these primitives good enough that Claude for Excel and Claude for PowerPoint could adopt them internally? If the answer is yes, this repo stops being a plugin and becomes a reference implementation of a schema Anthropic could ship. If the answer is no, the pushback tells the plugin what to fix.

1. Canonical core (5 templates)

The canonical core is the minimum set of templates that, if adopted as primitives, would cover the most-rebuilt slides in an analyst's day-to-day workflow.

| Canonical name | Purpose | Current renderer status |
| --- | --- | --- |
| financial_summary | Historical P&L summary with growth rates and margins | Ships in v0.1 as render_financial_summary |
| trading_comps | Public company trading multiples table | Ships in v0.1 as render_trading_comps |
| transaction_comps | Precedent M&A transaction multiples table | New in v0.2 |
| sensitivity | Two-axis sensitivity grid (WACC × TGR, EV × multiple, etc.) | Ships in v0.1 as render_sensitivity |
| operating_metrics | KPI trends over time (revenue, EBITDA, margin, etc.) | Ships in v0.1 under the alias render_dual_chart; canonically renamed in v0.2 |

The remaining 8 templates in the repo (cover, section_divider, toc, exec_summary, investment_highlights, stacked_bar_table, football_field, sources_uses) remain as reference extensions. They are useful and stable, but they are not being pitched as canonical primitives in v0.2.

Canonical rename: dual_chart → operating_metrics

The current render_dual_chart template is named after its visual output (two stacked bar charts side-by-side) but its actual purpose is to show operating KPIs over time. v0.2 renames it to operating_metrics because names should describe purpose, not appearance. The old name remains available as a deprecation alias for one version.

New template: transaction_comps

render_transaction_comps(spec) — renders a precedent M&A multiples table. Structurally similar to trading_comps, but with different column defaults (Date, Target, Acquirer, EV, EV/Revenue, EV/EBITDA, Premium %) and a highlight convention for the most-comparable deal.

2. Handshake files

Four JSON files form the integration contract. All four are designed so that tools outside this plugin — including future Claude surfaces — can read, write, and reason about them without depending on this plugin's code.

analyst_workspace/
├── ADUS_model_v7.xlsx                  # analyst's working model
├── ADUS_model_v7.provenance.json       # maps workbook ranges to canonical metrics
├── slides/
│   ├── financial_summary.spec.json     # renderer input + source_refs
│   ├── financial_summary.pptx          # rendered output
│   ├── trading_comps.spec.json
│   └── trading_comps.pptx
├── deck.exceptions.json                # reviewed intentional deviations
└── audit/
    └── 2026-04-09T14-22.report.json    # audit run output

2.1 *.spec.json — renderer input with source_refs

Extension of the current renderer spec with an optional source_refs field parallel to each row's values array.

{
  "template": "financial_summary",
  "slide_number": 6,
  "title": "Addus Has Delivered Consistent Revenue Growth with Expanding Profitability",
  "section_header": "Historical Financial Summary ($ in thousands)",
  "headers": ["Metric", "FY2023A", "FY2024A", "FY2025A", "3Y CAGR"],
  "source_workbook": "ADUS_model_v7.xlsx",
  "source_sheet_default": "Model",
  "rows": [
    {
      "label": "Revenue",
      "values": ["1,058,651", "1,154,599", "1,422,530", "14.4%"],
      "source_refs": ["D42", "E42", "F42", null],
      "style": "bold",
      "unit": "USD_thousands"
    },
    {
      "label": "EBITDA",
      "values": ["105,082", "116,221", "155,027", "21.4%"],
      "source_refs": [
        {"sheet": "Ratios", "cell": "D88"},
        {"sheet": "Ratios", "cell": "E88"},
        {"sheet": "Ratios", "cell": "F88"},
        null
      ],
      "style": "highlight",
      "unit": "USD_thousands"
    }
  ],
  "source": "Source: SEC EDGAR. ADUS FY2025 10-K filed February 24, 2026."
}

Rules:

  • source_refs[i] is parallel to values[i].
  • Each entry is one of: a string (cell only, uses source_sheet_default), an object (explicit {sheet, cell}), or null (value is derived, not sourced).
  • null means "this value is computed from others in the spec" — for example, a growth-rate column computed from two adjacent revenue cells. The audit validates these by recomputation, not by workbook lookup.
  • unit is one of: USD, USD_thousands, USD_millions, USD_billions, percent, multiple, count, ratio.
  • The renderer ignores source_refs, source_workbook, source_sheet_default, and unit — they're metadata for the audit layer, not for rendering.
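As a sketch of how an audit layer might consume these rules, the resolver below dispatches on the three entry shapes. The function name and the `lookup(sheet, cell)` callable (standing in for an openpyxl workbook opened with data_only=True) are illustrative, not part of the spec:

```python
def resolve_source_ref(ref, lookup, default_sheet):
    """Resolve one source_refs entry to a workbook value.

    `lookup(sheet, cell)` abstracts the workbook reader. Returns None
    for null refs: derived values are validated by recomputation, not
    by workbook lookup.
    """
    if ref is None:
        return None                            # derived value, no lookup
    if isinstance(ref, str):                   # "D42": cell-only form
        return lookup(default_sheet, ref)
    return lookup(ref["sheet"], ref["cell"])   # explicit {sheet, cell}
```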

2.2 *.provenance.json — workbook-to-canonical-metric map

Produced by /ib-import-excel. Maps workbook ranges to canonical metric names, so the audit can reason about "what is Revenue FY2024A in this workbook" without depending on cell addresses being memorized by the user.

{
  "workbook": "ADUS_model_v7.xlsx",
  "workbook_hash": "sha256:a7c3...",
  "generated_at": "2026-04-09T12:14:00Z",
  "last_recalc_detected": "2026-04-08T23:41:00Z",
  "period_axis": {
    "sheet": "Model",
    "range": "C4:H4",
    "values": ["FY2021A", "FY2022A", "FY2023A", "FY2024A", "FY2025A", "FY2026E"]
  },
  "metrics": [
    {
      "canonical": "Revenue",
      "sheet": "Model",
      "range": "C42:H42",
      "unit": "USD_thousands"
    },
    {
      "canonical": "EBITDA",
      "sheet": "Ratios",
      "range": "C88:H88",
      "unit": "USD_thousands"
    },
    {
      "canonical": "Gross_Profit",
      "sheet": "Model",
      "range": "C55:H55",
      "unit": "USD_thousands"
    }
  ]
}

The point: a spec can reference Model!E42 directly (low-level), OR it can reference Revenue.FY2024A (canonical). The provenance file is what makes the canonical form resolvable back to a workbook cell.

v0.2 supports the low-level form only. The canonical form is reserved for v0.3 once the mapping format has been stress-tested by real models.
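A minimal sketch of what that v0.3 resolution step could look like, assuming each metric range runs along a single row with one column per period_axis entry (helper name illustrative; multi-letter columns omitted for brevity):

```python
def resolve_canonical(metric, period, provenance):
    """Resolve a canonical reference like Revenue.FY2024A to {sheet, cell}.

    Assumes single-row metric ranges aligned with period_axis, and
    single-letter columns (A..Z) for illustration.
    """
    offset = provenance["period_axis"]["values"].index(period)
    entry = next(m for m in provenance["metrics"] if m["canonical"] == metric)
    start = entry["range"].split(":")[0]              # "C42:H42" -> "C42"
    col = "".join(c for c in start if c.isalpha())
    row = "".join(c for c in start if c.isdigit())
    return {"sheet": entry["sheet"], "cell": f"{chr(ord(col) + offset)}{row}"}
```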

2.3 deck.exceptions.json — reviewed intentional deviations

Durable record of discrepancies the analyst has reviewed and marked as intentional. Read by the audit command. Suppresses matched findings from FAIL to INFO. Mandatory from day one — without it, the audit becomes an anxiety machine that analysts disable.

{
  "deck": "ADUS_board_deck_v3",
  "exceptions": [
    {
      "id": "exc-2026-04-09-001",
      "slide_id": "financial_summary",
      "metric": "EBITDA",
      "period": "FY2025A",
      "expected_value": 155027,
      "actual_value": 148514,
      "reason": "Excluded 6,513 in one-time transaction costs per management adjustment",
      "reviewer": "jchoi@firm.com",
      "reviewed_at": "2026-04-09T14:30:00Z",
      "expires_at": "2026-06-30T00:00:00Z"
    }
  ]
}

Rules:

  • Each exception is matched on {slide_id, metric, period}. If those three match a FAIL finding, the finding is downgraded to INFO and tagged with the exception id.
  • expires_at is optional. If present and in the past, the exception is ignored and the underlying finding fires again — this prevents stale exceptions from hiding real regressions on the next quarterly refresh.
  • reason is free-text but required. The audit report surfaces it verbatim.
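The matching and downgrade rules above can be sketched as follows. The helper name is hypothetical, and ISO 8601 UTC timestamps are compared lexically, which is safe only because all timestamps in these files share one format:

```python
def apply_exception(finding, exceptions, now_iso):
    """Downgrade a FAIL finding to INFO when a live exception matches.

    Matching is on the {slide_id, metric, period} triple only; expired
    exceptions are skipped so the underlying finding fires again.
    """
    for exc in exceptions:
        expires = exc.get("expires_at")
        if expires is not None and expires < now_iso:
            continue  # expired: ignored (check 10 emits a WARN for this)
        if (finding["slide_id"], finding["metric"], finding["period"]) == (
            exc["slide_id"], exc["metric"], exc["period"]
        ):
            return {**finding, "severity": "INFO", "exception_matched": exc["id"]}
    return finding
```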

2.4 audit.report.json — machine-readable audit output

Structured output from /ib-audit. Designed so that other tools (a CI job, a review dashboard, a Slack bot) can consume audit results without parsing human-readable text.

{
  "run_id": "2026-04-09T14-22-00Z-a7c3",
  "generated_at": "2026-04-09T14:22:00Z",
  "workbook": "ADUS_model_v7.xlsx",
  "workbook_last_recalc": "2026-04-08T23:41:00Z",
  "workbook_recalc_staleness_hours": 14.7,
  "specs": ["slides/financial_summary.spec.json", "slides/trading_comps.spec.json"],
  "exceptions_file": "deck.exceptions.json",
  "summary": {
    "pass": 48,
    "fail": 2,
    "warn": 3,
    "info": 1
  },
  "findings": [
    {
      "severity": "FAIL",
      "check": "value_mismatch",
      "slide_id": "financial_summary",
      "metric": "EBITDA",
      "period": "FY2025A",
      "expected": 155027,
      "actual": 153027,
      "source_ref": {"workbook": "ADUS_model_v7.xlsx", "sheet": "Ratios", "cell": "F88"},
      "message": "Spec value 155,027 does not match workbook value 153,027",
      "exception_matched": null
    },
    {
      "severity": "WARN",
      "check": "workbook_recalc_staleness",
      "message": "Workbook last recalculated 14.7 hours ago; audit may be comparing against stale cached values"
    },
    {
      "severity": "INFO",
      "check": "value_mismatch",
      "slide_id": "financial_summary",
      "metric": "EBITDA",
      "period": "FY2024A",
      "expected": 116221,
      "actual": 109708,
      "exception_matched": "exc-2026-04-09-001",
      "message": "Downgraded from FAIL per reviewed exception: Excluded one-time transaction costs"
    }
  ]
}
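As an illustration of that consumption path, a minimal consumer sketch (the function name is invented here): a CI job or bot needs only the summary block plus the FAIL findings not matched to an exception.

```python
import json

def summarize_report(path):
    """Read an audit.report.json; return (summary line, blocking findings)."""
    with open(path) as fh:
        report = json.load(fh)
    s = report["summary"]
    line = f"PASS: {s['pass']} WARN: {s['warn']} FAIL: {s['fail']} INFO: {s['info']}"
    blocking = [f for f in report["findings"]
                if f["severity"] == "FAIL" and not f.get("exception_matched")]
    return line, blocking
```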

3. source_refs schema

Formal schema for the source_refs field in a spec.

// source_refs[i] — one of:

// (a) null — value is computed from others in the spec
null

// (b) string — cell-only reference, uses source_sheet_default
"D42"

// (c) object — explicit sheet + cell
{"sheet": "Ratios", "cell": "F88"}

// (d) object — range reference for values that represent a sum
{"sheet": "Model", "range": "D42:D45", "aggregator": "sum"}

Aggregators supported in v0.2: sum, mean, median, max, min.

Required fields per type:

  • null: no fields
  • string: just the cell address (^[A-Z]+[0-9]+$)
  • object (cell): sheet + cell
  • object (range): sheet + range + aggregator

Forbidden: formula references. The audit compares evaluated values, never chases formula trees. If a spec needs a computed value, the computation lives in the spec (as a derived row with source_refs: null), not in the audit engine.
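A sketch of the validation this schema implies, using the cell-address pattern and aggregator list above (helper names are illustrative):

```python
import re
from statistics import mean, median

CELL_RE = re.compile(r"^[A-Z]+[0-9]+$")   # the cell-address pattern above
AGGREGATORS = {"sum": sum, "mean": mean, "median": median, "max": max, "min": min}

def classify_source_ref(ref):
    """Return the schema variant of one source_refs entry, or raise.

    Rejects formula strings as a side effect: they fail the cell-address
    pattern, which is the behavior the "forbidden" rule requires.
    """
    if ref is None:
        return "null"
    if isinstance(ref, str):
        if not CELL_RE.match(ref):
            raise ValueError(f"bad cell address: {ref!r}")
        return "cell"
    if "cell" in ref:
        return "sheet_cell"
    if "range" in ref:
        if ref.get("aggregator") not in AGGREGATORS:
            raise ValueError(f"unsupported aggregator: {ref.get('aggregator')!r}")
        return "range"
    raise ValueError(f"unrecognized source_ref shape: {ref!r}")
```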

4. import_config.json schema

Produced by /ib-import-excel. Functionally equivalent to provenance.json but written in the workspace rather than alongside the workbook. Either can be used interchangeably by /ib-audit.

{
  "workbook": "ADUS_model_v7.xlsx",
  "workbook_path": "../models/ADUS_model_v7.xlsx",
  "workbook_hash": "sha256:a7c3...",
  "period_axis": {
    "sheet": "Model",
    "range": "C4:H4"
  },
  "metric_mappings": [
    {
      "canonical": "Revenue",
      "sheet": "Model",
      "range": "C42:H42",
      "unit": "USD_thousands",
      "confirmed_by_user": true
    }
  ],
  "mapping_version": 1
}

Reuse semantics: If the import command is re-run against the same workbook (same workbook_hash), it reuses the existing config without re-prompting. If the hash differs, it re-prompts for any ranges that no longer evaluate to plausible values (e.g., the user added a new period column).
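The reuse check can be sketched as follows, assuming the sha256:&lt;hex&gt; hash form shown in the example (function names illustrative; the per-range plausibility re-prompt on hash mismatch is not shown):

```python
import hashlib
import json
from pathlib import Path

def workbook_hash(path):
    """Hash of the workbook bytes, in the sha256:<hex> form used above."""
    return "sha256:" + hashlib.sha256(Path(path).read_bytes()).hexdigest()

def can_reuse_config(config_path, workbook_path):
    """Reuse a saved import config only when the workbook is unchanged."""
    config_path = Path(config_path)
    if not config_path.exists():
        return False
    config = json.loads(config_path.read_text())
    return config.get("workbook_hash") == workbook_hash(workbook_path)
```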

5. Command contracts

Three commands. Not four. /ib-refresh is explicitly out of scope for v0.2 — it requires state management that's premature before the audit contract is validated against real models.

5.1 /ib-import-excel WORKBOOK.xlsx [--output CONFIG.json]

Inputs:

  • Required: path to a .xlsx file
  • Optional: --output path for the import_config.json (defaults to <workbook_stem>.provenance.json next to the workbook)

Behavior:

  • Opens the workbook in data_only mode (reads last-saved calculated values)
  • Walks the analyst through identifying canonical metric ranges via an interactive Claude-assisted prompt loop. The user confirms each mapping before it's committed.
  • Writes the provenance file
  • Prints a summary of what was mapped and flags obvious gaps (e.g., "You mapped Revenue but not EBITDA — is EBITDA in this workbook?")

Outputs:

  • A valid provenance.json file
  • Exit 0 on success, exit 1 on user abort, exit 2 on invalid workbook

Non-goals:

  • Does not render slides
  • Does not run the audit
  • Does not attempt to auto-infer mappings without user confirmation
  • Does not modify the workbook

5.2 /ib-render SPEC.json [--output OUT.pptx]

Inputs:

  • Required: path to a canonical spec JSON
  • Optional: --output path for the .pptx (defaults to <spec_stem>.pptx)

Behavior:

  • Validates the spec against the canonical schema for the named template
  • Calls IBRenderer.render_<template>(spec)
  • Writes a standalone .pptx file

Outputs:

  • A single .pptx file with one slide (if the spec is a single-slide template)
  • Exit 0 on success, exit 1 on validation failure, exit 2 on render failure

Non-goals:

  • Does not import from Excel
  • Does not run the audit
  • Does not modify working decks in place (standalone output only)
  • Does not insert into an open PowerPoint via Office JS

5.3 /ib-audit SPEC.json WORKBOOK.xlsx [--exceptions EXC.json] [--output REPORT.json]

Inputs:

  • Required: path to a spec file (or a directory of spec files)
  • Required: path to the workbook to audit against
  • Optional: --exceptions path to a deck.exceptions.json file
  • Optional: --output path for the audit report (defaults to audit/<timestamp>.report.json)

Behavior:

  • For each spec, for each value with a non-null source_ref, reads the corresponding workbook cell in data_only mode
  • Compares cached value to spec value; emits a finding per check
  • Downgrades matching findings to INFO per deck.exceptions.json
  • Records workbook_last_recalc and emits WARN if > 24 hours stale
  • Runs cross-slide consistency checks across all provided specs
  • Writes audit.report.json and prints a human-readable summary to stdout

Outputs:

  • A valid audit.report.json file
  • stdout: PASS: 48 WARN: 3 FAIL: 2 INFO: 1
  • Exit 0 if FAIL count is 0, exit 1 if FAIL count > 0

Non-goals:

  • Does not modify the workbook or the spec
  • Does not attempt to "fix" discrepancies
  • Does not parse arbitrary existing .pptx files (only audits the spec against the workbook; it assumes the spec was the input to the last render)

6. First 10 audit checks

This is the minimum set of checks that must pass for v0.2 to be considered complete, ordered from highest to lowest operational value.

| # | Check | Severity | Description |
| --- | --- | --- | --- |
| 1 | value_mismatch | FAIL | For each spec value with a non-null source_ref, the workbook cell value matches the spec value within the unit's tolerance (exact for integers, ±0.01 for percents, ±$1 for currency). |
| 2 | unit_mismatch | FAIL | Spec unit declaration matches the workbook cell's Excel number format (e.g., spec says USD_thousands but cell is formatted as $#,##0 with values in millions). |
| 3 | sign_mismatch | FAIL | Sign of the workbook value matches sign of the spec value. Catches "expense displayed as positive in spec but negative in model." |
| 4 | subtotal_tie_out | FAIL | For each row flagged as a subtotal in the spec, the sum of its constituent rows (as sourced from the workbook) equals the subtotal value. |
| 5 | cross_slide_consistency | FAIL | Same {metric, period} appearing on multiple slides has the same value across all slides. Catches the "EBITDA shown as 155,027 on slide 6 but 155,207 on slide 12" class of error. |
| 6 | missing_source_ref | WARN | Spec value has no source_ref and is not a computed derivation. Cannot be audited. Emits WARN not FAIL to allow partial adoption. |
| 7 | workbook_recalc_staleness | WARN | Workbook's last-saved recalc timestamp is > 24 hours old. The audit is comparing against cached values; warn loudly. |
| 8 | required_source_line | FAIL | Every spec has a non-empty source field (the citation line at the bottom of the slide). |
| 9 | exception_matched | INFO | A FAIL finding matches an entry in deck.exceptions.json; downgraded to INFO and tagged with the exception id and reason. |
| 10 | exception_expired | WARN | An entry in deck.exceptions.json has an expires_at in the past. The exception is ignored (underlying finding fires again) and this warning alerts the analyst to re-review. |

Checks 1–5 cover the "did the slide drift from the model" question. Check 6 makes partial adoption possible (you don't have to source_ref every cell on day one). Check 7 addresses the values-vs-formulas gotcha honestly. Checks 8–10 make the exception workflow credible.
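Check 1's tolerance rule can be read as the following sketch. Note that the ±$1 currency tolerance is interpreted here as ±1 in the unit's own scale, which is an assumption the spec may want to pin down explicitly:

```python
def within_tolerance(expected, actual, unit):
    """One reading of check 1's tolerance table: exact match for
    integer-like units, ±0.01 for percents, ±1 (in the unit's own
    scale, an assumption) for currency units.
    """
    if unit == "percent":
        return abs(expected - actual) <= 0.01
    if unit.startswith("USD"):
        return abs(expected - actual) <= 1
    return expected == actual
```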

7. Acceptance criteria

Each command has explicit, testable criteria. v0.2 ships when all of these pass against a test fixture containing a deliberately-modified ADUS model.

7.1 /ib-import-excel

  • Runs on the test fixture ADUS_model_test.xlsx without crashing
  • Prompts the user to confirm each mapping (never auto-commits)
  • Writes a valid provenance.json that passes JSON schema validation
  • Second run on the same (unchanged) workbook reuses the saved config without re-prompting
  • Second run on a modified workbook (hash mismatch) re-prompts only for ranges whose values are no longer plausible
  • Handles "workbook file deleted between runs" with exit code 2 and a clear error

7.2 /ib-render

  • Ships 5 canonical core templates: financial_summary, trading_comps, transaction_comps, sensitivity, operating_metrics
  • render_transaction_comps is implemented and has an example spec in reference/examples/transaction_comps.json
  • operating_metrics is callable as both render_operating_metrics (canonical) and render_dual_chart (deprecation alias, prints a one-line warning to stderr)
  • All 5 canonical core templates pass the existing determinism test (tests/test_determinism.py)
  • A spec with source_refs renders identically to a spec without them (the renderer ignores the field)

7.3 /ib-audit

  • Emits FAIL for deliberately-modified values in the test fixture (≥5 modifications across ≥3 slides)
  • Emits PASS for unchanged values
  • Respects deck.exceptions.json and downgrades matched findings to INFO
  • Records workbook_last_recalc and emits WARN if the fixture's recalc is > 24h stale
  • Cross-slide consistency check catches a metric that differs between two spec files in the fixture
  • Output JSON validates against the audit.report.json schema
  • Exit code is 0 when FAIL count is 0, 1 otherwise
  • Total runtime on the 5-slide fixture is < 3 seconds

7.4 Documentation

  • Each handshake file has a formal JSON schema file in docs/schemas/
  • README.md has a v0.2 section with a minimal end-to-end example
  • This scope doc (docs/v0.2-scope.md) is updated as implementation diverges

8. Non-goals

Explicit non-goals, so reviewers know what v0.2 is not trying to do.

  • Not a replacement for Macabacus, UpSlide, or any Office COM add-in. This plugin is out-of-app on purpose. Its advantage is that it is Mac-native and IT-approval-free, not feature parity.
  • Not a live-linked Excel-to-PowerPoint system. No live connection. The audit runs on demand against a saved workbook file.
  • Not an autonomous deck generator. The /ib-deck TICKER vision is explicitly deferred. v0.2 is about supporting analysts who already have a model and want repeatable slides tied to it.
  • Does not modify the workbook. Read-only. Ever.
  • Does not modify working decks in place. Standalone .pptx output only. Analysts paste slides into their working deck manually; this is the correct UX, not a limitation.
  • Does not implement Office JS or any in-app integration. Out-of-app is the moat; going in-app would lose it.
  • Does not parse arbitrary existing .pptx files. The audit assumes the spec is the last known input to the render. If the analyst hand-edited the PPTX after render, that's what deck.exceptions.json is for.
  • Does not handle formulas. Last-saved cached values only. The workbook_recalc_staleness warning is how this limitation is surfaced to the user.
  • Does not attempt to auto-infer canonical metric mappings. Every mapping in provenance.json is user-confirmed. Auto-inference is a v0.3 question once there are enough confirmed mappings to train against.
  • Does not ship all 13 templates as canonical. Only the 5 in the canonical core are being proposed as primitives. The other 8 remain as reference extensions.

9. Known limitations and honest caveats

Things that will ship in v0.2 with known rough edges. Named explicitly so they're not hidden under the marketing.

  • Stale-recalc false negatives. If the analyst modifies a formula but doesn't save with recalculation, the workbook_last_recalc timestamp won't necessarily catch it. The audit will compare against stale cached values and emit PASS when a careful human would emit FAIL. The workbook_hash in provenance.json provides a partial defense — hash mismatch between import and audit runs warns about any workbook change — but it's not a substitute for a real recalc-on-load solution.
  • Unit inference is lossy. Excel number formats are weakly typed. A cell formatted as $#,##0 might contain thousands, millions, or raw dollars depending on how the analyst built the model. v0.2 asks the user to confirm the unit explicitly during /ib-import-excel rather than guessing.
  • Cross-slide consistency is by metric name match. If the analyst uses "EBITDA" in one spec and "Adj. EBITDA" in another, the check will treat them as different metrics and miss the inconsistency. v0.2 does not ship a metric-alias table.
  • No support for multi-workbook audits in v0.2. The audit command takes exactly one workbook. Real decks sometimes pull from multiple models (ops model, transaction model, capital structure); handling this is a v0.3 problem.
  • No CI integration. The audit command returns a non-zero exit code on FAIL, which makes CI wiring mechanically possible, but no GitHub Actions workflow example ships in v0.2.

10. Open questions for the Claude team

Things the plugin author would want feedback on before committing to implementation.

  1. Is source_refs as a parallel array the right shape? The alternative is inline objects per value. Parallel array is more compact; inline is more robust to reordering. v0.2 picks parallel. Pushback welcome.
  2. Should provenance.json live next to the workbook or in the workspace? The plugin uses "next to the workbook" by default. But that assumes the analyst has write permission to the workbook's directory, which isn't always true (shared SharePoint folders). A workspace-local default might be safer.
  3. Is "reviewed exception expires" the right default? v0.2 makes expires_at optional. A stricter stance is "all exceptions expire after 30 days unless explicitly marked permanent." That would be more defensible but more annoying. Feedback welcome.
  4. Should the canonical core be 5 templates or 7? Adding football_field and sources_uses would cover M&A sell-side pitches better but dilute the "day-to-day rebuild frequency" principle. v0.2 picks 5. If the Claude team prefers coverage over focus, this is cheap to change.
  5. Who owns the canonical schema long-term? If the answer is "this plugin repo," the proposal is a plugin. If the answer is "Anthropic," it becomes a published primitive other tools can rely on. The plugin would rather be the reference implementation of an Anthropic-owned primitive than the sole owner of its own standard.

Appendix A: Reference to the v0.1 comparison artifact

The comparison/ folder in this repo (as of commit a9e108a) establishes the narrow claim that a canonical spec + renderer can produce an IB slide with 0 lines of spatial code from the caller, compared to 166 lines for a good-faith prompt-guided implementation of the same slide from the same prompt.

That artifact is the why for v0.2. This doc is the what. Neither is the implementation.

Appendix B: What would make this proposal fail

Named honestly so the reviewer knows what to look for.

  • The schema is too IB-specific. If the canonical core only composes for investment banking slides, it's not a Claude primitive — it's a vertical. Counter: the 5 templates are domain-specific, but the shape (spec + refs + exceptions + audit report) generalizes to any domain where slides need to tie back to a source of truth (medical, legal, research, consulting).
  • The audit contract is too heavyweight. If analysts find four JSON files too much bookkeeping for one deck, the workflow won't stick. Counter: only spec.json is required to use the renderer; the other three become useful only when the analyst wants the audit loop, and they're all optional.
  • The values-vs-formulas honesty caveat is fatal. If stale-recalc false negatives turn out to be common in real workflows, the audit's PASS signal is untrustworthy and the whole moat disappears. Counter: workbook_hash detection + recalc warnings are a partial defense, and a real recalc-on-load solution (via libreoffice --headless or equivalent) is a known v0.3 improvement.