diff --git a/pr-approval-analysis/.gitignore b/pr-approval-analysis/.gitignore
new file mode 100644
index 000000000000..82380b07eff7
--- /dev/null
+++ b/pr-approval-analysis/.gitignore
@@ -0,0 +1,9 @@
+# Generated data / artifacts from the fetch + analysis scripts
+prs.jsonl
+teams.json
+*.log
+cursor.txt
+old_commit.txt
+codeowners_old.txt
+codeowners_new.txt
+__pycache__/
diff --git a/pr-approval-analysis/README.md b/pr-approval-analysis/README.md
new file mode 100644
index 000000000000..57d901283001
--- /dev/null
+++ b/pr-approval-analysis/README.md
@@ -0,0 +1,64 @@
+# PR approval waiting-time analysis (`dfinity/ic`)
+
+Quantifies how long pull requests wait for **code-review approval**, to assess
+whether review is a productivity bottleneck — broken down by PRs that only need
+approval from the author's own team ("same-team") vs. PRs that also need other
+teams ("cross-team").
+
+See **[REPORT.md](REPORT.md)** for the findings, methodology, and limitations.
+
+## TL;DR
+
+- For the *median* PR, approval is fast (first approval ~1.5 h, full approval
+  ~4.5 h over the last 12 months) — review is not a severe bottleneck at the
+  center of the distribution.
+- The cost is in the **tail** (overall full-approval p90 ≈ 5 days) and in
+  **cross-team PRs** (full-approval median 8.3 h vs 3.4 h same-team, with a
+  heavier tail). Coordinating approvals across teams is the main drag.
+
+## Files
+
+| File | Purpose |
+|---|---|
+| `REPORT.md` | The generated report (12- and 6-month windows). |
+| `fetch_teams.py` | Fetch current membership of the CODEOWNERS umbrella teams → `teams.json`. |
+| `fetch_prs.py` | Fetch all PRs in the last 365 days (reviews + review-request timeline) → `prs.jsonl`. Resumable (cursor checkpoint + dedupe). |
+| `build_team_mapping.py` | Derive a stable fine-grained-team → umbrella-team mapping from CODEOWNERS git history → `team_mapping.json`. |
+| `analyze.py` | Compute approval waiting-time stats + classification → `REPORT.md`. |
+| `team_mapping.json` | The derived team mapping (checked in for reference). |
+
+## Reproduce
+
+Requires `gh` authenticated with `read:org` + `repo` scopes (or a PAT — see note
+below) and Python 3.
+
+```sh
+cd pr-approval-analysis
+
+# 1. Team memberships (current snapshot)
+python3 fetch_teams.py
+
+# 2. All PRs in the last 12 months (writes prs.jsonl; resumable)
+python3 fetch_prs.py
+
+# 3. Build the historical fine->umbrella team mapping
+#    (uses git history of .github/CODEOWNERS)
+git -C .. show "$(git -C .. rev-list -1 --before=2025-06-18 master):.github/CODEOWNERS" > codeowners_old.txt
+cp ../.github/CODEOWNERS codeowners_new.txt
+python3 build_team_mapping.py
+
+# 4. Analyze + regenerate REPORT.md
+python3 analyze.py
+```
+
+> **Note on tokens:** if the same GitHub account runs `gh auth login` on another
+> machine it will revoke this machine's token mid-run. `fetch_prs.py` is
+> resumable and can read an independent token from `~/.gh_pat` (a classic PAT
+> with `repo` + `read:org`, SSO-authorized for `dfinity`) or `GH_TOKEN`.
+
+## Caveats
+
+Approval elapsed time overlaps with the author iterating and is an upper bound on
+"lost" time, not pure idle time. Team membership history is not available from the
+API, so home teams are derived from a year of review-request data. See the
+*Limitations & caveats* section of `REPORT.md` for the full list.
diff --git a/pr-approval-analysis/REPORT.md b/pr-approval-analysis/REPORT.md
new file mode 100644
index 000000000000..89c877a43406
--- /dev/null
+++ b/pr-approval-analysis/REPORT.md
@@ -0,0 +1,207 @@
+# dfinity/ic — PR approval waiting-time analysis
+
+_Generated 2026-06-18 13:12 UTC._
+
+This report quantifies how long pull requests in `dfinity/ic` wait for code-review **approval**, to assess whether review is a productivity bottleneck. Results are split by whether a PR only needs approval from the author's **own team** or also from **other teams**.
+
+## Executive summary
+
+Over the last 12 months, **3,953** human-authored PRs passed the filters. Of those that received review, the **typical (median) PR waited 1.5 h for a first approval and 4.5 h for full approval** (all required teams). So for the median PR, approval is reasonably fast — review is *not* a severe bottleneck at the center of the distribution.
+
+**The cost is in the tail and in cross-team PRs.** Key findings:
+
+- **Cross-team PRs are the bottleneck.** They reach *full* approval in a median of **8.3 h** vs **3.4 h** for same-team PRs, and their tail is far heavier (p90 **6.1 d** vs **4.5 d**). Each extra team that must approve adds serial waiting.
+- **Cross-team PRs get a *first* look fastest** (median 38 min vs 2.2 h for same-team) — more requested reviewers means someone responds quickly — but converting that into *all* required approvals is what drags.
+- **A heavy tail affects every category.** Overall, full approval takes p90 **5.2 d** and p95 **8.1 d**. Roughly 10–20% of PRs wait multiple days for approval; this is where engineer time is actually lost.
+- **Weekends inflate the wait.** Counting only Mon–Fri shaves the tail materially (overall full-approval p90 drops from 5.2 d wall-clock to 3.4 d business-time).
+- **Most PRs are single-team.** 3,032 of 3,953 human PRs need only one team; 540 need two and 285 need three or more.
+- **Recent trend is slightly better.** The last 6 months are modestly faster than the full year (see below), so review latency is not worsening.
+
+**Bottom line:** code review is responsive for the median change, but **cross-team approval coordination and a heavy multi-day tail** are the real drags on throughput. If frontier models make writing code cheaper, the relative cost of these approval waits — especially for PRs spanning multiple teams — will dominate cycle time.
+
+## Methodology
+
+- **Source:** all PRs created in the last 12 months (2025-06-18 → 2026-06-18), pulled from the GitHub GraphQL API (reviews + review-request timeline events). 4,885 PRs fetched before filtering.
+- **Filters applied:** draft PRs are excluded; open PRs not updated in the last 30 days are excluded (treated as abandoned). Bot-authored PRs are reported separately and excluded from timing stats.
+- **Ready time** = when the PR first became reviewable (creation time, or the ready-for-review event if it was opened as a draft).
+- **Approval** = a human `APPROVED` review (bot/automation reviews ignored).
+- **Time to first approval** = ready → first approval. **Time to full approval** = ready → last approval at/before merge (proxy for 'all required approvals obtained'). PRs that never received a human approval are excluded from these two metrics (so very fast trivial merges by owners are not counted).
+- **Mon–Fri** rows count only weekday time, removing weekends from the wait (approximate 'business time', UTC; ignores public holidays and time-of-day).
+- **Team gating:** GitHub requests review from the CODEOWNERS teams that own the changed files. A PR is **same-team** if every requested team maps to the author's own umbrella area, otherwise **cross-team**.
+- **Team consolidation handled.** CODEOWNERS was reorganized during the window (61 changes); most notably (#10114, 2026-05-07) the fine-grained teams `consensus`, `execution`, `team-dsm`, `ic-message-routing-owners`, `ic-interface-owners`, `crypto-team`, `pocket-ic` were consolidated into the `core-protocol` umbrella, and `defi-team`→`defi`, `boundary-node`→`node`, etc. To measure same/cross-team **consistently over time**, every team name ever requested is mapped to a stable umbrella (`core-protocol`, `defi`, `node`, `idx`, `dre`, `governance-team`, `sdk`, `infrasec`, `product-security`, `ic-owners-owners`). The mapping is derived by resolving each historical CODEOWNERS path through the current file.
+- **Author home team** is derived empirically from the umbrella most often requested on that author's single-team PRs. This reflects where an author actually contributes (agreeing with org-team membership for 18/25 code-owners; the rest are people who mostly contribute outside their nominal team or have very few PRs).
+
+## Last 12 months
+
+PRs created since **2025-06-18** that pass the filters: **4263** total — 3953 human-authored, 310 bot-authored.
+
+Human PR state: 3768 merged, 27 open, 158 closed.
+
+Classification (human): **2488 same-team**, **1369 cross-team**, 96 no-team-request, 0 unknown-home.
+
+Reviewing teams per PR: 3032 need 1 team, 540 need 2, 285 need 3+ (96 had no team request).
+
+### Same-team vs cross-team at a glance
+
+| Metric | Same-team | Cross-team |
+|---|---:|---:|
+| PRs with an approval | 2346 | 1287 |
+| First approval — median | 2.2 h | 38 min |
+| Full approval — median | 3.4 h | 8.3 h |
+| Full approval — p90 | 4.5 d | 6.1 d |
+| Full approval — p95 | 7.2 d | 9.1 d |
+
+### Overall
+
+**All human PRs with an approval** (n with an approval = 3726)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 3726 | 1.5 h | 32.7 h | 11 min | 17.9 h | 3.3 d | 5.8 d |
+| Time to first approval (Mon–Fri) | 3726 | 1.4 h | 23.0 h | 11 min | 16.2 h | 46.7 h | 3.9 d |
+| Time to full approval (wall-clock) | 3726 | 4.5 h | 2.1 d | 28 min | 37.7 h | 5.2 d | 8.1 d |
+| Time to full approval (Mon–Fri) | 3726 | 4.3 h | 36.4 h | 27 min | 25.2 h | 3.4 d | 6.0 d |
+| Time to merge (wall-clock) | 3671 | 17.3 h | 3.0 d | 1.8 h | 2.8 d | 6.9 d | 12.3 d |
+
+| First-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 45% | 61% | 67% | 80% | 85% | 89% | 96% |
+
+### Same-team PRs
+
+**Same-team PRs** (n with an approval = 2346)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 2346 | 2.2 h | 38.3 h | 19 min | 20.9 h | 3.7 d | 6.1 d |
+| Time to first approval (Mon–Fri) | 2346 | 2.2 h | 26.7 h | 18 min | 18.4 h | 2.1 d | 4.1 d |
+| Time to full approval (wall-clock) | 2346 | 3.4 h | 2.1 d | 26 min | 27.4 h | 4.5 d | 7.2 d |
+| Time to full approval (Mon–Fri) | 2346 | 3.3 h | 34.8 h | 26 min | 22.1 h | 2.9 d | 5.2 d |
+| Time to merge (wall-clock) | 2324 | 16.5 h | 3.1 d | 1.9 h | 2.8 d | 6.2 d | 12.1 d |
+
+| First-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 39% | 56% | 63% | 78% | 83% | 88% | 96% |
+
+### Cross-team PRs
+
+**Cross-team PRs** (n with an approval = 1287)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 1287 | 38 min | 20.5 h | 5 min | 8.7 h | 2.1 d | 4.0 d |
+| Time to first approval (Mon–Fri) | 1287 | 36 min | 14.6 h | 5 min | 7.7 h | 36.1 h | 2.8 d |
+| Time to full approval (wall-clock) | 1287 | 8.3 h | 2.2 d | 33 min | 2.0 d | 6.1 d | 9.1 d |
+| Time to full approval (Mon–Fri) | 1287 | 7.7 h | 38.4 h | 32 min | 34.7 h | 4.1 d | 6.5 d |
+| Time to merge (wall-clock) | 1256 | 18.1 h | 2.9 d | 1.5 h | 3.0 d | 7.0 d | 12.1 d |
+
+| Full-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 32% | 45% | 50% | 67% | 75% | 81% | 92% |
+
+## Last 6 months
+
+PRs created since **2025-12-18** that pass the filters: **2039** total — 1876 human-authored, 163 bot-authored.
+
+Human PR state: 1778 merged, 27 open, 71 closed.
+
+Classification (human): **1170 same-team**, **684 cross-team**, 22 no-team-request, 0 unknown-home.
+
+Reviewing teams per PR: 1489 need 1 team, 225 need 2, 140 need 3+ (22 had no team request).
+
+### Same-team vs cross-team at a glance
+
+| Metric | Same-team | Cross-team |
+|---|---:|---:|
+| PRs with an approval | 1102 | 632 |
+| First approval — median | 2.0 h | 40 min |
+| Full approval — median | 2.6 h | 7.7 h |
+| Full approval — p90 | 4.1 d | 5.8 d |
+| Full approval — p95 | 7.0 d | 8.1 d |
+
+### Overall
+
+**All human PRs with an approval** (n with an approval = 1755)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 1755 | 1.4 h | 31.6 h | 10 min | 17.4 h | 3.1 d | 5.6 d |
+| Time to first approval (Mon–Fri) | 1755 | 1.3 h | 22.1 h | 10 min | 15.2 h | 45.9 h | 3.6 d |
+| Time to full approval (wall-clock) | 1755 | 3.7 h | 47.5 h | 24 min | 29.7 h | 4.9 d | 7.6 d |
+| Time to full approval (Mon–Fri) | 1755 | 3.6 h | 33.5 h | 22 min | 22.6 h | 3.1 d | 5.6 d |
+| Time to merge (wall-clock) | 1726 | 14.6 h | 2.7 d | 1.7 h | 2.5 d | 6.1 d | 11.0 d |
+
+| First-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 46% | 61% | 66% | 81% | 85% | 90% | 97% |
+
+### Same-team PRs
+
+**Same-team PRs** (n with an approval = 1102)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 1102 | 2.0 h | 35.7 h | 18 min | 18.5 h | 3.6 d | 5.7 d |
+| Time to first approval (Mon–Fri) | 1102 | 1.9 h | 25.0 h | 17 min | 17.5 h | 2.1 d | 3.8 d |
+| Time to full approval (wall-clock) | 1102 | 2.6 h | 45.7 h | 22 min | 22.1 h | 4.1 d | 7.0 d |
+| Time to full approval (Mon–Fri) | 1102 | 2.5 h | 32.0 h | 22 min | 19.9 h | 2.6 d | 5.0 d |
+| Time to merge (wall-clock) | 1089 | 13.4 h | 2.7 d | 1.8 h | 2.0 d | 5.8 d | 10.8 d |
+
+| First-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 41% | 57% | 63% | 79% | 84% | 89% | 96% |
+
+### Cross-team PRs
+
+**Cross-team PRs** (n with an approval = 632)
+
+| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| Time to first approval (wall-clock) | 632 | 40 min | 23.1 h | 5 min | 11.7 h | 2.6 d | 4.8 d |
+| Time to first approval (Mon–Fri) | 632 | 36 min | 16.2 h | 4 min | 8.9 h | 39.6 h | 3.0 d |
+| Time to full approval (wall-clock) | 632 | 7.7 h | 2.1 d | 31 min | 47.4 h | 5.8 d | 8.1 d |
+| Time to full approval (Mon–Fri) | 632 | 7.3 h | 35.6 h | 29 min | 36.0 h | 3.9 d | 6.0 d |
+| Time to merge (wall-clock) | 617 | 16.7 h | 2.7 d | 1.5 h | 2.9 d | 6.9 d | 10.9 d |
+
+| Full-approval (wall-clock) reached within | ≤1h | ≤4h | ≤8h | ≤1d | ≤2d | ≤3d | ≤1w |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| share | 32% | 46% | 50% | 67% | 75% | 81% | 93% |
+
+## Limitations & caveats
+
+- **Approval ≠ active waiting.** Elapsed time to approval overlaps with the author iterating, addressing comments, or working on other PRs. It is an upper bound on 'lost' time, not pure idle time.
+- **Full-approval proxy.** 'Time to full approval' uses the last approval at or before merge. If a PR was mergeable after fewer approvals, this slightly overstates the required wait; if approvals were dismissed by new commits and re-requested, it can understate it.
+- **Home-team is current, data-derived.** It is inferred from a whole year of review requests; an author who switched teams mid-year is assigned their dominant area. Team membership history is not available from the API.
+- **Team mapping is best-effort.** A few low-volume historical teams (`utopia`, `languages`, `platform-operations`, ~0.3% of requests) are mapped heuristically; this does not materially affect the aggregates.
+- **Review-request timeline truncation.** A small number of PRs (≈6%) have >50 timeline events; the first ready/review-request events used here occur early, so truncation has negligible effect on classification.
+- **Bots excluded.** Automation-authored PRs (release bots, dependabot, etc.) are reported as counts but excluded from timing to avoid skew.
+
+## Appendix: home-team assignment (top 25 authors by PR count)
+
+| Author | PRs (12mo) | Home umbrella |
+|---|---:|---|
+| basvandijk | 518 | idx |
+| mraszyk | 396 | core-protocol |
+| nmattia | 322 | idx |
+| eichhorl | 209 | core-protocol |
+| cgundy | 203 | idx |
+| jasonz-dfinity | 199 | governance-team |
+| andrewbattat | 186 | node |
+| frankdavid | 161 | node |
+| daniel-wong-dfinity-org | 146 | governance-team |
+| alin-at-dfinity | 144 | core-protocol |
+| Bownairo | 140 | node |
+| pierugo-dfinity | 137 | core-protocol |
+| mbjorkqvist | 135 | defi |
+| pietrodimarco-dfinity | 134 | governance-team |
+| gregorydemay | 116 | defi |
+| kpop-dfinity | 101 | core-protocol |
+| randombit | 91 | core-protocol |
+| fspreiss | 84 | core-protocol |
+| NikolaMilosa | 80 | governance-team |
+| maksymar | 79 | core-protocol |
+| michael-weigelt | 78 | core-protocol |
+| dsarlis | 62 | core-protocol |
+| schneiderstefan | 56 | core-protocol |
+| ninegua | 53 | defi |
+| r-birkner | 52 | node |
diff --git a/pr-approval-analysis/analyze.py b/pr-approval-analysis/analyze.py
new file mode 100644
index 000000000000..673ccfeda370
--- /dev/null
+++ b/pr-approval-analysis/analyze.py
@@ -0,0 +1,667 @@
+#!/usr/bin/env python3
+"""
+Analyze PR approval waiting times for dfinity/ic.
+
+Reads prs.jsonl (from fetch_prs.py) and teams.json (from fetch_teams.py) and
+produces statistics on how long PRs wait for approval, broken down by:
+  - PRs that only need approval from the author's own team ("same-team")
+  - PRs that need approval from other teams too ("cross-team")
+for two windows: last 12 months and last 6 months.
+
+Writes report.md and prints a summary.
+"""
+
+import datetime
+import json
+import math
+import statistics
+from collections import Counter, defaultdict
+
+NOW = datetime.datetime.now(datetime.timezone.utc)
+WINDOWS = [("12 months", 365), ("6 months", 182)]
+STALE_OPEN_DAYS = 30  # open PRs not updated within this many days are dropped
+
+# Fine-grained-team -> stable umbrella-team mapping (built by build_team_mapping.py).
+# CODEOWNERS consolidated many fine teams (consensus, execution, team-dsm, ...) into
+# ~10 umbrella teams during the window; we canonicalize so same/cross-team is
+# measured consistently over time.
+try:
+    TEAM_UMBRELLA = json.load(open("team_mapping.json"))
+except FileNotFoundError:
+    TEAM_UMBRELLA = {}
+
+
+def to_umbrella(team):
+    return TEAM_UMBRELLA.get(team, team)
+
+
+BOT_LOGINS = {
+    "github-actions",
+    "github-actions[bot]",
+    "dependabot",
+    "dependabot[bot]",
+    "renovate",
+    "renovate[bot]",
+    "pr-automation-bot-public",
+    "sa-github-api",
+    "ic-team-bot",
+    "github-merge-queue[bot]",
+}
+
+# ---------------------------------------------------------------- helpers
+
+
+def parse_ts(s):
+    if s is None:
+        return None
+    return datetime.datetime.fromisoformat(s.replace("Z", "+00:00"))
+
+
+def is_bot(actor):
+    if actor is None:
+        return True
+    login = actor.get("login")
+    if login is None:
+        return True
+    if actor.get("__typename") == "Bot":
+        return True
+    if login in BOT_LOGINS:
+        return True
+    if login.endswith("[bot]"):
+        return True
+    return False
+
+
+def weekday_seconds(start, end):
+    """
+    Elapsed seconds between start and end counting only Mon-Fri (UTC).
+
+    Approximates "business time" so that weekends do not count as approval wait.
+    Ignores public holidays and time-of-day (treats all 24h of a weekday).
+    """
+    if end <= start:
+        return 0.0
+    total = 0.0
+    cur = start
+    while cur < end:
+        day_start = cur.replace(hour=0, minute=0, second=0, microsecond=0)
+        next_day = day_start + datetime.timedelta(days=1)
+        seg_end = min(next_day, end)
+        if cur.weekday() < 5:  # Mon=0 .. Fri=4
+            total += (seg_end - cur).total_seconds()
+        cur = seg_end
+    return total
+
+
+def pct(data, p):
+    if not data:
+        return None
+    s = sorted(data)
+    if len(s) == 1:
+        return s[0]
+    k = (len(s) - 1) * p / 100.0
+    f = math.floor(k)
+    c = math.ceil(k)
+    if f == c:
+        return s[int(k)]
+    return s[f] + (s[c] - s[f]) * (k - f)
+
+
+def fmt_hours(h):
+    if h is None:
+        return "n/a"
+    if h < 1:
+        return f"{h * 60:.0f} min"
+    if h < 48:
+        return f"{h:.1f} h"
+    return f"{h / 24:.1f} d"
+
+
+# ---------------------------------------------------------------- load
+
+
+def load_prs(path="prs.jsonl"):
+    prs = []
+    with open(path) as f:
+        for line in f:
+            line = line.strip()
+            if line:
+                prs.append(json.loads(line))
+    return prs
+
+
+def compute_ready_time(pr):
+    """
+    When the PR first became reviewable.
+
+    - created non-draft  -> createdAt
+    - created as draft    -> first ReadyForReviewEvent
+    Decided by whether the earliest draft/ready transition is a ready event.
+    """
+    created = parse_ts(pr["createdAt"])
+    events = []
+    for it in pr["timelineItems"]["nodes"]:
+        t = it.get("__typename")
+        if t in ("ReadyForReviewEvent", "ConvertToDraftEvent"):
+            events.append((parse_ts(it["createdAt"]), t))
+    events.sort()
+    if not events:
+        return created
+    if events[0][1] == "ReadyForReviewEvent":
+        # created as draft, then marked ready
+        return events[0][0]
+    # created non-draft (first transition is a convert-to-draft)
+    return created
+
+
+def requested_teams(pr):
+    teams = set()
+    for it in pr["timelineItems"]["nodes"]:
+        if it.get("__typename") == "ReviewRequestedEvent":
+            rr = it.get("requestedReviewer") or {}
+            if rr.get("__typename") == "Team":
+                teams.add(rr["slug"])
+    return teams
+
+
+def approvals(pr):
+    """List of (timestamp, login) for human APPROVED reviews."""
+    out = []
+    for r in pr["reviews"]["nodes"]:
+        if r.get("state") == "APPROVED" and not is_bot(r.get("author")):
+            ts = parse_ts(r.get("submittedAt"))
+            if ts is not None:
+                out.append((ts, r["author"]["login"]))
+    out.sort()
+    return out
+
+
+# ---------------------------------------------------------------- enrich
+
+
+def enrich(prs):
+    for pr in prs:
+        pr["_created"] = parse_ts(pr["createdAt"])
+        pr["_updated"] = parse_ts(pr["updatedAt"])
+        pr["_merged"] = parse_ts(pr.get("mergedAt"))
+        pr["_ready"] = compute_ready_time(pr)
+        pr["_req_teams"] = requested_teams(pr)
+        pr["_req_umbrellas"] = {to_umbrella(t) for t in pr["_req_teams"]}
+        pr["_approvals"] = approvals(pr)
+        pr["_author"] = (pr.get("author") or {}).get("login")
+        pr["_is_bot_author"] = is_bot(pr.get("author"))
+    return prs
+
+
+def passes_filters(pr):
+    """Apply the user's filters: drop drafts and stale open PRs."""
+    if pr.get("isDraft"):
+        return False
+    if pr.get("state") == "OPEN":
+        if pr["_updated"] is None or pr["_updated"] < NOW - datetime.timedelta(days=STALE_OPEN_DAYS):
+            return False
+    return True
+
+
+def determine_home_teams(prs, login_to_teams):
+    """
+    Empirically map each (human) author to a single home *umbrella* team.
+
+    Signal: among an author's 'solo' PRs (exactly one umbrella requested), the most
+    frequently requested umbrella is their home. Falls back to the most frequent
+    umbrella across all their PRs. Org membership (mapped to umbrellas) is a
+    tie-breaker. ic-owners-owners (a meta team) is not a home candidate unless it
+    is the only signal.
+    """
+    solo_counts = defaultdict(Counter)
+    all_counts = defaultdict(Counter)
+    for pr in prs:
+        if pr["_is_bot_author"] or pr["_author"] is None:
+            continue
+        ru = pr["_req_umbrellas"]
+        if not ru:
+            continue
+        a = pr["_author"]
+        for t in ru:
+            all_counts[a][t] += 1
+        if len(ru) == 1:
+            solo_counts[a][next(iter(ru))] += 1
+
+    home = {}
+    authors = set(all_counts) | set(solo_counts)
+    for a in authors:
+        counts = solo_counts.get(a) or all_counts.get(a)
+        if not counts:
+            continue
+        # drop ic-owners-owners as a home candidate unless it's all we have
+        filtered = Counter({k: v for k, v in counts.items() if k != "ic-owners-owners"})
+        use = filtered or counts
+        org_umbrellas = {to_umbrella(t) for t in login_to_teams.get(a, [])}
+        best = max(
+            use.items(),
+            key=lambda kv: (kv[1], kv[0] in org_umbrellas, kv[0]),
+        )[0]
+        home[a] = best
+    return home
+
+
+def classify(pr, home):
+    """
+    Return one of: same_team, cross_team, no_team_request, unknown_home.
+
+    Same-team = the PR's requested umbrellas are only the author's home umbrella.
+    Cross-team = at least one other umbrella is also required to approve.
+    """
+    ru = pr["_req_umbrellas"]
+    if not ru:
+        return "no_team_request"
+    a = pr["_author"]
+    h = home.get(a)
+    if h is None:
+        return "unknown_home"
+    if ru <= {h}:
+        return "same_team"
+    return "cross_team"
+
+
+# ---------------------------------------------------------------- stats
+
+
+def stat_block(values):
+    """values: list of hours."""
+    if not values:
+        return None
+    return {
+        "n": len(values),
+        "median": statistics.median(values),
+        "mean": statistics.fmean(values),
+        "p25": pct(values, 25),
+        "p75": pct(values, 75),
+        "p90": pct(values, 90),
+        "p95": pct(values, 95),
+    }
+
+
+def buckets(values):
+    """Share of values <= each threshold (in hours)."""
+    thresholds = [1, 4, 8, 24, 48, 72, 168]  # 1h,4h,8h,1d,2d,3d,1w
+    n = len(values)
+    if n == 0:
+        return {}
+    out = {}
+    for t in thresholds:
+        out[t] = sum(1 for v in values if v <= t) / n
+    return out
+
+
+def analyze_window(prs, days, home):
+    cutoff = NOW - datetime.timedelta(days=days)
+    sel = [p for p in prs if p["_created"] >= cutoff and passes_filters(p)]
+
+    human = [p for p in sel if not p["_is_bot_author"]]
+    bot = [p for p in sel if p["_is_bot_author"]]
+
+    # state breakdown (human)
+    states = Counter(p["state"] for p in human)
+
+    # classification (human only)
+    cls = Counter(classify(p, home) for p in human)
+
+    # team-count distribution (human, with >=1 requested team)
+    team_count_dist = Counter()
+    for p in human:
+        k = len(p["_req_umbrellas"])
+        if k == 0:
+            team_count_dist["0"] += 1
+        elif k == 1:
+            team_count_dist["1"] += 1
+        elif k == 2:
+            team_count_dist["2"] += 1
+        else:
+            team_count_dist["3+"] += 1
+
+    def timing_for(group):
+        """Return dict with first/last approval wall & business hour stats."""
+        first_wall, first_biz, last_wall, last_biz, to_merge = [], [], [], [], []
+        approved_n = 0
+        for p in group:
+            ready = p["_ready"]
+            apps = p["_approvals"]
+            apps = [(ts, who) for (ts, who) in apps if ts >= ready]  # ignore pre-ready
+            if not apps:
+                continue
+            approved_n += 1
+            first = apps[0][0]
+            merged = p["_merged"]
+            if merged is not None:
+                last_candidates = [ts for ts, _ in apps if ts <= merged + datetime.timedelta(minutes=1)]
+                last = max(last_candidates) if last_candidates else apps[-1][0]
+            else:
+                last = apps[-1][0]
+            first_wall.append((first - ready).total_seconds() / 3600.0)
+            last_wall.append((last - ready).total_seconds() / 3600.0)
+            first_biz.append(weekday_seconds(ready, first) / 3600.0)
+            last_biz.append(weekday_seconds(ready, last) / 3600.0)
+            if merged is not None:
+                to_merge.append((merged - ready).total_seconds() / 3600.0)
+        return {
+            "approved_n": approved_n,
+            "first_wall": stat_block(first_wall),
+            "first_biz": stat_block(first_biz),
+            "last_wall": stat_block(last_wall),
+            "last_biz": stat_block(last_biz),
+            "to_merge": stat_block(to_merge),
+            "first_wall_buckets": buckets(first_wall),
+            "last_wall_buckets": buckets(last_wall),
+        }
+
+    same = [p for p in human if classify(p, home) == "same_team"]
+    cross = [p for p in human if classify(p, home) == "cross_team"]
+
+    return {
+        "days": days,
+        "cutoff": cutoff,
+        "n_selected": len(sel),
+        "n_human": len(human),
+        "n_bot": len(bot),
+        "states": states,
+        "cls": cls,
+        "team_count_dist": team_count_dist,
+        "overall": timing_for(human),
+        "same": timing_for(same),
+        "cross": timing_for(cross),
+        "n_same": len(same),
+        "n_cross": len(cross),
+    }
+
+
+# ---------------------------------------------------------------- render
+
+
+def render_stat_row(label, block):
+    if block is None:
+        return f"| {label} | 0 | – | – | – | – | – | – |"
+    return (
+        f"| {label} | {block['n']} | {fmt_hours(block['median'])} | "
+        f"{fmt_hours(block['mean'])} | {fmt_hours(block['p25'])} | "
+        f"{fmt_hours(block['p75'])} | {fmt_hours(block['p90'])} | "
+        f"{fmt_hours(block['p95'])} |"
+    )
+
+
+def render_timing_table(title, timing):
+    lines = [f"**{title}** (n with an approval = {timing['approved_n']})", ""]
+    lines.append("| Metric | n | Median | Mean | p25 | p75 | p90 | p95 |")
+    lines.append("|---|---:|---:|---:|---:|---:|---:|---:|")
+    lines.append(render_stat_row("Time to first approval (wall-clock)", timing["first_wall"]))
+    lines.append(render_stat_row("Time to first approval (Mon–Fri)", timing["first_biz"]))
+    lines.append(render_stat_row("Time to full approval (wall-clock)", timing["last_wall"]))
+    lines.append(render_stat_row("Time to full approval (Mon–Fri)", timing["last_biz"]))
+    lines.append(render_stat_row("Time to merge (wall-clock)", timing["to_merge"]))
+    lines.append("")
+    return "\n".join(lines)
+
+
+def render_buckets(title, b):
+    if not b:
+        return ""
+    order = [(1, "≤1h"), (4, "≤4h"), (8, "≤8h"), (24, "≤1d"), (48, "≤2d"), (72, "≤3d"), (168, "≤1w")]
+    head = "| " + title + " | " + " | ".join(lbl for _, lbl in order) + " |"
+    sep = "|---|" + "|".join("---:" for _ in order) + "|"
+    row = "| share | " + " | ".join(f"{b[t] * 100:.0f}%" for t, _ in order) + " |"
+    return "\n".join([head, sep, row, ""])
+
+
+def render_compare(res):
+    """Compact same-team vs cross-team comparison for the headline metrics."""
+
+    def cell(block, key, stat):
+        b = block.get(key)
+        if not b or b.get(stat) is None:
+            return "n/a"
+        return fmt_hours(b[stat])
+
+    sm, cr = res["same"], res["cross"]
+    lines = ["### Same-team vs cross-team at a glance\n"]
+    lines.append("| Metric | Same-team | Cross-team |")
+    lines.append("|---|---:|---:|")
+    lines.append(f"| PRs with an approval | {sm['approved_n']} | {cr['approved_n']} |")
+    lines.append(
+        f"| First approval — median | {cell(sm, 'first_wall', 'median')} | {cell(cr, 'first_wall', 'median')} |"
+    )
+    lines.append(f"| Full approval — median | {cell(sm, 'last_wall', 'median')} | {cell(cr, 'last_wall', 'median')} |")
+    lines.append(f"| Full approval — p90 | {cell(sm, 'last_wall', 'p90')} | {cell(cr, 'last_wall', 'p90')} |")
+    lines.append(f"| Full approval — p95 | {cell(sm, 'last_wall', 'p95')} | {cell(cr, 'last_wall', 'p95')} |")
+    lines.append("")
+    return "\n".join(lines)
+
+
+def render(results, home, prs):
+    out = []
+    out.append("# dfinity/ic — PR approval waiting-time analysis\n")
+    out.append(f"_Generated {NOW.strftime('%Y-%m-%d %H:%M UTC')}._\n")
+    out.append(
+        "This report quantifies how long pull requests in `dfinity/ic` wait for "
+        "code-review **approval**, to assess whether review is a productivity "
+        "bottleneck. Results are split by whether a PR only needs approval from the "
+        "author's **own team** or also from **other teams**.\n"
+    )
+
+    # ---- Executive summary (driven by the 12-month window) ----
+    r12 = results[0][1]
+    ov = r12["overall"]
+    sm = r12["same"]
+    cr = r12["cross"]
+
+    def med(block, key):
+        return fmt_hours(block[key]["median"]) if block[key] else "n/a"
+
+    def p90(block, key):
+        return fmt_hours(block[key]["p90"]) if block[key] else "n/a"
+
+    out.append("## Executive summary\n")
+    out.append(
+        f"Over the last 12 months, **{r12['n_human']:,}** human-authored PRs passed the "
+        f"filters. Of those that received review, the **typical (median) PR waited "
+        f"{med(ov, 'first_wall')} for a first approval and {med(ov, 'last_wall')} for "
+        f"full approval** (all required teams). So for the median PR, approval is "
+        f"reasonably fast — review is *not* a severe bottleneck at the center of the "
+        f"distribution.\n"
+    )
+    out.append(
+        "**The cost is in the tail and in cross-team PRs.** Key findings:\n\n"
+        f"- **Cross-team PRs are the bottleneck.** They reach *full* approval in a "
+        f"median of **{med(cr, 'last_wall')}** vs **{med(sm, 'last_wall')}** for "
+        f"same-team PRs, and their tail is far heavier (p90 **{p90(cr, 'last_wall')}** "
+        f"vs **{p90(sm, 'last_wall')}**). Each extra team that must approve adds "
+        f"serial waiting.\n"
+        f"- **Cross-team PRs get a *first* look fastest** (median "
+        f"{med(cr, 'first_wall')} vs {med(sm, 'first_wall')} for same-team) — more "
+        f"requested reviewers means someone responds quickly — but converting that "
+        f"into *all* required approvals is what drags.\n"
+        f"- **A heavy tail affects every category.** Overall, full approval takes "
+        f"p90 **{p90(ov, 'last_wall')}** and p95 "
+        f"**{fmt_hours(ov['last_wall']['p95']) if ov['last_wall'] else 'n/a'}**. "
+        f"Roughly 10–20% of PRs wait multiple days for approval; this is where "
+        f"engineer time is actually lost.\n"
+        f"- **Weekends inflate the wait.** Counting only Mon–Fri shaves the tail "
+        f"materially (overall full-approval p90 drops from {p90(ov, 'last_wall')} "
+        f"wall-clock to {p90(ov, 'last_biz')} business-time).\n"
+        f"- **Most PRs are single-team.** {r12['team_count_dist'].get('1', 0):,} of "
+        f"{r12['n_human']:,} human PRs need only one team; "
+        f"{r12['team_count_dist'].get('2', 0):,} need two and "
+        f"{r12['team_count_dist'].get('3+', 0):,} need three or more.\n"
+        f"- **Recent trend is slightly better.** The last 6 months are modestly "
+        f"faster than the full year (see below), so review latency is not worsening.\n"
+    )
+    out.append(
+        "**Bottom line:** code review is responsive for the median change, but "
+        "**cross-team approval coordination and a heavy multi-day tail** are the real "
+        "drags on throughput. If frontier models make writing code cheaper, the "
+        "relative cost of these approval waits — especially for PRs spanning multiple "
+        "teams — will dominate cycle time.\n"
+    )
+
+    # Methodology
+    out.append("## Methodology\n")
+    out.append(
+        "- **Source:** all PRs created in the last 12 months (2025-06-18 → "
+        "2026-06-18), pulled from the GitHub GraphQL API (reviews + review-request "
+        "timeline events). 4,885 PRs fetched before filtering.\n"
+        "- **Filters applied:** draft PRs are excluded; open PRs not updated in the "
+        f"last {STALE_OPEN_DAYS} days are excluded (treated as abandoned). "
+        "Bot-authored PRs are reported separately and excluded from timing stats.\n"
+        "- **Ready time** = when the PR first became reviewable (creation time, or the "
+        "ready-for-review event if it was opened as a draft).\n"
+        "- **Approval** = a human `APPROVED` review (bot/automation reviews ignored).\n"
+        "- **Time to first approval** = ready → first approval. **Time to full "
+        "approval** = ready → last approval at/before merge (proxy for 'all required "
+        "approvals obtained'). PRs that never received a human approval are excluded "
+        "from these two metrics (so very fast trivial merges by owners are not "
+        "counted).\n"
+        "- **Mon–Fri** rows count only weekday time, removing weekends from the "
+        "wait (approximate 'business time', UTC; ignores public holidays and "
+        "time-of-day).\n"
+        "- **Team gating:** GitHub requests review from the CODEOWNERS teams that own "
+        "the changed files. A PR is **same-team** if every requested team maps to the "
+        "author's own umbrella area, otherwise **cross-team**.\n"
+        "- **Team consolidation handled.** CODEOWNERS was reorganized during the "
+        "window (61 changes); most notably (#10114, 2026-05-07) the fine-grained "
+        "teams `consensus`, `execution`, `team-dsm`, `ic-message-routing-owners`, "
+        "`ic-interface-owners`, `crypto-team`, `pocket-ic` were consolidated into the "
+        "`core-protocol` umbrella, and `defi-team`→`defi`, `boundary-node`→`node`, "
+        "etc. To measure same/cross-team **consistently over time**, every team name "
+        "ever requested is mapped to a stable umbrella "
+        "(`core-protocol`, `defi`, `node`, `idx`, `dre`, `governance-team`, `sdk`, "
+        "`infrasec`, `product-security`, `ic-owners-owners`). The mapping is derived "
+        "by resolving each historical CODEOWNERS path through the current file.\n"
+        "- **Author home team** is derived empirically from the umbrella most often "
+        "requested on that author's single-team PRs. This reflects where an author "
+        "actually contributes (agreeing with org-team membership for 18/25 "
+        "code-owners; the rest are people who mostly contribute outside their nominal "
+        "team or have very few PRs).\n"
+    )
+
+    for label, res in results:
+        out.append(f"## Last {label}\n")
+        out.append(
+            f"PRs created since **{res['cutoff'].strftime('%Y-%m-%d')}** that pass the "
+            f"filters: **{res['n_selected']}** total — {res['n_human']} human-authored, "
+            f"{res['n_bot']} bot-authored.\n"
+        )
+        states = res["states"]
+        out.append(
+            f"Human PR state: {states.get('MERGED', 0)} merged, "
+            f"{states.get('OPEN', 0)} open, {states.get('CLOSED', 0)} closed.\n"
+        )
+        cls = res["cls"]
+        out.append(
+            f"Classification (human): **{cls.get('same_team', 0)} same-team**, "
+            f"**{cls.get('cross_team', 0)} cross-team**, "
+            f"{cls.get('no_team_request', 0)} no-team-request, "
+            f"{cls.get('unknown_home', 0)} unknown-home.\n"
+        )
+        tcd = res["team_count_dist"]
+        out.append(
+            f"Reviewing teams per PR: {tcd.get('1', 0)} need 1 team, "
+            f"{tcd.get('2', 0)} need 2, {tcd.get('3+', 0)} need 3+ "
+            f"({tcd.get('0', 0)} had no team request).\n"
+        )
+
+        # Same-vs-cross comparison
+        out.append(render_compare(res))
+
+        out.append("### Overall\n")
+        out.append(render_timing_table("All human PRs with an approval", res["overall"]))
+        out.append(render_buckets("First-approval (wall-clock) reached within", res["overall"]["first_wall_buckets"]))
+
+        out.append("### Same-team PRs\n")
+        out.append(render_timing_table("Same-team PRs", res["same"]))
+        out.append(render_buckets("First-approval (wall-clock) reached within", res["same"]["first_wall_buckets"]))
+
+        out.append("### Cross-team PRs\n")
+        out.append(render_timing_table("Cross-team PRs", res["cross"]))
+        out.append(render_buckets("Full-approval (wall-clock) reached within", res["cross"]["last_wall_buckets"]))
+
+    # limitations
+    out.append("## Limitations & caveats\n")
+    out.append(
+        "- **Approval ≠ active waiting.** Elapsed time to approval overlaps with the "
+        "author iterating, addressing comments, or working on other PRs. It is an "
+        "upper bound on 'lost' time, not pure idle time.\n"
+        "- **Full-approval proxy.** 'Time to full approval' uses the last approval at "
+        "or before merge. If a PR was mergeable after fewer approvals, this slightly "
+        "overstates the required wait; if approvals were dismissed by new commits and "
+        "re-requested, it can understate it.\n"
+        "- **Home-team is current, data-derived.** It is inferred from a whole year of "
+        "review requests; an author who switched teams mid-year is assigned their "
+        "dominant area. Team membership history is not available from the API.\n"
+        "- **Team mapping is best-effort.** A few low-volume historical teams "
+        "(`utopia`, `languages`, `platform-operations`, ~0.3% of requests) are mapped "
+        "heuristically; this does not materially affect the aggregates.\n"
+        "- **Review-request timeline truncation.** A small number of PRs (≈6%) have "
+        ">50 timeline events; the first ready/review-request events used here occur "
+        "early, so truncation has negligible effect on classification.\n"
+        "- **Bots excluded.** Automation-authored PRs (release bots, dependabot, "
+        "etc.) are reported as counts but excluded from timing to avoid skew.\n"
+    )
+
+    # validation appendix
+    out.append("## Appendix: home-team assignment (top 25 authors by PR count)\n")
+    counts = Counter(p["_author"] for p in prs if not p["_is_bot_author"] and p["_author"])
+    out.append("| Author | PRs (12mo) | Home umbrella |")
+    out.append("|---|---:|---|")
+    for a, c in counts.most_common(25):
+        out.append(f"| {a} | {c} | {home.get(a, '?')} |")
+    out.append("")
+
+    return "\n".join(out)
+
+
+def main():
+    teams = json.load(open("teams.json"))
+    login_to_teams = teams["login_to_teams"]
+    prs = enrich(load_prs())
+    print(f"loaded {len(prs)} PRs")
+
+    home = determine_home_teams(prs, login_to_teams)
+
+    # quick validation: agreement with org membership (mapped to umbrellas) for
+    # authors who are members of a code-owning team
+    agree = total = 0
+    for a, h in home.items():
+        org_umbrellas = {to_umbrella(t) for t in login_to_teams.get(a, [])}
+        if org_umbrellas:
+            total += 1
+            if h in org_umbrellas:
+                agree += 1
+    print(f"home-umbrella agreement with org membership: {agree}/{total}")
+
+    results = []
+    for label, days in WINDOWS:
+        results.append((label, analyze_window(prs, days, home)))
+
+    # console summary
+    for label, res in results:
+        print(f"\n=== last {label} ===")
+        print(f"selected={res['n_selected']} human={res['n_human']} bot={res['n_bot']}")
+        print(f"same={res['n_same']} cross={res['n_cross']} cls={dict(res['cls'])}")
+        for name in ("overall", "same", "cross"):
+            t = res[name]
+            fw = t["first_wall"]
+            lw = t["last_wall"]
+            if fw:
+                print(
+                    f"  {name}: approved_n={t['approved_n']} "
+                    f"first_med={fmt_hours(fw['median'])} "
+                    f"full_med={fmt_hours(lw['median'])}"
+                )
+
+    report = render(results, home, prs)
+    with open("report.md", "w") as f:
+        f.write(report)
+    print("\nwrote report.md")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pr-approval-analysis/build_team_mapping.py b/pr-approval-analysis/build_team_mapping.py
new file mode 100644
index 000000000000..ebab5b3e0fee
--- /dev/null
+++ b/pr-approval-analysis/build_team_mapping.py
@@ -0,0 +1,190 @@
+#!/usr/bin/env python3
+"""
+Derive a stable fine-grained-team -> umbrella-team mapping for dfinity/ic.
+
+Team ownership in .github/CODEOWNERS changed a lot over the analysis window: many
+fine-grained teams (consensus, execution, ic-message-routing-owners, crypto-team,
+team-dsm, finint, cross-chain-team, ...) were consolidated into ~10 umbrella teams
+(core-protocol, defi, node, idx, dre, governance-team, sdk, infrasec,
+product-security, ic-owners-owners).
+
+To classify PRs consistently across time we map every team name that ever appears
+in a review request to a stable umbrella. We derive the mapping empirically: for
+each path pattern owned by a fine team in the OLD CODEOWNERS (~12 months ago), we
+resolve that same path through the CURRENT CODEOWNERS to see which umbrella owns it
+now. The umbrella a fine team's paths most often map to becomes its umbrella.
+
+Writes team_mapping.json: {fine_team: umbrella_team}.
+"""
+
+import json
+import re
+from collections import Counter, defaultdict
+
+# The 10 umbrella teams used by the current CODEOWNERS.
+UMBRELLAS = {
+    "core-protocol",
+    "defi",
+    "node",
+    "idx",
+    "dre",
+    "governance-team",
+    "sdk",
+    "infrasec",
+    "product-security",
+    "ic-owners-owners",
+}
+
+
+def parse(path):
+    rules = []
+    with open(path) as f:
+        for line in f:
+            line = line.strip()
+            if not line or line.startswith("#"):
+                continue
+            parts = line.split()
+            pat = parts[0]
+            teams = [p[len("@dfinity/") :] for p in parts[1:] if p.startswith("@dfinity/")]
+            if teams:
+                rules.append((pat, teams))
+    return rules
+
+
+def pattern_to_regex(pat):
+    """Approximate gitignore/CODEOWNERS pattern semantics as a regex over a path."""
+    anchored = pat.startswith("/")
+    p = pat[1:] if anchored else pat
+    dir_only = p.endswith("/")
+    p = p.rstrip("/")
+
+    # Build regex piece by piece, handling ** and * specially.
+    out = []
+    i = 0
+    while i < len(p):
+        c = p[i]
+        if c == "*":
+            if i + 1 < len(p) and p[i + 1] == "*":
+                out.append(".*")
+                i += 2
+                continue
+            out.append("[^/]*")
+        elif c in ".+()[]{}^$|\\":
+            out.append("\\" + c)
+        elif c == "?":
+            out.append("[^/]")
+        else:
+            out.append(c)
+        i += 1
+    body = "".join(out)
+
+    if anchored:
+        prefix = "^"
+    else:
+        # unanchored: match at start of any path segment
+        prefix = "^(?:.*/)?"
+    if dir_only:
+        suffix = "(?:/.*)?$"
+    else:
+        suffix = "(?:/.*)?$"
+    return re.compile(prefix + body + suffix)
+
+
+def compile_rules(rules):
+    return [(pattern_to_regex(pat), teams, pat) for pat, teams in rules]
+
+
+def resolve(path, compiled):
+    """Last matching rule wins (CODEOWNERS semantics)."""
+    found = None
+    for rx, teams, pat in compiled:
+        if rx.match(path):
+            found = teams
+    return found
+
+
+def representative_path(pat):
+    """A concrete-ish path that the pattern should match."""
+    anchored = pat.startswith("/")
+    p = pat[1:] if anchored else pat
+    p = p.rstrip("/")
+    p = p.replace("**", "x").replace("*", "x")
+    if not p:
+        p = "x"
+    # ensure it looks like a file under the dir
+    return p
+
+
+def main():
+    old = parse("codeowners_old.txt")
+    new = compile_rules(parse("codeowners_new.txt"))
+
+    mapping_counts = defaultdict(Counter)
+    for pat, teams in old:
+        rep = representative_path(pat)
+        umb = resolve(rep, new)
+        if not umb:
+            continue
+        for t in teams:
+            for u in umb:
+                mapping_counts[t][u] += 1
+
+    # Resolve each fine team to a single umbrella.
+    mapping = {}
+    for t, counts in mapping_counts.items():
+        # prefer a non-catch-all umbrella
+        non_catch = Counter({k: v for k, v in counts.items() if k != "ic-owners-owners"})
+        use = non_catch or counts
+        mapping[t] = use.most_common(1)[0][0]
+
+    # Umbrellas map to themselves.
+    for u in UMBRELLAS:
+        mapping[u] = u
+
+    # A few teams may not appear in old file paths but show up in PR review
+    # requests (e.g. renamed teams). Add safe explicit fallbacks, then let the
+    # data-derived values override where present.
+    explicit = {
+        "defi-team": "defi",
+        "ledger": "defi",
+        "ledger-suite": "defi",
+        "bitcoin": "defi",
+        "ic-message-routing-owners": "core-protocol",
+        "ic-interface-owners": "core-protocol",
+        "crypto-team": "core-protocol",
+        "consensus": "core-protocol",
+        "execution": "core-protocol",
+        "runtime": "core-protocol",
+        "team-dsm": "core-protocol",
+        "pocket-ic": "core-protocol",
+        "formal-models": "core-protocol",
+        "research": "core-protocol",
+        "p2p-systems-research-team": "core-protocol",
+        "networking": "core-protocol",
+        "boundary-node": "node",
+        "canister-os": "node",
+        "platform-operations": "dre",
+        "governance": "governance-team",
+        "finint": "defi",
+        "cross-chain-team": "defi",
+        "languages": "sdk",
+        "motoko": "sdk",
+        "utopia": "sdk",
+        "security": "product-security",
+        "ic-support": "ic-owners-owners",
+    }
+    for t, u in explicit.items():
+        mapping.setdefault(t, u)
+
+    with open("team_mapping.json", "w") as f:
+        json.dump(mapping, f, indent=2, sort_keys=True)
+
+    print("=== derived fine -> umbrella mapping ===")
+    for t in sorted(mapping):
+        derived = dict(mapping_counts.get(t, {}))
+        flag = "" if t in UMBRELLAS else ("  <-derived " + str(derived) if derived else "  <-explicit")
+        print(f"{t:30} -> {mapping[t]:18}{flag}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pr-approval-analysis/fetch_prs.py b/pr-approval-analysis/fetch_prs.py
new file mode 100644
index 000000000000..88c37c3d21dc
--- /dev/null
+++ b/pr-approval-analysis/fetch_prs.py
@@ -0,0 +1,207 @@
+#!/usr/bin/env python3
+"""
+Fetch all PRs from dfinity/ic created within the last N days via the GraphQL API.
+
+For each PR we capture creation/ready/merge/close timestamps, draft status, state,
+size, the review decisions/approvals (with timestamps + authors), and the timeline
+events that tell us which *teams* were requested for review (CODEOWNERS) plus
+ready-for-review / convert-to-draft transitions.
+
+Writes prs.jsonl (one JSON object per line).
+"""
+
+import datetime
+import json
+import os
+import subprocess
+import sys
+import time
+import urllib.error
+import urllib.request
+
+DAYS = 365
+PAGE_SIZE = 25
+
+
+def get_token():
+    # Prefer an independent token that the other machine's `gh auth login`
+    # cannot revoke: a PAT in a file, or GH_TOKEN/GITHUB_TOKEN env. Fall back
+    # to the shared gh OAuth token only if nothing else is provided.
+    token_file = os.environ.get("PR_FETCH_TOKEN_FILE", os.path.expanduser("~/.gh_pat"))
+    if os.path.exists(token_file):
+        t = open(token_file).read().strip()
+        if t:
+            return t
+    t = os.environ.get("GH_TOKEN") or os.environ.get("GITHUB_TOKEN")
+    if t:
+        return t.strip()
+    return subprocess.check_output(["gh", "auth", "token"], text=True).strip()
+
+
+NOW = datetime.datetime.now(datetime.timezone.utc)
+CUTOFF = NOW - datetime.timedelta(days=DAYS)
+
+QUERY = (
+    """
+query($cursor: String) {
+  repository(owner: "dfinity", name: "ic") {
+    pullRequests(first: %d, after: $cursor, orderBy: {field: CREATED_AT, direction: DESC}) {
+      pageInfo { hasNextPage endCursor }
+      nodes {
+        number
+        author { login __typename }
+        createdAt
+        updatedAt
+        isDraft
+        state
+        mergedAt
+        closedAt
+        additions
+        deletions
+        changedFiles
+        reviewDecision
+        reviews(first: 50) {
+          totalCount
+          nodes { state submittedAt author { login __typename } }
+        }
+        timelineItems(first: 50, itemTypes: [READY_FOR_REVIEW_EVENT, CONVERT_TO_DRAFT_EVENT, REVIEW_REQUESTED_EVENT]) {
+          totalCount
+          nodes {
+            __typename
+            ... on ReadyForReviewEvent { createdAt }
+            ... on ConvertToDraftEvent { createdAt }
+            ... on ReviewRequestedEvent {
+              createdAt
+              requestedReviewer {
+                __typename
+                ... on Team { slug }
+                ... on User { login }
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
+"""
+    % PAGE_SIZE
+)
+
+
+class AuthExpired(Exception):
+    """Raised when the GitHub token is no longer valid and a manual re-auth is needed."""
+
+
+def post(query, variables):
+    body = json.dumps({"query": query, "variables": variables}).encode()
+    last_err = None
+    auth_fails = 0
+    for attempt in range(8):
+        token = get_token()  # gh returns the current token (no background rotation)
+        req = urllib.request.Request(
+            "https://api.github.com/graphql",
+            data=body,
+            headers={"Authorization": f"bearer {token}", "Content-Type": "application/json"},
+        )
+        try:
+            with urllib.request.urlopen(req) as r:
+                data = json.loads(r.read())
+            if "errors" in data:
+                # RATE_LIMITED or transient — back off and retry
+                last_err = data["errors"]
+                print("GraphQL errors:", data["errors"], file=sys.stderr)
+                time.sleep(3 * (attempt + 1))
+                continue
+            return data
+        except urllib.error.HTTPError as e:
+            last_err = e
+            if e.code == 401:
+                # Token expired. gh does NOT refresh non-interactively, so do not
+                # call `gh auth refresh` (it blocks on the device flow). Retry a
+                # couple times to rule out a spurious 401, then bail out so the
+                # caller can checkpoint and exit for a manual re-auth + resume.
+                auth_fails += 1
+                print(f"401 Unauthorized (auth_fails={auth_fails})", file=sys.stderr)
+                if auth_fails >= 3:
+                    raise AuthExpired(str(e))
+                time.sleep(2)
+                continue
+            print("HTTP error:", e, file=sys.stderr)
+            time.sleep(3 * (attempt + 1))
+        except Exception as e:  # noqa: BLE001
+            last_err = e
+            print("error:", e, file=sys.stderr)
+            time.sleep(3 * (attempt + 1))
+    raise RuntimeError(f"failed after retries: {last_err}")
+
+
+def parse_ts(s):
+    return datetime.datetime.fromisoformat(s.replace("Z", "+00:00"))
+
+
+def main():
+    # Resume support: load already-fetched PR numbers and the saved cursor.
+    seen = set()
+    if os.path.exists("prs.jsonl"):
+        with open("prs.jsonl") as fr:
+            for line in fr:
+                line = line.strip()
+                if not line:
+                    continue
+                try:
+                    seen.add(json.loads(line)["number"])
+                except Exception:  # noqa: BLE001
+                    pass
+    cursor = os.environ.get("RESUME_CURSOR") or None
+    if cursor is None and os.path.exists("cursor.txt"):
+        cursor = open("cursor.txt").read().strip() or None
+    print(f"resume: have={len(seen)} start_cursor={cursor}", file=sys.stderr)
+
+    new = 0
+    stop = False
+    f = open("prs.jsonl", "a")
+    try:
+        while True:
+            try:
+                data = post(QUERY, {"cursor": cursor})
+            except (AuthExpired, RuntimeError) as e:
+                # Checkpoint the cursor for the page we failed on, then exit so the
+                # caller can re-authenticate and rerun (which resumes from here).
+                with open("cursor.txt", "w") as cf:
+                    cf.write(cursor or "")
+                f.flush()
+                os.fsync(f.fileno())
+                print(f"INTERRUPTED cursor={cursor} err={e}", file=sys.stderr)
+                print(f"RESUME_NEEDED have={len(seen)} new_this_run={new}")
+                sys.exit(2)
+            conn = data["data"]["repository"]["pullRequests"]
+            for pr in conn["nodes"]:
+                if parse_ts(pr["createdAt"]) < CUTOFF:
+                    stop = True
+                    break  # ordered DESC, everything after is older
+                if pr["number"] in seen:
+                    continue  # dedupe across resumes / overlapping pages
+                seen.add(pr["number"])
+                f.write(json.dumps(pr) + "\n")
+                new += 1
+            f.flush()
+            os.fsync(f.fileno())
+            next_cursor = conn["pageInfo"]["endCursor"]
+            print(f"have={len(seen)} new={new} stop={stop} next={next_cursor}", file=sys.stderr)
+            if stop or not conn["pageInfo"]["hasNextPage"]:
+                break
+            cursor = next_cursor
+            with open("cursor.txt", "w") as cf:
+                cf.write(cursor)
+    finally:
+        f.close()
+    # Completed the full window: clear the checkpoint.
+    if os.path.exists("cursor.txt"):
+        os.remove("cursor.txt")
+    print(f"DONE total_unique={len(seen)} new_this_run={new}")
+    print(f"cutoff={CUTOFF.isoformat()} now={NOW.isoformat()}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pr-approval-analysis/fetch_teams.py b/pr-approval-analysis/fetch_teams.py
new file mode 100644
index 000000000000..014c78401a4e
--- /dev/null
+++ b/pr-approval-analysis/fetch_teams.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python3
+"""
+Fetch current membership of all code-owning teams in the dfinity org.
+
+Writes teams.json: {"login_to_teams": {login: [team,...]}, "team_to_members": {team: [login,...]}}
+Membership is the *current* membership (GitHub API has no historical team membership),
+which is a known limitation when classifying older PRs.
+"""
+
+import json
+import subprocess
+
+# Teams referenced in .github/CODEOWNERS
+TEAMS = [
+    "core-protocol",
+    "defi",
+    "dre",
+    "governance-team",
+    "ic-owners-owners",
+    "idx",
+    "infrasec",
+    "node",
+    "product-security",
+    "sdk",
+]
+
+
+def members_of(team):
+    out = subprocess.check_output(
+        ["gh", "api", "--paginate", f"orgs/dfinity/teams/{team}/members", "--jq", ".[].login"],
+        text=True,
+    )
+    return [line for line in out.split() if line]
+
+
+def main():
+    login_to_teams = {}
+    team_to_members = {}
+    for t in TEAMS:
+        members = members_of(t)
+        team_to_members[t] = members
+        for m in members:
+            login_to_teams.setdefault(m, []).append(t)
+    with open("teams.json", "w") as f:
+        json.dump({"login_to_teams": login_to_teams, "team_to_members": team_to_members}, f, indent=2)
+    print("team sizes:", {t: len(m) for t, m in team_to_members.items()})
+    print("distinct members:", len(login_to_teams))
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pr-approval-analysis/team_mapping.json b/pr-approval-analysis/team_mapping.json
new file mode 100644
index 000000000000..5c47f43d656e
--- /dev/null
+++ b/pr-approval-analysis/team_mapping.json
@@ -0,0 +1,39 @@
+{
+  "bitcoin": "defi",
+  "boundary-node": "node",
+  "canister-os": "node",
+  "consensus": "core-protocol",
+  "core-protocol": "core-protocol",
+  "cross-chain-team": "defi",
+  "crypto-team": "core-protocol",
+  "defi": "defi",
+  "defi-team": "defi",
+  "dre": "dre",
+  "execution": "core-protocol",
+  "finint": "defi",
+  "formal-models": "core-protocol",
+  "governance": "governance-team",
+  "governance-team": "governance-team",
+  "ic-interface-owners": "core-protocol",
+  "ic-message-routing-owners": "core-protocol",
+  "ic-owners-owners": "ic-owners-owners",
+  "ic-support": "ic-owners-owners",
+  "idx": "idx",
+  "infrasec": "infrasec",
+  "languages": "core-protocol",
+  "ledger": "defi",
+  "ledger-suite": "defi",
+  "motoko": "sdk",
+  "networking": "core-protocol",
+  "node": "node",
+  "p2p-systems-research-team": "core-protocol",
+  "platform-operations": "ic-owners-owners",
+  "pocket-ic": "core-protocol",
+  "product-security": "product-security",
+  "research": "idx",
+  "runtime": "core-protocol",
+  "sdk": "sdk",
+  "security": "product-security",
+  "team-dsm": "core-protocol",
+  "utopia": "idx"
+}
\ No newline at end of file