Council Digest makes Toronto City Council meetings easy to understand. It scrapes meeting decisions and minutes from the City of Toronto website, uses Google Gemini to extract individual motions into plain-language summaries, and presents them in a clean, searchable interface.
- Timeline of meetings — Browse recent council and committee meetings (Community Councils, Board of Health, advisory committees, etc.), newest first
- Region filter — Focus on meetings from North York, Etobicoke York, Toronto & East York, Scarborough, or city-wide bodies
- Decision cards — Each motion is summarized with title, status (Passed, Amended, Deferred, etc.), category (housing, transportation, governance, etc.), and impact tags
- Outcome-based sorting — Sort decisions by what happened: Passed first, Deferred first, Amended first, Failed first, or by category
- Full-text search — Search within a meeting by keywords across titles, summaries, and tags
- Trends view — Per-meeting analytics: breakdown by category and status
- Link to source — View the original council document for verification
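The outcome-based sorting above can be sketched as a stable sort over the decision cards: motions with the chosen status float to the top, and the rest follow a fixed status ranking. This is a minimal illustration in Python (the real sorting lives in the frontend); the function name and the sample decisions are invented for the example.

```python
# Outcome-based sorting sketch: motions whose status matches the chosen
# outcome come first; remaining motions follow a fixed status ranking.
STATUS_RANK = {"Passed": 0, "Amended": 1, "Deferred": 2, "Failed": 3}

def sort_decisions(decisions, first="Passed"):
    """Stable sort: status == `first` floats to the top, ties keep order."""
    def key(motion):
        status = motion.get("status", "")
        return (status != first, STATUS_RANK.get(status, len(STATUS_RANK)))
    return sorted(decisions, key=key)

decisions = [
    {"title": "Bike lane extension", "status": "Deferred"},
    {"title": "Zoning amendment", "status": "Passed"},
    {"title": "Budget motion", "status": "Failed"},
]
print([d["status"] for d in sort_decisions(decisions, first="Passed")])
# → ['Passed', 'Deferred', 'Failed']
```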
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ City of │ │ Node + Playwright│ │ scraper/output/ │
│ Toronto site │ ──► │ scraper │ ──► │ .txt + index.json│
└─────────────────┘ └──────────────────┘ └────────┬────────┘
│
▼
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Frontend │ ◄── │ FastAPI backend │ ◄── │ Gemini │
│ (HTML/CSS/JS) │ │ port 8000 │ │ extraction │
└─────────────────┘ └────────┬─────────┘ └────────┬────────┘
│
▼
┌──────────────────────────┐
│ Supabase (Postgres) │
│ meetings, meeting_details│
└──────────────────────────┘
- Scraper — Visits secure.toronto.ca, collects meeting links from the Recent meetings table, fetches Decisions and Minutes text
- Backend — Serves meeting list and detail; runs Gemini on first access per meeting, then caches
- Extraction — Section-aware Gemini prompt turns meeting text into structured motions (title, summary, status, category, impact_tags, full_text)
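The motion shape the extraction step produces can be sketched as follows. The real models in backend/models.py are Pydantic and may differ in detail; this stdlib dataclass mirrors the fields listed above, and the sample motion content is invented for illustration.

```python
from dataclasses import dataclass, field

# Sketch of a structured motion as produced by the extractor
# (the actual Pydantic models live in backend/models.py).
@dataclass
class Motion:
    title: str
    summary: str                    # plain-language summary
    status: str                     # normalized outcome, e.g. "Passed", "Amended", "Deferred"
    category: str                   # e.g. "housing", "transportation", "governance"
    impact_tags: list = field(default_factory=list)
    full_text: str = ""

m = Motion(
    title="Adopt the new multiplex zoning bylaw",
    summary="Permits fourplexes in residential neighbourhoods city-wide.",
    status="Passed",
    category="housing",
    impact_tags=["zoning", "housing supply"],
)
print(m.status)  # → Passed
```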
- Backend: Python 3.11+, FastAPI, Google Gemini (genai), Supabase (Postgres)
- Frontend: Vanilla HTML, CSS, JavaScript (no framework)
- Scraper: Node.js, Playwright (Chromium)
- Clone and cd into the project root
- Follow LOCAL_SETUP.md for prerequisites, venv, .env, and scraper setup
- Start backend: `uvicorn backend.main:app --reload --port 8000`
- Serve frontend: `python -m http.server 5500` from `frontend/` → open http://localhost:5500
Meetings are built on demand: the first time you open a meeting via the backend (/api/meetings/{code}), Gemini runs (~30–60 s) and writes a MeetingDetail with motions to data/cache/meetings/*.json and (optionally) Supabase. After that it’s instant from cache.
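This build-on-demand flow is a read-through cache. Here is a minimal sketch of the pattern, assuming a stand-in extractor function; the meeting code and the `fake_extract` helper are hypothetical, and the real logic lives in backend/main.py.

```python
import json
import tempfile
from pathlib import Path

def get_meeting_detail(code, extract_motions, cache_dir):
    """Read-through cache: return the cached MeetingDetail if present,
    otherwise run the slow extraction once and persist the result."""
    cache_dir = Path(cache_dir)
    cache_file = cache_dir / f"{code}.json"
    if cache_file.exists():                      # cache hit: instant
        return json.loads(cache_file.read_text())
    detail = extract_motions(code)               # cache miss: ~30-60 s with Gemini
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(detail))
    return detail

# Demo with a stand-in extractor that counts how often it runs.
calls = []
def fake_extract(code):
    calls.append(code)
    return {"meeting_code": code, "motions": []}

with tempfile.TemporaryDirectory() as tmp:
    first = get_meeting_detail("2024.NY1", fake_extract, tmp)
    second = get_meeting_detail("2024.NY1", fake_extract, tmp)  # served from cache

print(len(calls))  # → 1 (extraction ran only once)
```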
For bulk/offline extraction you can run:
- Scrape + build cache locally: `python -m backend.refresh_cache --max-meetings 3`
- Inspect counts: `python -m backend.debug_counts`
Macathon/
├── .env # GOOGLE_API_KEY (required for extraction)
├── .env.example
├── requirements.txt
├── README.md
├── LOCAL_SETUP.md # Full setup and run guide
├── backend/
│ ├── main.py # API routes, cache logic
│ ├── extractor.py # Gemini motion extraction + status normalization
│ ├── models.py # Pydantic models
│ ├── scraper_bridge.py # Calls Node scraper and parses scraper/output/index.json
│ ├── supabase_client.py # Reads/writes meetings + meeting_details in Supabase
│ ├── refresh_cache.py # Offline precompute script (overviews + full details)
│ └── debug_counts.py # Utility to print total motions and status breakdown
├── frontend/
│ ├── index.html
│ ├── app.js
│ └── styles.css
├── scraper/
│ ├── scrape-content.js # Playwright scraper
│ ├── output/ # .txt files + index.json
│ └── package.json
├── data/
│ └── cache/ # (legacy) meetings_index.json, meetings/*.json
├── resync_meetings_index.py
├── prewarm_single.py # Cache one meeting at a time
├── prewarm_all.py # Bulk prewarm
└── start-dev.bat # Windows: start backend + frontend
| Endpoint | Description |
|---|---|
| `GET /api/meetings` | List meetings with motion counts, topics, region, and detail_cached flag |
| `GET /api/meetings/{code}` | Meeting detail with motions (lazy Gemini + cache + optional Supabase mirror) |
| `GET /api/stats` | Global stats across cached meetings (only those with motions) |
| `POST /api/refresh` | Re-run scraper (requires ALLOW_LIVE_EXTRACTION=true) and rebuild index |
| `POST /api/prewarm` | Pre-cache all meeting details for the current meetings list |
The production setup for this repo is GitHub Pages + Supabase. The frontend (frontend/index.html, app.js, styles.css) is served as static files (e.g. via GitHub Pages), and it reads meeting data directly from Supabase using the anon key — no FastAPI server or Cloud Run is required at runtime.
- Supabase: Create tables and allow public read (Supabase → SQL Editor):
create table if not exists public.meetings (
meeting_code text primary key,
title text not null,
date text not null,
topics text[] not null default '{}',
motion_count integer not null default 0,
region text,
detail_cached boolean,
updated_at timestamptz not null default now()
);
create table if not exists public.meeting_details (
meeting_code text primary key references public.meetings(meeting_code) on delete cascade,
detail jsonb not null,
generated_at timestamptz not null default now()
);
alter table public.meetings enable row level security;
alter table public.meeting_details enable row level security;
create policy "Allow public read meetings" on public.meetings for select to anon using (true);
create policy "Allow public read meeting_details" on public.meeting_details for select to anon using (true);
- Frontend config (Supabase-only mode): In frontend/index.html, set (or deploy with) your Supabase project URL and anon key (not the service role key). When these are present, the frontend will:
<script>
window.APP_CONFIG = {
supabaseUrl: 'https://YOUR_PROJECT.supabase.co',
supabaseAnonKey: 'YOUR_ANON_KEY'
};
</script>
  - Fetch the meeting list from `/rest/v1/meetings?select=*&order=date.desc`
  - Fetch individual meeting details from `/rest/v1/meeting_details?meeting_code=eq.{code}&select=detail`
  - Hide admin-only buttons like “Refresh from council” and “Preload all meetings” (those are intended to run via CI, not from the browser)
- Populate data (Supabase):
  - Via GitHub Actions: Use `.github/workflows/daily_refresh.yml` to run the Node scraper + Gemini extraction on a schedule and write into Supabase. Configure repo secrets: `GOOGLE_API_KEY`, `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`.
  - Locally (one-off): With env vars set (`GOOGLE_API_KEY`, `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`), run `python -m backend.refresh_cache --max-meetings 3`. Optionally inspect with `python -m backend.debug_counts`.
- Host: Push the frontend to GitHub and enable Pages (e.g. from the `frontend/` folder on the `main` branch). The site will load meetings and summarized motion cards from Supabase.
- Config (with API instead of Supabase): If you host the FastAPI backend somewhere, set `window.APP_CONFIG = { apiUrl: 'https://your-api.example.com' }` in `index.html` instead of Supabase keys. The frontend will then call `/api/meetings` and `/api/meetings/{code}` on that API instead of Supabase.
- Security (backend mode): Protect or disable `POST /api/refresh` and `POST /api/prewarm` in production when using a public API.
- Timeline filtering: In Supabase mode, meetings with `detail_cached = true` and `motion_count = 0` are hidden from the timeline by the frontend, so users only see meetings with actual decisions.
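The timeline filter amounts to dropping meetings that were fully processed but yielded no motions. A minimal sketch of that rule, written in Python for illustration (the real check lives in frontend/app.js; the sample rows are invented):

```python
def visible_meetings(meetings):
    """Hide meetings that were fully processed (detail_cached) but produced
    no motions, so the timeline only shows meetings with actual decisions."""
    return [
        m for m in meetings
        if not (m.get("detail_cached") and m.get("motion_count", 0) == 0)
    ]

rows = [
    {"meeting_code": "A", "detail_cached": True, "motion_count": 0},   # hidden
    {"meeting_code": "B", "detail_cached": True, "motion_count": 5},   # shown
    {"meeting_code": "C", "detail_cached": False, "motion_count": 0},  # shown (not yet processed)
]
print([m["meeting_code"] for m in visible_meetings(rows)])  # → ['B', 'C']
```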
- Build image: From the repo root, run:
  `gcloud builds submit --tag REGION-docker.pkg.dev/PROJECT_ID/macathon/macathon-api .`
- Deploy to Cloud Run:
  `gcloud run deploy macathon-api --image REGION-docker.pkg.dev/PROJECT_ID/macathon/macathon-api --platform managed --region REGION --allow-unauthenticated --port 8000`
- Configure environment variables on the service:
  - `GOOGLE_API_KEY`
  - `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`
  - `ALLOW_LIVE_EXTRACTION` (only if you want Cloud Run to be allowed to run the scraper)
- Scheduler jobs (optional): Use Cloud Scheduler to call:
  - `POST {CLOUD_RUN_URL}/api/refresh` daily for new meetings.
  - `POST {CLOUD_RUN_URL}/api/prewarm` nightly to precompute all meeting details.
See project root for license details.