Council Digest makes Toronto City Council meetings easy to understand. It scrapes meeting decisions and minutes from the City of Toronto website, uses Google Gemini to extract individual motions into plain-language summaries, and presents them in a clean, searchable interface.
- Timeline of meetings — Browse recent council and committee meetings (Community Councils, Board of Health, advisory committees, etc.), newest first
- Region filter — Focus on meetings from North York, Etobicoke York, Toronto & East York, Scarborough, or city-wide bodies
- Decision cards — Each motion is summarized with title, status (Passed, Amended, Deferred, etc.), category (housing, transportation, governance, etc.), and impact tags
- Outcome-based sorting — Sort decisions by what happened: Passed first, Deferred first, Amended first, Failed first, or by category
- Full-text search — Search within a meeting by keywords across titles, summaries, and tags
- Trends view — Per-meeting analytics: breakdown by category and status
- Link to source — View the original council document for verification
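The outcome-based sorting above can be sketched as a stable sort over the decision cards: motions with the chosen status float to the top, and the rest follow a fixed status ranking. This is a minimal illustration in Python (the real sorting lives in the frontend); the function name and the sample decisions are invented for the example.

```python
# Outcome-based sorting sketch: motions whose status matches the chosen
# outcome come first; remaining motions follow a fixed status ranking.
STATUS_RANK = {"Passed": 0, "Amended": 1, "Deferred": 2, "Failed": 3}

def sort_decisions(decisions, first="Passed"):
    """Stable sort: status == `first` floats to the top, ties keep order."""
    def key(motion):
        status = motion.get("status", "")
        return (status != first, STATUS_RANK.get(status, len(STATUS_RANK)))
    return sorted(decisions, key=key)

decisions = [
    {"title": "Bike lane extension", "status": "Deferred"},
    {"title": "Zoning amendment", "status": "Passed"},
    {"title": "Budget motion", "status": "Failed"},
]
print([d["status"] for d in sort_decisions(decisions, first="Passed")])
# → ['Passed', 'Deferred', 'Failed']
```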
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ City of │ │ Node + Playwright│ │ scraper/output/ │
│ Toronto site │ ──► │ scraper │ ──► │ .txt + index.json│
└─────────────────┘ └──────────────────┘ └────────┬────────┘
│
▼
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Frontend │ ◄── │ FastAPI backend │ ◄── │ Gemini │
│ (HTML/CSS/JS) │ │ port 8000 │ │ extraction │
└─────────────────┘ └────────┬─────────┘ └────────┬────────┘
│
▼
┌──────────────────────────┐
│ Supabase (Postgres) │
│ meetings, meeting_details│
└──────────────────────────┘
- Scraper — Visits secure.toronto.ca, collects meeting links from the Recent meetings table, fetches Decisions and Minutes text
- Backend — Serves meeting list and detail; runs Gemini on first access per meeting, then caches
- Extraction — Section-aware Gemini prompt turns meeting text into structured motions (title, summary, status, category, impact_tags, full_text)
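The motion shape the extraction step produces can be sketched as follows. The real models in backend/models.py are Pydantic and may differ in detail; this stdlib dataclass mirrors the fields listed above, and the sample motion content is invented for illustration.

```python
from dataclasses import dataclass, field

# Sketch of a structured motion as produced by the extractor
# (the actual Pydantic models live in backend/models.py).
@dataclass
class Motion:
    title: str
    summary: str                    # plain-language summary
    status: str                     # normalized outcome, e.g. "Passed", "Amended", "Deferred"
    category: str                   # e.g. "housing", "transportation", "governance"
    impact_tags: list = field(default_factory=list)
    full_text: str = ""

m = Motion(
    title="Adopt the new multiplex zoning bylaw",
    summary="Permits fourplexes in residential neighbourhoods city-wide.",
    status="Passed",
    category="housing",
    impact_tags=["zoning", "housing supply"],
)
print(m.status)  # → Passed
```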
- Backend: Python 3.11+, FastAPI, Google Gemini (genai), Supabase (Postgres)
- Frontend: Vanilla HTML, CSS, JavaScript (no framework)
- Scraper: Node.js, Playwright (Chromium)
- Clone and cd into the project root
- Follow LOCAL_SETUP.md for prerequisites, venv, .env, and scraper setup
- Start backend: `uvicorn backend.main:app --reload --port 8000`
- Serve frontend: `python -m http.server 5500` from `frontend/` → open http://localhost:5500
Meetings are built on demand: the first time you open a meeting via the backend (/api/meetings/{code}), Gemini runs (~30–60 s) and writes a MeetingDetail with motions to data/cache/meetings/*.json and (optionally) Supabase. After that it’s instant from cache.
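This build-on-demand flow is a read-through cache. Here is a minimal sketch of the pattern, assuming a stand-in extractor function; the meeting code and the `fake_extract` helper are hypothetical, and the real logic lives in backend/main.py.

```python
import json
import tempfile
from pathlib import Path

def get_meeting_detail(code, extract_motions, cache_dir):
    """Read-through cache: return the cached MeetingDetail if present,
    otherwise run the slow extraction once and persist the result."""
    cache_dir = Path(cache_dir)
    cache_file = cache_dir / f"{code}.json"
    if cache_file.exists():                      # cache hit: instant
        return json.loads(cache_file.read_text())
    detail = extract_motions(code)               # cache miss: ~30-60 s with Gemini
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(detail))
    return detail

# Demo with a stand-in extractor that counts how often it runs.
calls = []
def fake_extract(code):
    calls.append(code)
    return {"meeting_code": code, "motions": []}

with tempfile.TemporaryDirectory() as tmp:
    first = get_meeting_detail("2024.NY1", fake_extract, tmp)
    second = get_meeting_detail("2024.NY1", fake_extract, tmp)  # served from cache

print(len(calls))  # → 1 (extraction ran only once)
```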
For bulk/offline extraction you can run:
- Scrape + build cache locally: `python -m backend.refresh_cache --max-meetings 3`
- Inspect counts: `python -m backend.debug_counts`
Macathon/
├── .env # GOOGLE_API_KEY (required for extraction)
├── .env.example
├── requirements.txt
├── README.md
├── LOCAL_SETUP.md # Full setup and run guide
├── backend/
│ ├── main.py # API routes, cache logic
│ ├── extractor.py # Gemini motion extraction + status normalization
│ ├── models.py # Pydantic models
│ ├── scraper_bridge.py # Calls Node scraper and parses scraper/output/index.json
│ ├── supabase_client.py # Reads/writes meetings + meeting_details in Supabase
│ ├── refresh_cache.py # Offline precompute script (overviews + full details)
│ └── debug_counts.py # Utility to print total motions and status breakdown
├── frontend/
│ ├── index.html
│ ├── app.js
│ └── styles.css
├── scraper/
│ ├── scrape-content.js # Playwright scraper
│ ├── output/ # .txt files + index.json
│ └── package.json
├── data/
│ └── cache/ # (legacy) meetings_index.json, meetings/*.json
├── resync_meetings_index.py
├── prewarm_single.py # Cache one meeting at a time
├── prewarm_all.py # Bulk prewarm
└── start-dev.bat # Windows: start backend + frontend
| Endpoint | Description |
|---|---|
| `GET /api/meetings` | List meetings with motion counts, topics, region, and detail_cached flag |
| `GET /api/meetings/{code}` | Meeting detail with motions (lazy Gemini + cache + optional Supabase mirror) |
| `GET /api/stats` | Global stats across cached meetings (only those with motions) |
| `POST /api/refresh` | Re-run scraper (requires ALLOW_LIVE_EXTRACTION=true) and rebuild index |
| `POST /api/prewarm` | Pre-cache all meeting details for the current meetings list |
The production setup for this repo is GitHub Pages + Supabase. The frontend (frontend/index.html, app.js, styles.css) is served as static files (e.g. via GitHub Pages), and it reads meeting data directly from Supabase using the anon key — no FastAPI server or Cloud Run is required at runtime.
- Supabase: Create tables and allow public read (Supabase → SQL Editor):
create table if not exists public.meetings (
meeting_code text primary key,
title text not null,
date text not null,
topics text[] not null default '{}',
motion_count integer not null default 0,
region text,
detail_cached boolean,
updated_at timestamptz not null default now()
);
create table if not exists public.meeting_details (
meeting_code text primary key references public.meetings(meeting_code) on delete cascade,
detail jsonb not null,
generated_at timestamptz not null default now()
);
alter table public.meetings enable row level security;
alter table public.meeting_details enable row level security;
create policy "Allow public read meetings" on public.meetings for select to anon using (true);
create policy "Allow public read meeting_details" on public.meeting_details for select to anon using (true);
- Frontend config (Supabase-only mode): In frontend/index.html, set (or deploy with) your Supabase project URL and anon key (not the service role key). When these are present, the frontend will:
<script>
window.APP_CONFIG = {
supabaseUrl: 'https://YOUR_PROJECT.supabase.co',
supabaseAnonKey: 'YOUR_ANON_KEY'
};
</script>
  - Fetch the meeting list from `/rest/v1/meetings?select=*&order=date.desc`
  - Fetch individual meeting details from `/rest/v1/meeting_details?meeting_code=eq.{code}&select=detail`
  - Hide admin-only buttons like “Refresh from council” and “Preload all meetings” (those are intended to run via CI, not from the browser)
- Populate data (Supabase):
  - Via GitHub Actions: Use `.github/workflows/daily_refresh.yml` to run the Node scraper + Gemini extraction on a schedule and write into Supabase. Configure repo secrets: `GOOGLE_API_KEY`, `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`.
  - Locally (one-off): With env vars set (`GOOGLE_API_KEY`, `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`), run `python -m backend.refresh_cache --max-meetings 3`. Optionally inspect with `python -m backend.debug_counts`.
- Host: Push the frontend to GitHub and enable Pages (e.g. from the `frontend/` folder on the `main` branch). The site will load meetings and summarized motion cards from Supabase.
- Config (with API instead of Supabase): If you host the FastAPI backend somewhere, set `window.APP_CONFIG = { apiUrl: 'https://your-api.example.com' }` in `index.html` instead of Supabase keys. The frontend will then call `/api/meetings` and `/api/meetings/{code}` on that API instead of Supabase.
- Security (backend mode): Protect or disable `POST /api/refresh` and `POST /api/prewarm` in production when using a public API.
- Timeline filtering: In Supabase mode, meetings with `detail_cached = true` and `motion_count = 0` are hidden from the timeline by the frontend, so users only see meetings with actual decisions.
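The timeline filter amounts to dropping meetings that were fully processed but yielded no motions. A minimal sketch of that rule, written in Python for illustration (the real check lives in frontend/app.js; the sample rows are invented):

```python
def visible_meetings(meetings):
    """Hide meetings that were fully processed (detail_cached) but produced
    no motions, so the timeline only shows meetings with actual decisions."""
    return [
        m for m in meetings
        if not (m.get("detail_cached") and m.get("motion_count", 0) == 0)
    ]

rows = [
    {"meeting_code": "A", "detail_cached": True, "motion_count": 0},   # hidden
    {"meeting_code": "B", "detail_cached": True, "motion_count": 5},   # shown
    {"meeting_code": "C", "detail_cached": False, "motion_count": 0},  # shown (not yet processed)
]
print([m["meeting_code"] for m in visible_meetings(rows)])  # → ['B', 'C']
```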
- Build image: From the repo root, run:
  `gcloud builds submit --tag REGION-docker.pkg.dev/PROJECT_ID/macathon/macathon-api .`
- Deploy to Cloud Run:
  `gcloud run deploy macathon-api --image REGION-docker.pkg.dev/PROJECT_ID/macathon/macathon-api --platform managed --region REGION --allow-unauthenticated --port 8000`
- Configure environment variables on the service:
  - `GOOGLE_API_KEY`
  - `SUPABASE_URL`, `SUPABASE_SERVICE_ROLE_KEY`
  - `ALLOW_LIVE_EXTRACTION` (only if you want Cloud Run to be allowed to run the scraper)
- Scheduler jobs (optional): Use Cloud Scheduler to call:
  - `POST {CLOUD_RUN_URL}/api/refresh` daily for new meetings.
  - `POST {CLOUD_RUN_URL}/api/prewarm` nightly to precompute all meeting details.
See project root for license details.