xataio · exekias · May 28, 2026
diff --git a/lessons/02-changing-data/03-upsert/lesson.mdx b/lessons/02-changing-data/03-upsert/lesson.mdx
@@ -0,0 +1,126 @@
+Sooner or later you'll write the same buggy code twice: a `SELECT` to check if a row exists, then an `INSERT` or `UPDATE` based on the result. Between those two statements, another connection can run its own check, and you get a duplicate-key error or, if no unique constraint guards the table, a duplicate row.
+
+Postgres' answer is `INSERT … ON CONFLICT`, often called **upsert**. One statement, atomic, race-free.
+
+The seed has a `page_views` table with `UNIQUE (page, user_email)` and a `tags` table with a `UNIQUE` constraint on `name`. Both are realistic shapes for upsert.
+
+## The problem upsert solves
+
+"Record a page view: insert a new row at views = 1, or bump the counter if (page, user_email) already exists."
+
+The naive way is a check-then-write:
+
+```sql
+SELECT views FROM page_views WHERE page = '/home' AND user_email = 'ada@example.com';
+-- If found: UPDATE. If not: INSERT.
+```
+
+Two round trips, and any concurrent writer can slip between them. `INSERT … ON CONFLICT` collapses both branches into one atomic statement.
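+
+To make the race concrete, here is a sketch of one possible interleaving of two sessions running the check-then-write pattern against the seed's `page_views` table. The session labels and the `/docs` / `bob@example.com` key are hypothetical, chosen only for illustration:
+
+```sql
+-- Session A: checks for the row and finds nothing
+SELECT views FROM page_views WHERE page = '/docs' AND user_email = 'bob@example.com';
+
+-- Session B: runs the same check before A has written, and also finds nothing
+SELECT views FROM page_views WHERE page = '/docs' AND user_email = 'bob@example.com';
+
+-- Session A: takes the "not found" branch and inserts
+INSERT INTO page_views (page, user_email) VALUES ('/docs', 'bob@example.com');
+
+-- Session B: takes the same branch; once A's insert is committed,
+-- UNIQUE (page, user_email) rejects this one with a duplicate-key error
+INSERT INTO page_views (page, user_email) VALUES ('/docs', 'bob@example.com');
+```
+
+Neither session did anything wrong on its own; the bug lives in the gap between the check and the write.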
+
+## `ON CONFLICT (...) DO UPDATE`
+
+The shape:
+
+```sql
+INSERT INTO <table> (cols...) VALUES (...)
+ON CONFLICT (<conflict_target>) DO UPDATE
+SET col = <expr>;
+```
+
+The `conflict_target` is a column or set of columns covered by a `UNIQUE` constraint or primary key; Postgres uses the underlying unique index to detect the conflict. Here it's the `(page, user_email)` unique constraint.
+
+<Run>
+INSERT INTO page_views (page, user_email, views, last_seen)
+VALUES ('/home', 'ada@example.com', 1, now())
+ON CONFLICT (page, user_email) DO UPDATE
+SET views     = page_views.views + 1,
+    last_seen = EXCLUDED.last_seen;
+</Run>
+
+<Check id="ada-page-views-incremented">
+Ada already had one view on `/home`. After the upsert, she has two.
+</Check>
+
+Two new pieces of syntax:
+
+- **`page_views.views`**: the *existing* row's value. Qualify it with the table name: an unqualified `views` here is ambiguous between the existing row and `EXCLUDED`, and Postgres rejects it.
+- **`EXCLUDED.col`**: the value from the row you tried to `INSERT`. It's a pseudo-table (think "the row that was excluded by the conflict") and it's the bridge between the INSERT side and the UPDATE side.
+
+So `EXCLUDED.last_seen` says "use the timestamp we just tried to insert" — useful when the new value comes from the caller, not from a computation on the old row.
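+
+`EXCLUDED` is resolved per proposed row, which matters once a single statement carries several rows. The following is a minimal sketch against the seed's `page_views` table; the view counts being added are made-up values for illustration:
+
+```sql
+-- Each conflicting row is updated from its own proposed values:
+-- Ada's counter grows by 3, Linus's by 2.
+INSERT INTO page_views (page, user_email, views, last_seen)
+VALUES ('/home', 'ada@example.com',   3, now()),
+       ('/blog', 'linus@example.com', 2, now())
+ON CONFLICT (page, user_email) DO UPDATE
+SET views     = page_views.views + EXCLUDED.views,
+    last_seen = EXCLUDED.last_seen;
+
+-- Caveat: the same (page, user_email) key must not appear twice in one
+-- VALUES list, or Postgres raises
+-- "ON CONFLICT DO UPDATE command cannot affect row a second time".
+```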
+
+## Insert path: same statement, new row
+
+The same statement also handles the case where no conflict exists — the row just gets inserted.
+
+<Run>
+INSERT INTO page_views (page, user_email, views, last_seen)
+VALUES ('/pricing', 'newbie@example.com', 1, now())
+ON CONFLICT (page, user_email) DO UPDATE
+SET views     = page_views.views + 1,
+    last_seen = EXCLUDED.last_seen;
+</Run>
+
+<Check id="new-page-view-inserted">
+There was no `(/pricing, newbie@example.com)` row — the upsert inserted it with views = 1.
+</Check>
+
+One statement, two branches. The race condition is gone.
+
+## `ON CONFLICT (...) DO NOTHING`
+
+Sometimes the "update" branch is "ignore it, you're done". Use `DO NOTHING`.
+
+<Run>
+INSERT INTO tags (name) VALUES ('postgres'), ('sql')
+ON CONFLICT (name) DO NOTHING;
+</Run>
+
+<Check id="dedup-do-nothing">
+Both tags already existed. `DO NOTHING` quietly skipped them, so `postgres` still appears exactly once in the table.
+</Check>
+
+`DO NOTHING` is great for **idempotent inserts** — re-run the same script and you don't get errors. Common uses:
+
+- Seeding lookup tables ("ensure these tags exist").
+- Event ingestion with a unique event id ("if we've already processed this id, skip").
+- Migrating data where the source might be replayed.
+
+If you want to know whether the row was actually inserted, combine `DO NOTHING` with `RETURNING` — only inserted rows come back.
+
+<Run>
+INSERT INTO tags (name) VALUES ('postgres'), ('graphql')
+ON CONFLICT (name) DO NOTHING
+RETURNING id, name;
+</Run>
+
+Only `graphql` comes back — `postgres` already existed and was skipped.
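+
+If you need the `id` whether or not the row was new, one common approach is a "get or create" query built on a data-modifying CTE. This is a sketch rather than a definitive recipe, and it assumes no other session is inserting the same name at the same moment:
+
+```sql
+WITH ins AS (
+  INSERT INTO tags (name) VALUES ('graphql')
+  ON CONFLICT (name) DO NOTHING
+  RETURNING id
+)
+SELECT id FROM ins                              -- the row, if we just inserted it
+UNION ALL
+SELECT id FROM tags WHERE name = 'graphql'      -- the row, if it already existed
+LIMIT 1;
+```
+
+Both halves of the query see the same snapshot, so exactly one of them yields a row in the single-session case. Under concurrency, a competing insert that commits mid-statement can leave both halves empty, so callers should be ready to retry.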
+
+## `WHERE` on the UPDATE branch
+
+`DO UPDATE` accepts a `WHERE` that filters which conflicting rows actually get updated. "Update only if the incoming value is newer":
+
+```sql
+INSERT INTO page_views (page, user_email, views, last_seen)
+VALUES ('/home', 'ada@example.com', 1, '2024-05-01 09:00:00+00')
+ON CONFLICT (page, user_email) DO UPDATE
+SET last_seen = EXCLUDED.last_seen
+WHERE page_views.last_seen < EXCLUDED.last_seen;
+```
+
+If the incoming `last_seen` is not newer than the stored one, the conflict still matches, but the `WHERE` filters out the update and the row is left as-is. Postgres does not raise an error in that case; the statement simply reports zero affected rows.
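+
+To find out from the caller's side whether the conditional update went through, you can add `RETURNING`. A small sketch, using a deliberately old timestamp (an illustrative value) so that the `WHERE` skips the update against the seed data:
+
+```sql
+INSERT INTO page_views (page, user_email, views, last_seen)
+VALUES ('/home', 'ada@example.com', 1, '2020-01-01 00:00:00+00')
+ON CONFLICT (page, user_email) DO UPDATE
+SET last_seen = EXCLUDED.last_seen
+WHERE page_views.last_seen < EXCLUDED.last_seen
+RETURNING page, user_email, last_seen;
+-- The stored last_seen is newer, so the update is skipped
+-- and RETURNING yields zero rows.
+```
+
+A row coming back means it was either inserted or updated; an empty result means the conflict matched but the `WHERE` rejected the update.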
+
+## Common pitfalls
+
+- **No matching UNIQUE constraint.** Postgres needs an index to detect the conflict. `ON CONFLICT (foo)` fails at planning time if `foo` isn't covered by a primary key or unique index (or partial unique index — `ON CONFLICT (foo) WHERE <expr>` matches a partial unique index).
+- **`EXCLUDED` vs the table name.** `EXCLUDED.x` is the incoming row, `tbl.x` is the existing row. Swap them and you'll be writing the old value back over itself.
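+- **Triggers don't simply follow the path taken.** Row-level `BEFORE INSERT` triggers fire for every proposed row before the conflict check, so they run even when the statement ends up on the UPDATE path; `BEFORE UPDATE` triggers then fire as well. If you rely on `updated_at` triggers, check they handle both.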
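+
+The partial-index case from the first pitfall is easier to see in code. The `subscriptions` table and its index below are hypothetical and not part of this lesson's seed; they only sketch how the conflict target's `WHERE` has to line up with the index predicate:
+
+```sql
+-- Hypothetical table: at most one *active* subscription per user
+CREATE TABLE subscriptions (
+  user_email text    NOT NULL,
+  plan       text    NOT NULL,
+  active     boolean NOT NULL DEFAULT true
+);
+
+CREATE UNIQUE INDEX subscriptions_one_active
+  ON subscriptions (user_email) WHERE active;
+
+-- The WHERE after the conflict target lets Postgres infer the partial index
+INSERT INTO subscriptions (user_email, plan)
+VALUES ('ada@example.com', 'pro')
+ON CONFLICT (user_email) WHERE active DO UPDATE
+SET plan = EXCLUDED.plan;
+```
+
+Without the `WHERE active` on the conflict target, Postgres finds no matching unique index for `(user_email)` alone and rejects the statement.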
+
+## What you learned
+
+- `INSERT … ON CONFLICT (...) DO UPDATE` collapses check-then-write into one atomic statement.
+- The conflict target must match a unique constraint, primary key, or unique index.
+- `EXCLUDED.col` references the incoming row in the UPDATE branch; `tbl.col` is the existing row.
+- `ON CONFLICT DO NOTHING` makes inserts idempotent; pair with `RETURNING` to learn which rows actually landed.
+- A `WHERE` on the UPDATE branch lets you conditionally update on conflict.
+
+Up next: wrapping a chunk of work in `BEGIN ... COMMIT` so it either all happens or none of it does.
diff --git a/lessons/02-changing-data/03-upsert/lesson.yaml b/lessons/02-changing-data/03-upsert/lesson.yaml
@@ -0,0 +1,34 @@
+title: Upsert with ON CONFLICT
+summary: Insert-or-update in a single statement — INSERT … ON CONFLICT DO UPDATE / DO NOTHING, and how to use EXCLUDED.
+estimatedMinutes: 12
+tags:
+  - insert
+  - on-conflict
+  - upsert
+  - excluded
+  - dml
+authors:
+  - exekias
+seed: seed.sql
+checks:
+  - id: ada-page-views-incremented
+    type: query-returns
+    description: Upserting Ada's page view brings her count to 2.
+    sql: SELECT views FROM page_views WHERE page = '/home' AND user_email = 'ada@example.com'
+    expect:
+      rowCount: 1
+      rows: [[2]]
+  - id: new-page-view-inserted
+    type: query-returns
+    description: Upserting a brand-new (page, user_email) pair inserts it with views = 1.
+    sql: SELECT views FROM page_views WHERE page = '/pricing' AND user_email = 'newbie@example.com'
+    expect:
+      rowCount: 1
+      rows: [[1]]
+  - id: dedup-do-nothing
+    type: query-returns
+    description: DO NOTHING swallowed the duplicate — 'postgres' still appears exactly once.
+    sql: SELECT count(*)::int FROM tags WHERE name = 'postgres'
+    expect:
+      rowCount: 1
+      rows: [[1]]
diff --git a/lessons/02-changing-data/03-upsert/seed.sql b/lessons/02-changing-data/03-upsert/seed.sql
@@ -0,0 +1,30 @@
+-- Seed for "03-upsert": two tables that benefit from upsert.
+-- page_views is the "increment a counter, creating it if needed" example —
+-- it has a UNIQUE (page, user_email) so the conflict target is meaningful.
+-- tags is a tiny dedup table for DO NOTHING.
+
+CREATE TABLE page_views (
+  id          serial PRIMARY KEY,
+  page        text  NOT NULL,
+  user_email  text  NOT NULL,
+  views       int   NOT NULL DEFAULT 1,
+  last_seen   timestamptz NOT NULL DEFAULT now(),
+  UNIQUE (page, user_email)
+);
+
+INSERT INTO page_views (page, user_email, views, last_seen) VALUES
+  ('/home',    'ada@example.com',   1, '2024-05-01 09:00:00+00'),
+  ('/home',    'grace@example.com', 4, '2024-05-02 11:30:00+00'),
+  ('/pricing', 'grace@example.com', 2, '2024-05-03 12:00:00+00'),
+  ('/blog',    'linus@example.com', 7, '2024-05-04 18:45:00+00');
+
+CREATE TABLE tags (
+  id   serial PRIMARY KEY,
+  name text   NOT NULL UNIQUE
+);
+
+INSERT INTO tags (name) VALUES
+  ('postgres'),
+  ('sql'),
+  ('database'),
+  ('tutorial');