xataio · exekias · May 28, 2026
diff --git a/lessons/01-query-fundamentals/02-where-conditions/lesson.mdx b/lessons/01-query-fundamentals/02-where-conditions/lesson.mdx
@@ -103,4 +103,4 @@ ORDER BY signed_up_at;
 Mark this lesson done — we'll just confirm the sandbox is healthy.
 </Check>
 
-Up next: aggregating rows together with `GROUP BY`.
+Up next: sorting results predictably and paging through them without falling into the `OFFSET` trap.
diff --git a/lessons/01-query-fundamentals/03-sorting-and-pagination/lesson.mdx b/lessons/01-query-fundamentals/03-sorting-and-pagination/lesson.mdx
@@ -0,0 +1,129 @@
+Lesson 01 introduced `ORDER BY` and `LIMIT`. This lesson goes deeper: stable sorts, dropping duplicates, and the two ways to page through a result set — including why one of them quietly breaks in production.
+
+The seed loaded an `articles` table: 30 rows, 12 authors, one intentional tie on `published_at`, and two rows with no `published_at` at all.
+
+## `ORDER BY`: pick a direction
+
+`ORDER BY col` sorts ascending (small to large, oldest first). Add `DESC` to flip it.
+
+<Run>
+SELECT title, views
+FROM articles
+ORDER BY views DESC
+LIMIT 5;
+</Run>
+
+The top-5 most-viewed posts. Without `LIMIT 5`, you'd get all 30, sorted.
+
+## Tie-breakers
+
+What happens when two rows have the same sort key? Postgres returns them in *some* order, but which one is unspecified: it depends on the query plan and can change from one run to the next. If the order matters, spell out a tie-breaker.
+
+<Run>
+SELECT title, author, published_at
+FROM articles
+ORDER BY published_at, id;
+</Run>
+
+`published_at` has a duplicate: "Indexing for humans" (Ada Lovelace) and "The case for MVCC" (Alan Turing) share the timestamp `2024-01-05 09:00:00+00`, and two rows have no `published_at` at all (see the seed). Adding `id` as a second sort key makes the result *deterministic*: same query, same order, every time. For pagination this isn't optional; it's a correctness requirement, as we'll see in a minute.
+
+## `NULLS FIRST` / `NULLS LAST`
+
+Postgres puts `NULL`s **last** in ascending sorts and **first** in descending sorts. Override with `NULLS FIRST` or `NULLS LAST` when you want the opposite — most often when you want "newest first, but missing dates at the bottom".
+
+<Run>
+SELECT title, published_at
+FROM articles
+ORDER BY published_at DESC NULLS LAST;
+</Run>
+
+The seed includes two rows with a `NULL` `published_at` (ids 15 and 30) so you can see this directly.
+
+## `DISTINCT`: drop duplicate rows
+
+`DISTINCT` removes duplicate rows from the result. It's a post-processing step on whatever the `SELECT` list produced.
+
+<Run>
+SELECT DISTINCT author
+FROM articles
+ORDER BY author;
+</Run>
+
+Thirty articles come from just twelve authors, most of whom wrote more than one; `DISTINCT` collapses them to twelve rows. Note that `DISTINCT` applies *across all selected columns*, not just one: `SELECT DISTINCT author, published_at` would keep two rows from the same author published at different times.
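+
+A minimal sketch of that difference against this seed. Every `(author, published_at)` pair in the seed is unique, so the second query should return all 30 rows while the first returns 12.
+
+```sql
+-- One row per distinct author.
+SELECT DISTINCT author
+FROM articles;
+
+-- One row per distinct (author, published_at) pair: an author with
+-- articles at different times appears once per article.
+SELECT DISTINCT author, published_at
+FROM articles
+ORDER BY author, published_at;
+```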
+
+### `DISTINCT ON (...)`: one row per group, Postgres-flavored
+
+`DISTINCT ON (col)` is a Postgres extension: "one row per distinct value of `col`, and you pick which one with `ORDER BY`". Handy for "the latest article per author":
+
+<Run>
+SELECT DISTINCT ON (author) author, title, published_at
+FROM articles
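+ORDER BY author, published_at DESC NULLS LAST;
+</Run>
+
+The first column(s) in the `ORDER BY` must match the `DISTINCT ON` list; that's the rule that lets Postgres pick "the first row per group". Inside each author, `published_at DESC` chooses the newest, and `NULLS LAST` keeps an undated article from being picked over a dated one (descending sorts put `NULL`s first by default).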
+
+## `LIMIT` and `OFFSET`: the obvious way to paginate
+
+`LIMIT N OFFSET M` says "skip M rows, then return N". The classic page-2-of-10 query:
+
+<Run>
+SELECT id, title, published_at
+FROM articles
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 10 OFFSET 10;
+</Run>
+
+That's page 2 (rows 11–20). Page 3 would be `OFFSET 20`. Simple, and the right tool for small result sets.
+
+## The `OFFSET` trap
+
+`OFFSET M` makes Postgres fetch *and discard* M rows before returning anything. On page 1 that's free. On the last page of a million-row feed (page 100,000 at 10 rows per page), you're scanning a million rows to throw away 999,990 of them.
+
+There's a subtler bug too: if a new row that sorts ahead of the current position gets inserted between requesting page 1 and page 2, page 2 will repeat a row from page 1 (because everything shifted down by one). A delete has the opposite effect, and a row gets skipped. The result set isn't stable across requests.
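+
+The shift is easy to reproduce. A minimal sketch, assuming the seed is loaded and using a hypothetical new article ("Breaking news" is made up for illustration); everything runs inside a transaction that is rolled back, so the table keeps its 30 rows.
+
+```sql
+BEGIN;
+
+-- Page 1 of a newest-first feed, 5 rows per page: ids 29, 28, 27, 26, 25.
+SELECT id, title
+FROM articles
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 5 OFFSET 0;
+
+-- A new article arrives that sorts ahead of everything already shown.
+INSERT INTO articles (title, author, views, published_at)
+VALUES ('Breaking news', 'Alan Turing', 0, '2024-08-01 09:00:00+00');
+
+-- Page 2 now starts one row earlier: id 25 appears again.
+SELECT id, title
+FROM articles
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 5 OFFSET 5;
+
+-- Undo the insert so the seed stays intact.
+ROLLBACK;
+```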
+
+For small admin tables, `OFFSET` is fine. For user-facing feeds, infinite scroll, or anything that paginates deeply, reach for keyset pagination.
+
+## Keyset pagination: page by `WHERE`
+
+Idea: instead of "skip 10,000 rows", remember the *last row you saw* and ask for "rows after that one". With a deterministic `ORDER BY`, that's just a `WHERE` clause.
+
+Page 1:
+
+<Run>
+SELECT id, title, published_at
+FROM articles
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 5;
+</Run>
+
+Note the last row's `published_at` and `id`. To get the next page, plug them into a `WHERE` filter that asks for everything strictly after that key:
+
+<Run>
+SELECT id, title, published_at
+FROM articles
+WHERE (published_at, id) < ('2024-06-17 08:30:00+00', 25)
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 5;
+</Run>
+
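+Three important details:
+
+1. **The tuple comparison `(a, b) < (x, y)`** does lexicographic ordering: `a < x`, OR `a = x AND b < y`. That's exactly the tie-breaker logic we wrote into `ORDER BY`. They have to match.
+2. **No `OFFSET`**. Each page is a fresh `WHERE` lookup that an index on `(published_at DESC NULLS LAST, id DESC)` can serve with one short index scan, so the cost stays roughly flat no matter how deep you go.
+3. **`NULL` keys need separate handling.** A row comparison against a `NULL` `published_at` is never true, so this `WHERE` clause never returns the two undated articles, even though `NULLS LAST` puts them at the end of the unfiltered order.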
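+
+To make both points concrete, here is a sketch. The index name `articles_feed_idx` and the prepared-statement name `next_page` are made up for illustration. The index spells out `NULLS LAST` so that its order matches the `ORDER BY` exactly, and the long-hand `WHERE` is the expansion of the row comparison. The compact row form is usually the one to keep in real queries, since the planner can more readily treat it as a single index range condition.
+
+```sql
+-- Hypothetical index: same columns, directions, and NULL placement
+-- as the ORDER BY used for the feed.
+CREATE INDEX articles_feed_idx
+  ON articles (published_at DESC NULLS LAST, id DESC);
+
+-- (published_at, id) < ($1, $2), written out long-hand.
+PREPARE next_page (timestamptz, int) AS
+SELECT id, title, published_at
+FROM articles
+WHERE published_at < $1
+   OR (published_at = $1 AND id < $2)
+ORDER BY published_at DESC NULLS LAST, id DESC
+LIMIT 5;
+
+-- Pass the key of the last row you saw; this asks for up to five
+-- articles that sort after article 12 in newest-first order.
+EXECUTE next_page('2024-03-18 13:20:00+00', 12);
+```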
+
+The downside: you can't jump to "page 42" — you walk forward one page at a time. For feeds and infinite scroll that's fine; for an admin grid with a page picker, `OFFSET` is the easier fit.
+
+## What you learned
+
+- `ORDER BY` sorts; add `DESC` and `NULLS FIRST`/`LAST` as needed.
+- Always include a tie-breaker (typically the primary key) for deterministic order.
+- `DISTINCT` drops duplicate rows; `DISTINCT ON (col)` picks one row per group, chosen by `ORDER BY`.
+- `LIMIT N OFFSET M` is the obvious way to paginate — and gets slow and unstable on deep pages.
+- Keyset pagination (`WHERE (key) < (last_seen)`) keeps the cost per page roughly flat at any depth and survives concurrent inserts.
+
+<Check id="seed-loaded">
+Mark this lesson done — we'll just confirm the sandbox is healthy.
+</Check>
+
+Up next: collapsing rows into summaries with aggregations and `GROUP BY`.
diff --git a/lessons/01-query-fundamentals/03-sorting-and-pagination/lesson.yaml b/lessons/01-query-fundamentals/03-sorting-and-pagination/lesson.yaml
@@ -0,0 +1,19 @@
+title: Sorting and pagination
+summary: Order results predictably, drop duplicates with DISTINCT, and page through with LIMIT/OFFSET — and why keyset pagination is the better default.
+estimatedMinutes: 12
+tags:
+  - order-by
+  - distinct
+  - limit
+  - offset
+  - pagination
+authors:
+  - exekias
+seed: seed.sql
+checks:
+  - id: seed-loaded
+    type: row-count
+    description: The seeded articles table has 30 rows — click to mark this lesson done.
+    table: articles
+    expect:
+      rowCount: 30
diff --git a/lessons/01-query-fundamentals/03-sorting-and-pagination/seed.sql b/lessons/01-query-fundamentals/03-sorting-and-pagination/seed.sql
@@ -0,0 +1,43 @@
+-- Seed for "03-sorting-and-pagination": a small articles feed with one tie
+-- and two NULLs on published_at so tie-breakers and NULLS LAST are visible,
+-- and some duplicate authors so DISTINCT has something to do.
+
+CREATE TABLE articles (
+  id           serial PRIMARY KEY,
+  title        text NOT NULL,
+  author       text NOT NULL,
+  views        int  NOT NULL,
+  published_at timestamptz
+);
+
+INSERT INTO articles (title, author, views, published_at) VALUES
+  ('Indexing for humans',         'Ada Lovelace',      1200, '2024-01-05 09:00:00+00'),
+  ('The case for MVCC',           'Alan Turing',        890, '2024-01-05 09:00:00+00'),
+  ('Why your query is slow',      'Grace Hopper',      4300, '2024-01-12 14:00:00+00'),
+  ('EXPLAIN, explained',          'Grace Hopper',      3100, '2024-01-20 10:30:00+00'),
+  ('B-trees from first principles','Donald Knuth',     2750, '2024-01-28 08:15:00+00'),
+  ('Postgres tips, vol 1',        'Ada Lovelace',       640, '2024-02-04 16:45:00+00'),
+  ('Postgres tips, vol 2',        'Ada Lovelace',       720, '2024-02-11 16:45:00+00'),
+  ('Joins by example',            'Linus Torvalds',    1810, '2024-02-19 12:00:00+00'),
+  ('LATERAL is fine, actually',   'Barbara Liskov',     980, '2024-02-26 11:00:00+00'),
+  ('When to use JSONB',           'Guido van Rossum',  2210, '2024-03-04 09:30:00+00'),
+  ('When not to use JSONB',       'Guido van Rossum',  1560, '2024-03-11 09:30:00+00'),
+  ('Window functions in anger',   'Margaret Hamilton', 3380, '2024-03-18 13:20:00+00'),
+  ('Reading EXPLAIN ANALYZE',     'Grace Hopper',      4710, '2024-03-25 13:20:00+00'),
+  ('Vacuum and bloat',            'Dennis Ritchie',     430, '2024-04-01 07:00:00+00'),
+  ('A small note on COLLATE',     'Bjarne Stroustrup',  210, NULL),
+  ('GIN vs GiST',                 'Donald Knuth',      1990, '2024-04-15 10:00:00+00'),
+  ('Trigram search basics',       'Ken Thompson',       870, '2024-04-22 10:00:00+00'),
+  ('CTEs are not optimization fences anymore','Linus Torvalds', 2540, '2024-04-29 15:15:00+00'),
+  ('Schema migrations without tears','Barbara Liskov',  3050, '2024-05-06 11:45:00+00'),
+  ('Idempotent INSERTs with ON CONFLICT','Margaret Hamilton', 2890, '2024-05-13 11:45:00+00'),
+  ('Three flavors of UUID',       'Edsger Dijkstra',    760, '2024-05-20 09:00:00+00'),
+  ('Counting is harder than it looks','Ada Lovelace',  1680, '2024-05-27 14:30:00+00'),
+  ('Pagination, the LIMIT/OFFSET trap','Grace Hopper', 5120, '2024-06-03 14:30:00+00'),
+  ('Pagination, the keyset way',  'Grace Hopper',      4870, '2024-06-10 14:30:00+00'),
+  ('Date math in Postgres',       'Guido van Rossum',   930, '2024-06-17 08:30:00+00'),
+  ('Time zones, again',           'Bjarne Stroustrup',  410, '2024-06-24 08:30:00+00'),
+  ('Generated columns: hidden gems','Dennis Ritchie',  1240, '2024-07-01 17:00:00+00'),
+  ('Foreign keys revisited',      'Ken Thompson',      1080, '2024-07-08 17:00:00+00'),
+  ('Locking, lightly',            'Edsger Dijkstra',   1340, '2024-07-15 12:30:00+00'),
+  ('How autovacuum keeps you sane','Margaret Hamilton', 980, NULL);
diff --git a/...y-fundamentals/03-aggregations/lesson.mdx b/...y-fundamentals/04-aggregations/lesson.mdx
diff --git a/...-fundamentals/03-aggregations/lesson.yaml b/...-fundamentals/04-aggregations/lesson.yaml
diff --git a/...ery-fundamentals/03-aggregations/seed.sql b/...ery-fundamentals/04-aggregations/seed.sql