Merged
3 changes: 2 additions & 1 deletion app/lessons/[slug]/page.tsx
@@ -3,6 +3,7 @@ import Link from "next/link";
import { headers } from "next/headers";
import { notFound } from "next/navigation";
import { MDXRemote } from "next-mdx-remote/rsc";
import remarkGfm from "remark-gfm";
import { auth } from "@/lib/auth";
import { getAllLessons, getLesson } from "@/lib/lessons";
import { buildLessonComponents } from "@/components/lesson/mdx-components";
@@ -88,7 +89,7 @@ export default async function LessonPage({

<div className="mt-6 grid gap-6 lg:grid-cols-[minmax(0,1fr)_minmax(0,1.4fr)]">
<article className="prose prose-zinc dark:prose-invert">
- <MDXRemote source={lesson.mdxSource} components={components} />
+ <MDXRemote source={lesson.mdxSource} components={components} options={{ mdxOptions: { remarkPlugins: [remarkGfm] } }} />

{(prev || next) && (
<nav
2 changes: 1 addition & 1 deletion lessons/03-combining-tables/01-joins/lesson.mdx
@@ -112,4 +112,4 @@ Revenue per category, in one query. This is the bread-and-butter of analytics wo
Mark this lesson done — we'll just confirm the sandbox is healthy.
</Check>

- Up next: subqueries and Common Table Expressions (`WITH`) — composing queries from smaller queries.
+ Up next: the rest of the join family — `RIGHT`, `FULL`, `CROSS`, and joining a table to itself.
128 changes: 128 additions & 0 deletions lessons/03-combining-tables/02-advanced-joins/lesson.mdx
@@ -0,0 +1,128 @@
The previous lesson covered `INNER JOIN` and `LEFT JOIN` — the two you'll use most. This one rounds out the family: `RIGHT`, `FULL`, `CROSS`, and the trick of joining a table to itself.

The seed has a small organization: `employees` (with a `manager_id` pointing back into the same table) and `departments` (one of which has no employees). There are also two tiny `shirt_sizes`/`shirt_colors` tables for the CROSS JOIN example.

## RIGHT JOIN: LEFT JOIN, mirrored

`RIGHT JOIN` keeps every row from the *right* table, NULL-padding the left when there's no match. It's `LEFT JOIN` with the tables swapped — and that's almost always how people write it instead, because reading order matters.

<Run>
SELECT d.name AS department, e.full_name
FROM employees e
RIGHT JOIN departments d ON d.id = e.department_id
ORDER BY d.name, e.full_name;
</Run>

Operations has no employees — it shows up with `full_name` as NULL. Same shape if you reversed the tables and used `LEFT JOIN`. In practice: pick `LEFT` and put the "keep all of these" table on the left. Reads more naturally.
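
As a sketch of that equivalence, assuming the seed tables from this lesson are loaded, the same listing can be written with `departments` as the "keep all" table on the left:

```sql
-- Illustrative rewrite of the RIGHT JOIN query above (not part of the original lesson).
-- departments is on the left, so every department is kept,
-- and full_name is NULL where a department has no employees.
SELECT d.name AS department, e.full_name
FROM departments d
LEFT JOIN employees e ON e.department_id = d.id
ORDER BY d.name, e.full_name;
```

Both forms should return the same rows; only the reading order of the tables changes.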

## FULL JOIN: keep both sides

`FULL OUTER JOIN` (the `OUTER` is optional) keeps **every** row from both tables: matched pairs where possible, plus left rows with no right match, plus right rows with no left match — each padded with NULLs on the side that didn't have a partner.

<Run>
SELECT e.full_name, d.name AS department
FROM employees e
FULL JOIN departments d ON d.id = e.department_id
ORDER BY e.full_name NULLS LAST, d.name;
</Run>

Three flavors in the result:

- Most rows: an employee with a department.
- Edsger and Donald (contractors): `department` is NULL.
- Operations: `full_name` is NULL.

`FULL JOIN` shines when you need a single result that reconciles two sources — "all the things we know about, from either side, no matter which side knew about them".

### Finding the "only-one-side" rows

A `FULL JOIN` filtered to rows where one side is NULL gives you the asymmetric difference: rows on one side with no match on the other. Check both sides with `OR` to get every row that lacks a partner, whichever table it came from:

<Run>
SELECT e.full_name, d.name AS department
FROM employees e
FULL JOIN departments d ON d.id = e.department_id
WHERE e.id IS NULL OR d.id IS NULL;
</Run>

Useful for reconciling: "which employees have no department?", "which departments have nobody assigned?"
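
When only one direction matters, the same idea works with a plain `LEFT JOIN` and a single NULL check. A minimal sketch against this lesson's seed, answering only the "which departments have nobody assigned?" question:

```sql
-- Illustrative one-sided version (not part of the original lesson).
-- Keep every department, then filter to those where no employee row matched.
SELECT d.name AS department
FROM departments d
LEFT JOIN employees e ON e.department_id = d.id
WHERE e.id IS NULL;
```

With the seed as written, Operations is the only department without employees, so it should be the only row returned.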

## CROSS JOIN: every row paired with every row

`CROSS JOIN` is the Cartesian product — no `ON` clause, every left row paired with every right row. Result size is `left × right`.

<Run>
SELECT size, color
FROM shirt_sizes
CROSS JOIN shirt_colors
ORDER BY size, color;
</Run>

Three sizes × three colors = nine combinations. Nine shirts to stock. CROSS JOIN earns its keep for:

- **Generating combinations** (sizes × colors, dates × users, regions × products).
- **Pairing every row with a single-row computation**: `CROSS JOIN (SELECT now() AS t) AS clock` injects the current time alongside every row (PostgreSQL versions before 16 require the subquery alias).
- **`generate_series` plus a table**: fill in missing days in a time series.

The implicit-comma form (`FROM a, b`) and the explicit `FROM a CROSS JOIN b` are equivalent, but `CROSS JOIN` makes the intent explicit. Never CROSS JOIN a large table by accident: millions × millions is a long afternoon.
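
To illustrate the `generate_series` use case, here is a hedged sketch that pairs every department with every quarter of 2020 and then counts hires. The quarter boundaries and the `q` / `quarter_start` names are assumptions chosen for illustration; only the `departments` and `employees` tables come from the seed.

```sql
-- Illustrative sketch (not part of the original lesson), PostgreSQL syntax.
-- Step 1: CROSS JOIN builds the full department × quarter grid.
-- Step 2: LEFT JOIN attaches hires that fall inside each grid cell.
SELECT d.name                AS department,
       q.quarter_start::date AS quarter_start,
       count(e.id)           AS hires      -- counts matched employees only, so empty cells show 0
FROM departments d
CROSS JOIN generate_series(
         timestamp '2020-01-01',
         timestamp '2020-10-01',
         interval '3 months'
     ) AS q(quarter_start)
LEFT JOIN employees e
       ON e.department_id = d.id
      AND e.hired_at >= q.quarter_start
      AND e.hired_at <  q.quarter_start + interval '3 months'
GROUP BY d.name, q.quarter_start
ORDER BY d.name, q.quarter_start;
```

Four departments × four quarters gives 16 rows, including the zero-hire combinations that a plain `GROUP BY` over `employees` would leave out.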

## Self-join: a table joined to itself

When rows have relationships to *other rows in the same table*, you join the table to itself with a different alias for each "side". The classic example is an employee/manager hierarchy.

<Run>
SELECT e.full_name AS employee, m.full_name AS manager
FROM employees e
LEFT JOIN employees m ON m.id = e.manager_id
ORDER BY m.full_name NULLS FIRST, e.full_name;
</Run>

Two aliases for the same table: `e` is the row we're describing, `m` is the row we're looking up. The `LEFT JOIN` keeps Ada in the result even though her `manager_id` is NULL (she's at the top), along with the two contractors, who have no manager either.

Self-joins also handle peers (employees in the same department) and chains (manager's manager). For *deep* recursion — "everyone underneath Ada, however many levels down" — you reach for a recursive CTE, which is its own future lesson.
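
The chain and peer patterns mentioned above can be sketched as follows, assuming the seed from this lesson. The aliases `mm`, `a`, and `b` are hypothetical names picked for the example.

```sql
-- Illustrative sketches (not part of the original lesson).

-- Chain: a second self-join walks one more level up the hierarchy.
SELECT e.full_name  AS employee,
       m.full_name  AS manager,
       mm.full_name AS managers_manager
FROM employees e
LEFT JOIN employees m  ON m.id  = e.manager_id
LEFT JOIN employees mm ON mm.id = m.manager_id
ORDER BY e.full_name;

-- Peers: pairs of employees in the same department.
-- a.id < b.id lists each pair once and never pairs a row with itself.
-- Contractors (NULL department_id) never match, because NULL = NULL is not true.
SELECT a.full_name AS employee, b.full_name AS peer
FROM employees a
JOIN employees b
  ON b.department_id = a.department_id
 AND a.id < b.id
ORDER BY a.full_name, b.full_name;
```

In the first query, Linus Torvalds should show Alan Turing as manager and Ada Lovelace as manager's manager. Each extra level needs another join, which is why arbitrary depth calls for a recursive CTE instead.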

## Multi-table joins

Each `JOIN` adds one more table. The query reads top-to-bottom as a pipeline: start with a row, attach a related row, then another.

<Run>
SELECT e.full_name AS employee,
m.full_name AS manager,
d.name AS department
FROM employees e
LEFT JOIN employees m ON m.id = e.manager_id
LEFT JOIN departments d ON d.id = e.department_id
ORDER BY d.name NULLS LAST, e.full_name;
</Run>

Three tables (two of them the same table aliased differently), one result. The `LEFT JOIN` on departments keeps the contractors in the listing even though they have no department.

## When to use which

A useful mental table:

| You want… | Use |
| ------------------------------------------------------------ | -------------- |
| Rows that match on both sides | `INNER JOIN` |
| All rows from the left, matches from the right where present | `LEFT JOIN` |
| Same, but with the tables in the other order | `RIGHT JOIN` |
| All rows from both sides, NULLs where they don't match | `FULL JOIN` |
| Every combination of left × right | `CROSS JOIN` |
| A row's relationship to other rows in the same table | self-join |

Comment on lines +102 to +112
Member Author

@copilot this table didn't render correctly, fix it

Contributor

Fixed in the latest commit. remark-gfm isn't installed, so GFM pipe tables don't render in MDX. Replaced the markdown table with a plain HTML <table> — MDX handles it natively and @tailwindcss/typography's prose class takes care of styling.

Member Author

@copilot I would rather go for remark-gfm approach if that's not too complex

Contributor

Done — installed remark-gfm and added it to MDXRemote's remarkPlugins in page.tsx. The GFM pipe table in the lesson is restored as-is.

A working rule: prefer `LEFT` over `RIGHT` so the "keep all" table is on the left. Reach for `FULL` only when both sides legitimately matter. Use `CROSS` deliberately — never by accident.

## What you learned

- `RIGHT JOIN` = `LEFT JOIN` with the tables flipped; pick one shape and stick with it.
- `FULL OUTER JOIN` keeps unmatched rows from both sides, NULL-padded.
- `WHERE side.id IS NULL` filters down to the asymmetric difference — "missing on this side".
- `CROSS JOIN` is the Cartesian product; useful for generating combinations, deadly by accident.
- Self-joins use two aliases for the same table — the standard pattern for parent/child within one table.
- Chain `JOIN`s to bring in more tables; read top-to-bottom as a pipeline.

<Check id="seed-loaded">
Mark this lesson done — we'll just confirm the sandbox is healthy.
</Check>

Up next: combining results vertically with `UNION`, `INTERSECT`, and `EXCEPT`.
19 changes: 19 additions & 0 deletions lessons/03-combining-tables/02-advanced-joins/lesson.yaml
@@ -0,0 +1,19 @@
title: Advanced joins
summary: RIGHT, FULL, and CROSS joins, self-joins for hierarchies, and chaining joins across many tables.
estimatedMinutes: 14
tags:
- join
- right-join
- full-join
- cross-join
- self-join
authors:
- exekias
seed: seed.sql
checks:
- id: seed-loaded
type: row-count
description: The seeded employees table has 12 rows — click to mark this lesson done.
table: employees
expect:
rowCount: 12
47 changes: 47 additions & 0 deletions lessons/03-combining-tables/02-advanced-joins/seed.sql
@@ -0,0 +1,47 @@
-- Seed for "02-advanced-joins": a small org with employees and departments,
-- so we have material for self-joins (manager_id), FULL joins (department
-- with no employees + employee with no department), and three-table chains.
-- A few sizes and colors give CROSS JOIN something to do.

CREATE TABLE departments (
id serial PRIMARY KEY,
name text NOT NULL UNIQUE
);

INSERT INTO departments (name) VALUES
('Engineering'),
('Design'),
('Sales'),
('Operations'); -- intentionally has no employees, for FULL JOIN

CREATE TABLE employees (
id serial PRIMARY KEY,
full_name text NOT NULL,
department_id int REFERENCES departments(id), -- nullable: contractors
manager_id int REFERENCES employees(id),
hired_at date NOT NULL
);

INSERT INTO employees (full_name, department_id, manager_id, hired_at) VALUES
('Ada Lovelace', 1, NULL, '2020-01-10'), -- 1, no manager (CEO-ish)
('Alan Turing', 1, 1, '2020-03-04'), -- 2, manager: Ada
('Grace Hopper', 1, 1, '2020-04-12'), -- 3, manager: Ada
('Linus Torvalds', 1, 2, '2021-02-19'), -- 4, manager: Alan
('Margaret Hamilton', 2, 1, '2020-06-01'), -- 5, manager: Ada (design lead)
('Barbara Liskov', 2, 5, '2021-05-15'), -- 6, manager: Margaret
('Dennis Ritchie', 3, 1, '2021-09-22'), -- 7, manager: Ada (sales lead)
('Ken Thompson', 3, 7, '2022-01-11'), -- 8, manager: Dennis
('Edsger Dijkstra', NULL, NULL, '2023-03-03'), -- 9, contractor, no dept
('Donald Knuth', NULL, NULL, '2023-08-17'), -- 10, contractor, no dept
('Guido van Rossum', 1, 2, '2024-02-20'), -- 11, manager: Alan
('Bjarne Stroustrup', 2, 5, '2024-07-09'); -- 12, manager: Margaret

CREATE TABLE shirt_sizes (
size text PRIMARY KEY
);
INSERT INTO shirt_sizes (size) VALUES ('S'), ('M'), ('L');

CREATE TABLE shirt_colors (
color text PRIMARY KEY
);
INSERT INTO shirt_colors (color) VALUES ('black'), ('white'), ('navy');