Add lesson: advanced joins #18
Merged
128 changes: 128 additions & 0 deletions
lessons/03-combining-tables/02-advanced-joins/lesson.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| The previous lesson covered `INNER JOIN` and `LEFT JOIN` — the two you'll use most. This one rounds out the family: `RIGHT`, `FULL`, `CROSS`, and the trick of joining a table to itself. | ||
|
|
||
| The seed has a small organization: `employees` (with a `manager_id` pointing back into the same table) and `departments` (one of which has no employees). There are also two tiny `shirt_sizes`/`shirt_colors` tables for the CROSS JOIN example. | ||
|
|
||
| ## RIGHT JOIN: LEFT JOIN, mirrored | ||
|
|
||
| `RIGHT JOIN` keeps every row from the *right* table, NULL-padding the left when there's no match. It's `LEFT JOIN` with the tables swapped — and that's almost always how people write it instead, because reading order matters. | ||
|
|
||
| <Run> | ||
| SELECT d.name AS department, e.full_name | ||
| FROM employees e | ||
| RIGHT JOIN departments d ON d.id = e.department_id | ||
| ORDER BY d.name, e.full_name; | ||
| </Run> | ||
|
|
||
| Operations has no employees — it shows up with `full_name` as NULL. Same shape if you reversed the tables and used `LEFT JOIN`. In practice: pick `LEFT` and put the "keep all of these" table on the left. Reads more naturally. | ||
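As a sketch of that equivalence, here is the same query rewritten with the tables swapped and `LEFT JOIN` in place of `RIGHT JOIN`. Against the seed it should return the same rows, including Operations with a NULL `full_name`:

```sql
-- Same result as the RIGHT JOIN above: departments is now the
-- "keep all of these" table, so it goes on the left.
SELECT d.name AS department, e.full_name
FROM departments d
LEFT JOIN employees e ON e.department_id = d.id
ORDER BY d.name, e.full_name;
```

The contractors (no department) appear in neither version, because both forms preserve only the `departments` side.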
|
|
||
| ## FULL JOIN: keep both sides | ||
|
|
||
| `FULL OUTER JOIN` (the `OUTER` is optional) keeps **every** row from both tables: matched pairs where possible, plus left rows with no right match, plus right rows with no left match — each padded with NULLs on the side that didn't have a partner. | ||
|
|
||
| <Run> | ||
| SELECT e.full_name, d.name AS department | ||
| FROM employees e | ||
| FULL JOIN departments d ON d.id = e.department_id | ||
| ORDER BY e.full_name NULLS LAST, d.name; | ||
| </Run> | ||
|
|
||
| Three flavors in the result: | ||
|
|
||
| - Most rows: an employee with a department. | ||
| - Edsger and Donald (contractors): `department` is NULL. | ||
| - Operations: `full_name` is NULL. | ||
|
|
||
| `FULL JOIN` shines when you need a single result that reconciles two sources — "all the things we know about, from either side, no matter which side knew about them". | ||
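One way to make the reconciliation explicit is to label each row by which side it came from. This is a minimal sketch; the `match_status` column and its three labels are illustrative names, not part of the lesson's schema:

```sql
-- Label every FULL JOIN row by which side supplied it.
SELECT e.full_name,
       d.name AS department,
       CASE
         WHEN e.id IS NULL THEN 'department only'  -- no employee matched
         WHEN d.id IS NULL THEN 'employee only'    -- no department matched
         ELSE 'matched'
       END AS match_status
FROM employees e
FULL JOIN departments d ON d.id = e.department_id
ORDER BY match_status, e.full_name NULLS LAST, d.name;
```

Testing the primary keys (`e.id`, `d.id`) rather than other columns is deliberate: a primary key is only NULL when the whole side was padded by the join.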
|
|
||
| ### Finding the "only-one-side" rows | ||
|
|
||
| A `FULL JOIN` filtered to rows where one side is NULL gives you a one-sided difference: things on the left with no match, or vice versa. Combine both NULL checks to get the symmetric difference, the rows that exist on only one side: | ||
|
|
||
| <Run> | ||
| SELECT e.full_name, d.name AS department | ||
| FROM employees e | ||
| FULL JOIN departments d ON d.id = e.department_id | ||
| WHERE e.id IS NULL OR d.id IS NULL; | ||
| </Run> | ||
|
|
||
| Useful for reconciling: "who's missing from the departments table?", "which departments have nobody assigned?" | ||
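When only one of those two questions matters, a `LEFT JOIN` with a single NULL check is enough. The sketch below swaps in that anti-join pattern as an alternative to the `FULL JOIN` filter above, one query per direction:

```sql
-- Departments with nobody assigned (rows only on the departments side).
SELECT d.name AS department
FROM departments d
LEFT JOIN employees e ON e.department_id = d.id
WHERE e.id IS NULL;

-- Employees with no department (rows only on the employees side).
SELECT e.full_name
FROM employees e
LEFT JOIN departments d ON d.id = e.department_id
WHERE d.id IS NULL;
```

With the seed data, the first query should return Operations and the second should return the two contractors.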
|
|
||
| ## CROSS JOIN: every row paired with every row | ||
|
|
||
| `CROSS JOIN` is the Cartesian product — no `ON` clause, every left row paired with every right row. Result size is `left × right`. | ||
|
|
||
| <Run> | ||
| SELECT size, color | ||
| FROM shirt_sizes | ||
| CROSS JOIN shirt_colors | ||
| ORDER BY size, color; | ||
| </Run> | ||
|
|
||
| Three sizes × three colors = nine combinations. Nine shirts to stock. CROSS JOIN earns its keep for: | ||
|
|
||
| - **Generating combinations** (sizes × colors, dates × users, regions × products). | ||
| - **Pairing every row with a single-row computation** — `CROSS JOIN (SELECT now() AS t)` injects the current time alongside every row. | ||
| - **`generate_series` plus a table** — fill in missing days in a time series. | ||
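The gap-filling idea from the last bullet can be sketched against the seed by crossing `departments` with a generated list of years, then counting hires in each cell. The 2020 to 2024 range and the `hire_year` alias are assumptions chosen to fit the seed's `hired_at` values:

```sql
-- Build a complete department × year grid, then count hires per cell.
-- CROSS JOIN guarantees every combination exists; the LEFT JOIN attaches
-- hires where there are any, and count(e.id) yields 0 where there are none.
SELECT d.name AS department,
       y.hire_year,
       count(e.id) AS hires
FROM departments d
CROSS JOIN generate_series(2020, 2024) AS y(hire_year)
LEFT JOIN employees e
       ON e.department_id = d.id
      AND extract(year FROM e.hired_at) = y.hire_year
GROUP BY d.name, y.hire_year
ORDER BY d.name, y.hire_year;
```

Four departments × five years gives 20 rows, including zero-hire cells that a plain `GROUP BY` on `employees` would leave out.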
|
|
||
| The implicit comma form (`FROM a, b`) and the explicit form (`FROM a CROSS JOIN b`) are equivalent, but `CROSS JOIN` makes the intent explicit. Never accidentally CROSS JOIN a large table: millions × millions is a long afternoon. | ||
|
|
||
| ## Self-join: a table joined to itself | ||
|
|
||
| When rows have relationships to *other rows in the same table*, you join the table to itself with a different alias for each "side". The classic example is an employee/manager hierarchy. | ||
|
|
||
| <Run> | ||
| SELECT e.full_name AS employee, m.full_name AS manager | ||
| FROM employees e | ||
| LEFT JOIN employees m ON m.id = e.manager_id | ||
| ORDER BY m.full_name NULLS FIRST, e.full_name; | ||
| </Run> | ||
|
|
||
| Two aliases for the same table: `e` is the row we're describing, `m` is the row we're looking up. The `LEFT JOIN` keeps Ada in the result even though her `manager_id` is NULL (she's at the top), along with the two contractors, who also have no manager. | ||
|
|
||
| Self-joins also handle peers (employees in the same department) and chains (manager's manager). For *deep* recursion — "everyone underneath Ada, however many levels down" — you reach for a recursive CTE, which is its own future lesson. | ||
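Both of those shapes can be sketched with the seed. The aliases `mm`, `a` and `b` are illustrative; each extra level or "side" is just one more alias on `employees`:

```sql
-- Chain: employee -> manager -> manager's manager, one alias per level.
SELECT e.full_name  AS employee,
       m.full_name  AS manager,
       mm.full_name AS managers_manager
FROM employees e
LEFT JOIN employees m  ON m.id  = e.manager_id
LEFT JOIN employees mm ON mm.id = m.manager_id
ORDER BY e.full_name;

-- Peers: pairs of distinct employees in the same department.
-- a.id < b.id lists each pair once and never pairs a row with itself.
SELECT a.full_name AS employee, b.full_name AS peer
FROM employees a
JOIN employees b
  ON b.department_id = a.department_id
 AND a.id < b.id
ORDER BY a.full_name, b.full_name;
```

The contractors drop out of the peers query because a NULL `department_id` never compares equal to anything, including another NULL.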
|
|
||
| ## Multi-table joins | ||
|
|
||
| Each `JOIN` adds one more table. The query reads top-to-bottom as a pipeline: start with a row, attach a related row, then another. | ||
|
|
||
| <Run> | ||
| SELECT e.full_name AS employee, | ||
| m.full_name AS manager, | ||
| d.name AS department | ||
| FROM employees e | ||
| LEFT JOIN employees m ON m.id = e.manager_id | ||
| LEFT JOIN departments d ON d.id = e.department_id | ||
| ORDER BY d.name NULLS LAST, e.full_name; | ||
| </Run> | ||
|
|
||
| Three tables (two of them the same table aliased differently), one result. The `LEFT JOIN` on departments keeps the contractors in the listing even though they have no department. | ||
|
|
||
| ## When to use which | ||
|
|
||
| A useful mental table: | ||
|
|
||
| | You want… | Use | | ||
| | ------------------------------------------------------------ | -------------- | | ||
| | Rows that match on both sides | `INNER JOIN` | | ||
| | All rows from the left, matches from the right where present | `LEFT JOIN` | | ||
| | Same, but with the tables in the other order | `RIGHT JOIN` | | ||
| | All rows from both sides, NULLs where they don't match | `FULL JOIN` | | ||
| | Every combination of left × right | `CROSS JOIN` | | ||
| | A row's relationship to other rows in the same table | self-join | | ||
|
|
||
| A working rule: prefer `LEFT` over `RIGHT` so the "keep all" table is on the left. Reach for `FULL` only when both sides legitimately matter. Use `CROSS` deliberately — never by accident. | ||
|
|
||
| ## What you learned | ||
|
|
||
| - `RIGHT JOIN` = `LEFT JOIN` with the tables flipped; pick one shape and stick with it. | ||
| - `FULL OUTER JOIN` keeps unmatched rows from both sides, NULL-padded. | ||
| - `WHERE side.id IS NULL` filters an outer join down to the unmatched rows, the ones "missing on this side". | ||
| - `CROSS JOIN` is the Cartesian product; useful for generating combinations, deadly by accident. | ||
| - Self-joins use two aliases for the same table — the standard pattern for parent/child within one table. | ||
| - Chain `JOIN`s to bring in more tables; read top-to-bottom as a pipeline. | ||
|
|
||
| <Check id="seed-loaded"> | ||
| Mark this lesson done — we'll just confirm the sandbox is healthy. | ||
| </Check> | ||
|
|
||
| Up next: combining results vertically with `UNION`, `INTERSECT`, and `EXCEPT`. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| title: Advanced joins | ||
| summary: RIGHT, FULL, and CROSS joins, self-joins for hierarchies, and chaining joins across many tables. | ||
| estimatedMinutes: 14 | ||
| tags: | ||
| - join | ||
| - right-join | ||
| - full-join | ||
| - cross-join | ||
| - self-join | ||
| authors: | ||
| - exekias | ||
| seed: seed.sql | ||
| checks: | ||
| - id: seed-loaded | ||
| type: row-count | ||
| description: The seeded employees table has 12 rows — click to mark this lesson done. | ||
| table: employees | ||
| expect: | ||
| rowCount: 12 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| -- Seed for "02-advanced-joins": a small org with employees and departments, | ||
| -- giving us material for self-joins (manager_id), FULL joins (a department | ||
| -- with no employees plus employees with no department), and three-table | ||
| -- chains. A few sizes and colors give CROSS JOIN something to do. | ||
|
|
||
| CREATE TABLE departments ( | ||
| id serial PRIMARY KEY, | ||
| name text NOT NULL UNIQUE | ||
| ); | ||
|
|
||
| INSERT INTO departments (name) VALUES | ||
| ('Engineering'), | ||
| ('Design'), | ||
| ('Sales'), | ||
| ('Operations'); -- intentionally has no employees, for FULL JOIN | ||
|
|
||
| CREATE TABLE employees ( | ||
| id serial PRIMARY KEY, | ||
| full_name text NOT NULL, | ||
| department_id int REFERENCES departments(id), -- nullable: contractors | ||
| manager_id int REFERENCES employees(id), | ||
| hired_at date NOT NULL | ||
| ); | ||
|
|
||
| INSERT INTO employees (full_name, department_id, manager_id, hired_at) VALUES | ||
| ('Ada Lovelace', 1, NULL, '2020-01-10'), -- 1, no manager (CEO-ish) | ||
| ('Alan Turing', 1, 1, '2020-03-04'), -- 2, manager: Ada | ||
| ('Grace Hopper', 1, 1, '2020-04-12'), -- 3, manager: Ada | ||
| ('Linus Torvalds', 1, 2, '2021-02-19'), -- 4, manager: Alan | ||
| ('Margaret Hamilton', 2, 1, '2020-06-01'), -- 5, manager: Ada (design lead) | ||
| ('Barbara Liskov', 2, 5, '2021-05-15'), -- 6, manager: Margaret | ||
| ('Dennis Ritchie', 3, 1, '2021-09-22'), -- 7, manager: Ada (sales lead) | ||
| ('Ken Thompson', 3, 7, '2022-01-11'), -- 8, manager: Dennis | ||
| ('Edsger Dijkstra', NULL, NULL, '2023-03-03'), -- 9, contractor, no dept | ||
| ('Donald Knuth', NULL, NULL, '2023-08-17'), -- 10, contractor, no dept | ||
| ('Guido van Rossum', 1, 2, '2024-02-20'), -- 11, manager: Alan | ||
| ('Bjarne Stroustrup', 2, 5, '2024-07-09'); -- 12, manager: Margaret | ||
|
|
||
| CREATE TABLE shirt_sizes ( | ||
| size text PRIMARY KEY | ||
| ); | ||
| INSERT INTO shirt_sizes (size) VALUES ('S'), ('M'), ('L'); | ||
|
|
||
| CREATE TABLE shirt_colors ( | ||
| color text PRIMARY KEY | ||
| ); | ||
| INSERT INTO shirt_colors (color) VALUES ('black'), ('white'), ('navy'); |
@copilot this table didn't render correctly, fix it
Fixed in the latest commit.
`remark-gfm` isn't installed, so GFM pipe tables don't render in MDX. Replaced the markdown table with a plain HTML `<table>`: MDX handles it natively, and the `prose` class from `@tailwindcss/typography` takes care of styling.
@copilot I would rather go for remark-gfm approach if that's not too complex
Done: installed `remark-gfm` and added it to `MDXRemote`'s `remarkPlugins` in `page.tsx`. The GFM pipe table in the lesson is restored as-is.