Commit 4fccbf4

refactor: make exercises testable
1 parent b930d1f commit 4fccbf4

60 files changed, +1270 -836 lines changed

sources/academy/webscraping/scraping_basics_javascript/05_parsing_html.md

Lines changed: 2 additions & 14 deletions
@@ -9,6 +9,7 @@ import CodeBlock from '@theme/CodeBlock';
 import LegacyJsCourseAdmonition from '@site/src/components/LegacyJsCourseAdmonition';
 import Exercises from '../scraping_basics/_exercises.mdx';
 import F1AcademyTeamsExercise from '!!raw-loader!roa-loader!./exercises/f1academy_teams.mjs';
+import F1AcademyDriversExercise from '!!raw-loader!roa-loader!./exercises/f1academy_drivers.mjs';

 <LegacyJsCourseAdmonition />

@@ -195,19 +196,6 @@ Use the same URL as in the previous exercise, but this time print a total count
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const url = "https://www.f1academy.com/Racing-Series/Teams";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-  console.log($(".driver").length);
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{F1AcademyDriversExercise.code}</CodeBlock>

 </details>

sources/academy/webscraping/scraping_basics_javascript/06_locating_elements.md

Lines changed: 7 additions & 65 deletions
@@ -5,8 +5,12 @@ description: Lesson about building a Node.js application for watching prices. Us
 slug: /scraping-basics-javascript/locating-elements
 ---

+import CodeBlock from '@theme/CodeBlock';
 import LegacyJsCourseAdmonition from '@site/src/components/LegacyJsCourseAdmonition';
 import Exercises from '../scraping_basics/_exercises.mdx';
+import WikipediaCountriesExercise from '!!raw-loader!roa-loader!./exercises/wikipedia_countries.mjs';
+import WikipediaCountriesSingleSelectorExercise from '!!raw-loader!roa-loader!./exercises/wikipedia_countries_single_selector.mjs';
+import GuardianF1TitlesExercise from '!!raw-loader!roa-loader!./exercises/guardian_f1_titles.mjs';

 <LegacyJsCourseAdmonition />

@@ -239,35 +243,7 @@ Djibouti
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const url = "https://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_in_Africa";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const tableElement of $(".wikitable").toArray()) {
-    const $table = $(tableElement);
-    const $rows = $table.find("tr");
-
-    for (const rowElement of $rows.toArray()) {
-      const $row = $(rowElement);
-      const $cells = $row.find("td");
-
-      if ($cells.length > 0) {
-        const $thirdColumn = $($cells[2]);
-        const $link = $thirdColumn.find("a").first();
-        console.log($link.text());
-      }
-    }
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{WikipediaCountriesExercise.code}</CodeBlock>

 Because some rows contain [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th), we skip processing a row if `table_row.select("td")` doesn't find any [table data](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td) cells.

@@ -289,25 +265,7 @@ You may want to check out the following pages:
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const url = "https://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_in_Africa";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $(".wikitable tr td:nth-child(3)").toArray()) {
-    const $nameCell = $(element);
-    const $link = $nameCell.find("a").first();
-    console.log($link.text());
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{WikipediaCountriesSingleSelectorExercise.code}</CodeBlock>

 </details>

@@ -331,22 +289,6 @@ Max Verstappen wins Canadian Grand Prix: F1 – as it happened
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const url = "https://www.theguardian.com/sport/formulaone";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $("#maincontent ul li h3").toArray()) {
-    console.log($(element).text());
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{GuardianF1TitlesExercise.code}</CodeBlock>

 </details>

sources/academy/webscraping/scraping_basics_javascript/07_extracting_data.md

Lines changed: 7 additions & 97 deletions
Original file line numberDiff line numberDiff line change
@@ -5,8 +5,12 @@ description: Lesson about building a Node.js application for watching prices. Us
 slug: /scraping-basics-javascript/extracting-data
 ---

+import CodeBlock from '@theme/CodeBlock';
 import LegacyJsCourseAdmonition from '@site/src/components/LegacyJsCourseAdmonition';
 import Exercises from '../scraping_basics/_exercises.mdx';
+import WarehouseUnitsExercise from '!!raw-loader!roa-loader!./exercises/warehouse_units.mjs';
+import WarehouseUnitsRegexExercise from '!!raw-loader!roa-loader!./exercises/warehouse_units_regex.mjs';
+import GuardianPublishDatesExercise from '!!raw-loader!roa-loader!./exercises/guardian_publish_dates.mjs';

 <LegacyJsCourseAdmonition />
1216

@@ -240,41 +244,7 @@ Denon AH-C720 In-Ear Headphones | 236
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-function parseUnitsText(text) {
-  const count = text
-    .replace("In stock,", "")
-    .replace("Only", "")
-    .replace(" left", "")
-    .replace("units", "")
-    .trim();
-  return count === "Sold out" ? 0 : parseInt(count);
-}
-
-const url = "https://warehouse-theme-metal.myshopify.com/collections/sales";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $(".product-item").toArray()) {
-    const $productItem = $(element);
-
-    const title = $productItem.find(".product-item__title");
-    const title = $title.text().trim();
-
-    const unitsText = $productItem.find(".product-item__inventory").text();
-    const unitsCount = parseUnitsText(unitsText);
-
-    console.log(`${title} | ${unitsCount}`);
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{WarehouseUnitsExercise.code}</CodeBlock>

 :::tip Conditional (ternary) operator

@@ -291,39 +261,7 @@ Simplify the code from previous exercise. Use [regular expressions](https://deve
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-function parseUnitsText(text) {
-  const match = text.match(/\d+/);
-  if (match) {
-    return parseInt(match[0]);
-  }
-  return 0;
-}
-
-const url = "https://warehouse-theme-metal.myshopify.com/collections/sales";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $(".product-item").toArray()) {
-    const $productItem = $(element);
-
-    const $title = $productItem.find(".product-item__title");
-    const title = $title.text().trim();
-
-    const unitsText = $productItem.find(".product-item__inventory").text();
-    const unitsCount = parseUnitsText(unitsText);
-
-    console.log(`${title} | ${unitsCount}`);
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{WarehouseUnitsRegexExercise.code}</CodeBlock>

 :::tip Conditional (ternary) operator

@@ -363,34 +301,6 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const url = "https://www.theguardian.com/sport/formulaone";
-const response = await fetch(url);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $("#maincontent ul li").toArray()) {
-    const $article = $(element);
-
-    const title = $article
-      .find("h3")
-      .text()
-      .trim();
-    const dateText = $article
-      .find("time")
-      .attr("datetime")
-      .trim();
-    const date = new Date(dateText);
-
-    console.log(`${title} | ${date.toDateString()}`);
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{GuardianPublishDatesExercise.code}</CodeBlock>

 </details>

sources/academy/webscraping/scraping_basics_javascript/08_saving_data.md

Lines changed: 3 additions & 9 deletions
@@ -5,7 +5,9 @@ description: Lesson about building a Node.js application for watching prices. Us
 slug: /scraping-basics-javascript/saving-data
 ---

+import CodeBlock from '@theme/CodeBlock';
 import LegacyJsCourseAdmonition from '@site/src/components/LegacyJsCourseAdmonition';
+import ProcessProductsJsonExercise from '!!raw-loader!roa-loader!./exercises/process_products_json.mjs';

 <LegacyJsCourseAdmonition />
1113

@@ -210,15 +212,7 @@ Write a new Node.js program that reads the `products.json` file we created in th
 <details>
 <summary>Solution</summary>

-```js
-import { readFile } from "fs/promises";
-
-const jsonData = await readFile("products.json");
-const data = JSON.parse(jsonData);
-data
-  .filter(row => row.minPrice > 50000)
-  .forEach(row => console.log(row));
-```
+<CodeBlock language="js">{ProcessProductsJsonExercise.code}</CodeBlock>

 </details>

sources/academy/webscraping/scraping_basics_javascript/09_getting_links.md

Lines changed: 5 additions & 39 deletions
@@ -5,8 +5,11 @@ description: Lesson about building a Node.js application for watching prices. Us
 slug: /scraping-basics-javascript/getting-links
 ---

+import CodeBlock from '@theme/CodeBlock';
 import LegacyJsCourseAdmonition from '@site/src/components/LegacyJsCourseAdmonition';
 import Exercises from '../scraping_basics/_exercises.mdx';
+import WikipediaCountryLinksExercise from '!!raw-loader!roa-loader!./exercises/wikipedia_country_links.mjs';
+import GuardianF1LinksExercise from '!!raw-loader!roa-loader!./exercises/guardian_f1_links.mjs';

 <LegacyJsCourseAdmonition />

@@ -342,26 +345,7 @@ https://en.wikipedia.org/wiki/Botswana
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const listingURL = "https://en.wikipedia.org/wiki/List_of_sovereign_states_and_dependent_territories_in_Africa";
-const response = await fetch(listingURL);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $(".wikitable tr td:nth-child(3)").toArray()) {
-    const nameCell = $(element);
-    const link = nameCell.find("a").first();
-    const url = new URL(link.attr("href"), listingURL).href;
-    console.log(url);
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{WikipediaCountryLinksExercise.code}</CodeBlock>

 </details>

@@ -386,25 +370,7 @@ https://www.theguardian.com/sport/article/2024/sep/02/max-verstappen-damns-his-u
 <details>
 <summary>Solution</summary>

-```js
-import * as cheerio from 'cheerio';
-
-const listingURL = "https://www.theguardian.com/sport/formulaone";
-const response = await fetch(listingURL);
-
-if (response.ok) {
-  const html = await response.text();
-  const $ = cheerio.load(html);
-
-  for (const element of $("#maincontent ul li").toArray()) {
-    const link = $(element).find("a").first();
-    const url = new URL(link.attr("href"), listingURL).href;
-    console.log(url);
-  }
-} else {
-  throw new Error(`HTTP ${response.status}`);
-}
-```
+<CodeBlock language="js">{GuardianF1LinksExercise.code}</CodeBlock>

 Note that some cards contain two links. One leads to the article, and one to the comments. If we selected all the links in the list by `#maincontent ul li a`, we would get incorrect output like this:
