|
51 | 51 | "\n", |
52 | 52 | "--- \n", |
53 | 53 | "\n", |
54 | | - "## 1️⃣ Load the BBC News Dataset\n", |
| 54 | + "## 💾 Load the BBC News Dataset\n", |
55 | 55 | "\n", |
56 | 56 | "We’ll start by loading the **BBC News** dataset using the Hugging Face `datasets` library.\n", |
57 | 57 | "\n", |
|
64 | 64 | "\n", |
65 | 65 | "Each article is represented as a short paragraph of text, and labeled with its corresponding topic. While we won’t use the labels for **unsupervised topic modeling**, they are useful later for evaluation and visualization.\n", |
66 | 66 | "\n", |
67 | | - "Let’s load the data and preview a few examples. We will concatenate the test and train sets to have a larger dataset to play with.\n" |
| 67 | + "Let’s load the data and preview a few examples. We will concatenate the test\n" |
68 | 68 | ] |
69 | 69 | }, |
70 | 70 | { |
|
182 | 182 | "cell_type": "markdown", |
183 | 183 | "metadata": {}, |
184 | 184 | "source": [ |
185 | | - "## 2️⃣ Quick Intro to BERTopic + LiteLLM\n", |
| 185 | + "## ℹ️ Quick Intro to BERTopic + LiteLLM\n", |
186 | 186 | "\n", |
187 | 187 | "Now let’s briefly introduce the key tools we’ll use to extract and **interpret** topics:\n", |
188 | 188 | "\n", |
|
303 | 303 | "cell_type": "markdown", |
304 | 304 | "metadata": {}, |
305 | 305 | "source": [ |
306 | | - "## 3️⃣ Fit BERTopic and Explore Topics\n", |
| 306 | + "## 💬 Fit BERTopic and Explore Topics\n", |
307 | 307 | "\n", |
308 | 308 | "Now let’s apply BERTopic to our dataset.\n", |
309 | 309 | "\n", |
|
372 | 372 | "cell_type": "markdown", |
373 | 373 | "metadata": {}, |
374 | 374 | "source": [ |
375 | | - "## 📊 4️⃣ Visualize Topic Clusters with LLM-Generated Titles\n", |
| 375 | + "## 📊 Visualize Topic Clusters with LLM-Generated Titles\n", |
376 | 376 | "\n", |
377 | 377 | "After fitting our BERTopic model on the corpus, the next step is to explore the **semantic structure** of our dataset by visualizing:\n", |
378 | 378 | "\n", |
|
6109 | 6109 | "cell_type": "markdown", |
6110 | 6110 | "metadata": {}, |
6111 | 6111 | "source": [ |
6112 | | - "## 📋 5️⃣ Topic Summary Table & Exploration\n", |
| 6112 | + "## 📋 Topic Summary Table & Exploration\n", |
6113 | 6113 | "\n", |
6114 | 6114 | "Let’s now inspect what each topic is about.\n", |
6115 | 6115 | "\n", |
|
6861 | 6861 | "\n", |
6862 | 6862 | "---\n", |
6863 | 6863 | "\n", |
6864 | | - "## 🔍 6️⃣ Step-by-Step Breakdown: How BERTopic Works\n", |
| 6864 | + "## 🔍 Step-by-Step Breakdown: How BERTopic Works\n", |
6865 | 6865 | "\n", |
6866 | 6866 | "In the first half of this notebook, we focused on using BERTopic like a black box. Now it’s time to **open that box** and understand how it really works.\n", |
6867 | 6867 | "\n", |
|