UofT-DSI · lyuxiaotian · Feb 3, 2026
diff --git a/02_activities/assignments/heatmap_gender.png b/02_activities/assignments/heatmap_gender.png
diff --git a/02_activities/assignments/names_clean.csv b/02_activities/assignments/names_clean.csv
diff --git a/02_activities/assignments/ontario_top_baby_names_female.csv b/02_activities/assignments/ontario_top_baby_names_female.csv
diff --git a/02_activities/assignments/ontario_top_baby_names_male.csv b/02_activities/assignments/ontario_top_baby_names_male.csv
diff --git a/02_activities/assignments/participation/pirategraph.html b/02_activities/assignments/participation/pirategraph.html
diff --git a/02_activities/assignments/top5_names.txt b/02_activities/assignments/top5_names.txt
@@ -0,0 +1,13 @@
+Female Top 5:
+MARIE
+MARY
+JENNIFER
+MARGARET
+ELIZABETH
+
+Male Top 5:
+JOSEPH
+JOHN
+ROBERT
+MICHAEL
+WILLIAM
diff --git a/02_activities/assignments/visualization-1 code and description.ipynb b/02_activities/assignments/visualization-1 code and description.ipynb
diff --git a/02_activities/assignments/visualization-2 code and description.ipynb b/02_activities/assignments/visualization-2 code and description.ipynb
@@ -0,0 +1,130 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "97449146",
+   "metadata": {},
+   "source": [
+    "> What software did you use to create your data visualization?\n\n",
+    "This visualization was created in Tableau Public from the cleaned dataset exported from Python (names_clean.csv). Tableau was selected for its interactive filtering, quick prototyping, and built-in line chart styling.\n\n",
+    "> Who is your intended audience?\n\n",
+    "Members of the general public interested in the “iconic” names of each era; students learning to read time-series charts; and journalistic or educational readers who prefer simple narratives.\n\n",
+    "> What information or message are you trying to convey with your visualization?\n\n",
+    "The chart shows that each generation has its own signature names.\n\n",
+    "> What aspects of design did you consider when making your visualization? How did you apply them? With what elements of your plots?\n\n",
+    "The chart is limited to the Top-5 names per gender to avoid clutter; color encodes name, with semantic grouping; dual panels support comparison between genders; a simple tooltip shows Year and Frequency; and unnecessary gridlines were removed.\n\n",
+    "> How did you ensure that your data visualizations are reproducible? If the tool you used to make your data visualization is not reproducible, how will this impact your data visualization?\n\n",
+    "The visualization depends on the same processed CSV used in Python. Although Tableau involves manual interface steps, all calculations are based on raw fields (Year, Gender, Freq) without hidden transformations, so another user can recreate the chart by following documented steps.\n\n",
+    "> How did you ensure that your data visualization is accessible?\n\n",
+    "Colors were chosen from Tableau’s color-blind palette; lines have adequate thickness; the legend is clearly ordered; and overlapping labels are avoided.\n\n",
+    "> Who are the individuals and communities who might be impacted by your visualization?\n\n",
+    "Readers may connect personally with names from their generation, which can evoke identity and cultural belonging. The chart is descriptive rather than prescriptive and avoids sensitive interpretations.\n\n",
+    "> How did you choose which features of your chosen dataset to include or exclude from your visualization?\n\n",
+    "Only the most representative names were kept to support a clear narrative. Less frequent names were excluded to prevent visual overload and to highlight generational archetypes.\n\n",
+    "> What ‘underwater labour’ contributed to your final data visualization product?\n\n",
+    "Identifying meaningful Top-5 sets from the Python output; experimenting with color palettes; cleaning up the legend order; and testing readability on different screen sizes.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "d1e5fe2c",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Saved: names_clean.csv and top5_names.txt\n",
+      "name\n",
+      "MARIE        127409\n",
+      "MARY         118124\n",
+      "JENNIFER      61705\n",
+      "MARGARET      59965\n",
+      "ELIZABETH     51506\n",
+      "Name: freq, dtype: int64\n",
+      "name\n",
+      "JOSEPH     185607\n",
+      "JOHN       180985\n",
+      "ROBERT     166202\n",
+      "MICHAEL    147362\n",
+      "WILLIAM    147065\n",
+      "Name: freq, dtype: int64\n"
+     ]
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "\n# Paths are relative to this notebook's folder, where both CSVs are committed.\n",
+    "female_path = \"ontario_top_baby_names_female.csv\"\n",
+    "male_path   = \"ontario_top_baby_names_male.csv\"\n",
+    "\n",
+    "df_f = pd.read_csv(female_path)\n",
+    "df_m = pd.read_csv(male_path)\n",
+    "\n# Map both the bilingual and the English-only headers to short column names.\n",
+    "rename_map = {\n",
+    "    \"Year/Année\": \"year\",\n",
+    "    \"Name/Nom\": \"name\",\n",
+    "    \"Frequency/Fréquence\": \"freq\",\n",
+    "    \"Year\": \"year\",\n",
+    "    \"Name\": \"name\",\n",
+    "    \"Frequency\": \"freq\",\n",
+    "}\n",
+    "df_f = df_f.rename(columns=rename_map)\n",
+    "df_m = df_m.rename(columns=rename_map)\n",
+    "\n",
+    "df_f[\"gender\"] = \"Female\"\n",
+    "df_m[\"gender\"] = \"Male\"\n",
+    "\n",
+    "df = pd.concat([df_f, df_m], ignore_index=True)\n",
+    "\n# Normalize names so casing and stray whitespace do not split the counts.\n",
+    "df[\"name\"] = df[\"name\"].astype(str).str.strip().str.upper()\n",
+    "\n# Sum frequencies across all years and keep the five largest totals per gender.\n",
+    "top5_f = (\n",
+    "    df[df[\"gender\"]==\"Female\"]\n",
+    "    .groupby(\"name\")[\"freq\"].sum()\n",
+    "    .nlargest(5)\n",
+    ")\n",
+    "top5_m = (\n",
+    "    df[df[\"gender\"]==\"Male\"]\n",
+    "    .groupby(\"name\")[\"freq\"].sum()\n",
+    "    .nlargest(5)\n",
+    ")\n",
+    "\n",
+    "with open(\"top5_names.txt\", \"w\", encoding=\"utf-8\") as f:\n",
+    "    f.write(\"Female Top 5:\\n\")\n",
+    "    f.write(\"\\n\".join(top5_f.index.tolist()))\n",
+    "    f.write(\"\\n\\nMale Top 5:\\n\")\n",
+    "    f.write(\"\\n\".join(top5_m.index.tolist()))\n",
+    "\n",
+    "df_out = df[[\"year\", \"gender\", \"name\", \"freq\"]].copy()\n",
+    "df_out.to_csv(\"names_clean.csv\", index=False)\n",
+    "\n",
+    "print(\"Saved: names_clean.csv and top5_names.txt\")\n",
+    "print(top5_f)\n",
+    "print(top5_m)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "visualization-env",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/02_activities/assignments/visualizaton-2.png b/02_activities/assignments/visualizaton-2.png
diff --git a/02_activities/assignments/wordcloud_gender.png b/02_activities/assignments/wordcloud_gender.png