79 changes: 79 additions & 0 deletions 02_activities/assignments/assignment3.ipynb

Large diffs are not rendered by default.

28 changes: 14 additions & 14 deletions 02_activities/assignments/assignment_2.md
@@ -10,24 +10,24 @@
- For each visualization (good and bad):
- Explain (with reference to material covered up to date, along with readings and other scholarly sources, as needed) why you classified that visualization the way you did.
```
Good visualization: "10 Year Trends in Prescription Opioid Use in Commercially Insured by Health Care Cost Institute" (https://public.tableau.com/app/profile/health.care.cost.institute/viz/Opioid10YearTrendsBlogInteractiveTool070919/Figure210YearTrends)
The visualization “10 Year Trends in Prescription Opioid Use in the Commercially Insured” by the Health Care Cost Institute is a good example of effective data visualization because it presents complex information in an interpretable manner. One of its main strengths is its use of colour. The colours are distinct but not overly bold (e.g. no red or neon shades), which helps distinguish between opioid types and keeps the viewer's attention on the overall trends. The use of line graphs is also appropriate, as it clearly shows how prescription opioid use changes over time. Since the data cover a ten-year period, the continuous lines make it simple to observe fluctuations and long-term patterns. Another strength is that the visualization incorporates a lot of information without feeling overwhelming. The interactive features are a major advantage: viewers can examine opioid trends for individual states as well as at the national level, which makes the visualization useful for both broad comparisons and location-specific analysis. In addition, the visualization accounts for the different ways opioid prescriptions are measured, such as dosage types and terminology, so the data are presented more completely and accurately. Overall, this visualization is effective because it clearly communicates trends, supports comparison, and is accessible to a wide audience.
Bad visualization: "my 2024 health stats by Arjen Groeneveld" (https://public.tableau.com/app/profile/arjen.groeneveld/viz/24healthstatsfeb25/Dashboard1)
The visualization “My 2024 Health Stats” by Arjen Groeneveld is not an effective data visualization because it is difficult to understand and does not clearly communicate useful insights. While the dashboard separates data into different health categories such as energy, steps, weight, workouts, and sleep, this structure alone does not make the data easy to interpret. The chart on the right-hand side is the most problematic for me because it is very difficult to identify meaningful trends, patterns, or comparisons in it, which undermines the main purpose of data visualization: quick interpretation.
The main issue with the chart is that it does not allow the viewer to easily identify trends or patterns. The axes and visual elements are unclear, and there is no clear way to compare values over time, making it difficult to tell whether health metrics are improving, declining, or staying the same. The visualization forces the reader to spend time decoding what they are looking at, which defeats the purpose of making information easier to understand.
Additionally, the visualization lacks context because there are no reference points or benchmarks to help readers determine whether the metrics presented are good or bad. Without adequate context, the data lose their meaning, and it is hard to draw conclusions about any potential lifestyle changes. Overall, although the idea of tracking personal health data is useful, this visualization does not present the information in a clear way.
```
- How could this data visualization have been improved?
```
While the opioid visualization is already good, there are a few ways it could be made even more effective. One improvement would be adding more context directly to the graph, such as explaining why certain changes happen. For instance, a major policy change or new prescribing guideline could have influenced opioid use during this time. Short annotations or markers on the line graph would help audiences understand what factors could be driving these fluctuations.
The visualization could also benefit from a short summary of key takeaways at the bottom of the figure. A brief takeaway section highlighting the major findings of the analysis would make the data easier to understand at a glance. These additions would help the viewer interpret the data more confidently.

The “My 2024 Health Stats” visualization could be improved by focusing more on clarity and simplicity. One of the biggest improvements would be changing the chart to a more digestible format, e.g. a line graph, because it can illustrate changes in health patterns over time.
Clearer labels and explanations would also be useful. Each chart should state which metric is being plotted and its units. Legends, colours, and icons should be clear and consistent across the plot (e.g. dashes vs. circles). Short descriptions explaining what each section means would help viewers understand the data without misinterpreting it. Separating the health metrics into their own charts instead of combining them all into a single chart would reduce clutter.
A short description of the rationale for selecting each metric would help the reader understand the importance and purpose of the metrics presented. Adding benchmarks or suggestions, such as recommended sleep ranges, step goals, or workout frequency, would also help viewers understand how the data compare to healthy standards. By simplifying the design and focusing on what the viewer needs to understand and the rationale for each element, this dashboard could become a clearer and more informative presentation of personal health data.

```
- Word count should not exceed (as a maximum) 500 words for each visualization (i.e.
69 changes: 69 additions & 0 deletions 02_activities/assignments/assignment_3_ontario.md
@@ -0,0 +1,69 @@
# Data Visualization

## Assignment 3: Final Project

### Requirements:
- We will finish this class by giving you the chance to use what you have learned in a practical context, by creating data visualizations from raw data.
- Choose a dataset of interest from the [City of Toronto’s Open Data Portal](https://www.toronto.ca/city-government/data-research-maps/open-data/) or [Ontario’s Open Data Catalogue](https://data.ontario.ca/).
- Using Python and one other data visualization software (Excel or free alternative, Tableau Public, any other tool you prefer), create two distinct visualizations from your dataset of choice.
- For each visualization, describe and justify:
> What software did you use to create your data visualization? I used Microsoft Excel to create my data visualization.

> Who is your intended audience? My intended audience is public health policymakers, medical professionals, and the general public.

> What information or message are you trying to convey with your visualization? COVID-19 case rates vary significantly depending on vaccination status.

> What aspects of design did you consider when making your visualization? How did you apply them? With what elements of your plots? I selected a grouped bar chart to compare the different vaccination statuses side by side. Distinguishing the groups with discernible colours, labelling the axes clearly, and limiting crowding allowed for a clear presentation of the dataset.

> How did you ensure that your data visualizations are reproducible? If the tool you used to make your data visualization is not reproducible, how will this impact your data visualization? Excel is much less reproducible than a code-based tool such as Python because the dataset has to be updated manually. However, if the data are updated manually in the same Excel file, the chart will refresh to produce an updated version.

> How did you ensure that your data visualization is accessible? I ensured that my data visualization is accessible by preventing overcrowding of the vaccination groups and choosing colours that are clearly different from one another, which makes trends easier to identify.

> Who are the individuals and communities who might be impacted by your visualization? The individuals and communities who might be impacted by my visualization are Ontario citizens considering taking the vaccine and public health policymakers responsible for educating the public on vaccination efficacy.

> How did you choose which features of your chosen dataset to include or exclude from your visualization? The features I included in my visualization are age group, vaccination status, and COVID-19 case rate. This provides a bird's-eye view of how age and vaccination status affect COVID-19 transmission.

> What ‘underwater labour’ contributed to your final data visualization product? The ‘underwater labour’ that contributed to my final data visualization product was choosing the plot type that would illustrate this dataset in the clearest, quickest-to-understand, and most accessible way.
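
The chart described above was built in Excel. For comparison, the same grouped-bar layout (age group on the x-axis, one bar per vaccination status) could be scripted in Python, which avoids the manual update step mentioned in the reproducibility answer. This is only a sketch under stated assumptions: the column names `age_group`, `vaccination_status`, and `cases_per_100k` are hypothetical, and the numbers are made-up placeholders, not values from the Ontario dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up placeholder values, NOT real Ontario figures; column names are assumptions.
records = pd.DataFrame({
    "age_group": ["18-39", "18-39", "40-59", "40-59", "60+", "60+"],
    "vaccination_status": ["Unvaccinated", "Fully vaccinated"] * 3,
    "cases_per_100k": [30.0, 10.0, 25.0, 8.0, 20.0, 6.0],
})


def to_wide(df):
    """Reshape to one row per age group and one column per vaccination status."""
    return df.pivot(index="age_group", columns="vaccination_status",
                    values="cases_per_100k")


def plot_grouped_bars(wide):
    """Draw one cluster of bars per age group, one bar per vaccination status."""
    fig, ax = plt.subplots(figsize=(8, 5))
    x = np.arange(len(wide.index))          # one cluster position per age group
    width = 0.8 / len(wide.columns)         # share 80% of each cluster among the bars
    colours = ["#0072B2", "#E69F00", "#009E73"]  # colour-blind-friendly Okabe-Ito hues
    for i, status in enumerate(wide.columns):
        ax.bar(x + i * width, wide[status], width,
               label=status, color=colours[i % len(colours)])
    # Centre each tick label under its cluster of bars.
    ax.set_xticks(x + width * (len(wide.columns) - 1) / 2)
    ax.set_xticklabels(wide.index)
    ax.set_xlabel("Age group")
    ax.set_ylabel("COVID-19 cases per 100,000 (placeholder values)")
    ax.set_title("COVID-19 case rate by age group and vaccination status")
    ax.legend(title="Vaccination status")
    fig.tight_layout()
    return fig


wide = to_wide(records)
fig = plot_grouped_bars(wide)
```

Calling `fig.savefig("grouped_bars.png", dpi=200)` afterwards would write the chart to disk; rerunning the script on an updated file regenerates the figure without any manual editing.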

- This assignment is intentionally open-ended - you are free to create static or dynamic data visualizations, maps, or whatever form of data visualization you think best communicates your information to your audience of choice!
- Total word count should not exceed **(as a maximum) 1000 words**

### Why am I doing this assignment?:
- This ongoing assignment ensures active participation in the course, and assesses the learning outcomes:
    * Create and customize data visualizations from start to finish in Python
    * Apply general design principles to create accessible and equitable data visualizations
    * Use data visualization to tell a story
- This would be a great project to include in your GitHub Portfolio – put in the effort to make it something worthy of showing prospective employers!

### Rubric:

| Component | Scoring | Requirement |
|-------------------|----------|-----------------------------------------------------------------------------|
| Data Visualizations | Complete/Incomplete | - Data visualizations are distinct from each other<br>- Data visualizations are clearly identified<br>- Different sources/rationales (text with two images of data, if visualizations are labeled)<br>- High-quality visuals (high resolution and clear data)<br>- Data visualizations follow best practices of accessibility |
| Written Explanations | Complete/Incomplete | - All questions from assignment description are answered for each visualization<br>- Explanations are supported by course content or scholarly sources, where needed |
| Code | Complete/Incomplete | - All code is included as an appendix with your final submissions<br>- Code is clearly commented and reproducible |

## Submission Information

🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

### Submission Parameters:
* Submission Due Date: `23:59 - 02/02/2026`
* The branch name for your repo should be: `assignment-3`
* What to submit for this assignment:
    * A folder/directory containing:
        * This file (assignment_3.md)
        * Two data visualizations
        * Two markdown files, one for each visualization, with their written descriptions.
        * Link to your dataset of choice.
        * Complete and commented code as an appendix (for your visualization made with Python, and for the other, if relevant)
* What the pull request link should look like for this assignment: `https://github.com/<your_github_username>/visualization/pull/<pr_id>`
* Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support staff review your submission easily.

Checklist:
- [ ] Create a branch called `assignment-3`.
- [ ] Ensure that the repository is public.
- [ ] Review [the PR description guidelines](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md#guidelines-for-pull-request-descriptions) and adhere to them.
- [ ] Verify that the link is accessible in a private browser window.

If you encounter any difficulties or have questions, please don't hesitate to reach out to our team via our Slack. Our Technical Facilitators and Learning Support staff are here to help you navigate any challenges.
69 changes: 69 additions & 0 deletions 02_activities/assignments/assignment_3_toronto.md
@@ -0,0 +1,69 @@
# Data Visualization

## Assignment 3: Final Project

### Requirements:
- We will finish this class by giving you the chance to use what you have learned in a practical context, by creating data visualizations from raw data.
- Choose a dataset of interest from the [City of Toronto’s Open Data Portal](https://www.toronto.ca/city-government/data-research-maps/open-data/) or [Ontario’s Open Data Catalogue](https://data.ontario.ca/).
- Using Python and one other data visualization software (Excel or free alternative, Tableau Public, any other tool you prefer), create two distinct visualizations from your dataset of choice.
- For each visualization, describe and justify:
> What software did you use to create your data visualization? I used Python to create my data visualization. I used the pandas library to clean the data and matplotlib to plot and present it.

> Who is your intended audience? My intended audience is city planners, government policymakers, and the general public.

> What information or message are you trying to convey with your visualization? Housing demand has been persistent over time.

> What aspects of design did you consider when making your visualization? How did you apply them? With what elements of your plots? I chose a line graph to show how housing demand varied over time. I chose teal as a neutral colour. I added circle markers to identify each recorded date. I reduced the number of labels on the x-axis to prevent crowding. The axes and the graph were titled clearly.

> How did you ensure that your data visualizations are reproducible? If the tool you used to make your data visualization is not reproducible, how will this impact your data visualization? My data visualization is reproducible because I used a Python script that generates the figure from a fixed set of instructions. The script will continue to produce the same design when the data are updated, as long as the same file name is used.

> How did you ensure that your data visualization is accessible? I ensured the data visualization is accessible by selecting a clear and identifiable colour that has high contrast with its background. The font size is also large and clutter is limited, which makes the data fast and easy to interpret.

> Who are the individuals and communities who might be impacted by your visualization? The individuals and communities who might be impacted by my visualization are those who rely on social housing, such as newcomers and low-income households.

> How did you choose which features of your chosen dataset to include or exclude from your visualization? My visualization focuses on the macroscopic view of total active waiting list counts over time. This provides a high-level overview of the housing crisis without being biased by a specific time point.

> What ‘underwater labour’ contributed to your final data visualization product? The ‘underwater labour’ that contributed to my final data visualization product was removing unfilled data points, determining which columns to include or remove, and trying different colours and font sizes.
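
The workflow described in the answers above (clean with pandas, then plot a teal line with circle markers and a reduced set of x-axis labels in matplotlib) could be sketched roughly as follows. This is not the submitted code, which lives in the notebook that the diff does not render. The column names `date` and `active_waiting_list` and the inline sample values are hypothetical stand-ins for the downloaded Toronto file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up placeholder rows, NOT real Toronto figures; column names are assumptions.
SAMPLE_CSV = """date,active_waiting_list
2023-01-31,84000
2023-02-28,
2023-03-31,85200
2023-04-30,85900
2023-05-31,86400
2023-06-30,87100
"""


def load_waiting_list(source):
    """Read the CSV, parse dates, and drop rows with no recorded count."""
    df = pd.read_csv(source, parse_dates=["date"])
    df = df.dropna(subset=["active_waiting_list"])
    return df.sort_values("date").reset_index(drop=True)


def plot_waiting_list(df):
    """Teal line with circle markers and a thinned-out set of date labels."""
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(df["date"], df["active_waiting_list"],
            color="teal", marker="o", linewidth=2)
    # Cap the number of x-axis ticks so the date labels do not crowd each other.
    locator = mdates.AutoDateLocator(minticks=3, maxticks=6)
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
    ax.set_title("Active social housing waiting list over time", fontsize=16)
    ax.set_xlabel("Date", fontsize=13)
    ax.set_ylabel("Active waiting list count (placeholder values)", fontsize=13)
    fig.tight_layout()
    return fig


df = load_waiting_list(io.StringIO(SAMPLE_CSV))
fig = plot_waiting_list(df)
```

In practice `load_waiting_list` would be given the path to the downloaded CSV instead of the inline sample, and `fig.savefig("waiting_list.png", dpi=200)` would export a high-resolution image. Because every step is scripted, rerunning it on an updated file with the same name reproduces the same design.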

- This assignment is intentionally open-ended - you are free to create static or dynamic data visualizations, maps, or whatever form of data visualization you think best communicates your information to your audience of choice!
- Total word count should not exceed **(as a maximum) 1000 words**

### Why am I doing this assignment?:
- This ongoing assignment ensures active participation in the course, and assesses the learning outcomes:
    * Create and customize data visualizations from start to finish in Python
    * Apply general design principles to create accessible and equitable data visualizations
    * Use data visualization to tell a story
- This would be a great project to include in your GitHub Portfolio – put in the effort to make it something worthy of showing prospective employers!

### Rubric:

| Component | Scoring | Requirement |
|-------------------|----------|-----------------------------------------------------------------------------|
| Data Visualizations | Complete/Incomplete | - Data visualizations are distinct from each other<br>- Data visualizations are clearly identified<br>- Different sources/rationales (text with two images of data, if visualizations are labeled)<br>- High-quality visuals (high resolution and clear data)<br>- Data visualizations follow best practices of accessibility |
| Written Explanations | Complete/Incomplete | - All questions from assignment description are answered for each visualization<br>- Explanations are supported by course content or scholarly sources, where needed |
| Code | Complete/Incomplete | - All code is included as an appendix with your final submissions<br>- Code is clearly commented and reproducible |

## Submission Information

🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

### Submission Parameters:
* Submission Due Date: `23:59 - 02/02/2026`
* The branch name for your repo should be: `assignment-3`
* What to submit for this assignment:
    * A folder/directory containing:
        * This file (assignment_3.md)
        * Two data visualizations
        * Two markdown files, one for each visualization, with their written descriptions.
        * Link to your dataset of choice.
        * Complete and commented code as an appendix (for your visualization made with Python, and for the other, if relevant)
* What the pull request link should look like for this assignment: `https://github.com/<your_github_username>/visualization/pull/<pr_id>`
* Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support staff review your submission easily.

Checklist:
- [ ] Create a branch called `assignment-3`.
- [ ] Ensure that the repository is public.
- [ ] Review [the PR description guidelines](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md#guidelines-for-pull-request-descriptions) and adhere to them.
- [ ] Verify that the link is accessible in a private browser window.

If you encounter any difficulties or have questions, please don't hesitate to reach out to our team via our Slack. Our Technical Facilitators and Learning Support staff are here to help you navigate any challenges.