This repository contains a compact data-analysis and visualization project that explores how discourse markers (DM) and non-discourse-marker items (Not DM) are distributed across three age groups (Youth, Adults, Seniors). I created the project as part of a broader linguistic study, and I’m using the repo to demonstrate my skills in:
- R programming
- Data wrangling (tidyverse)
- Statistical visualization (ggplot2)
- Reproducible project structure
- Working with linguistic/categorical data
It serves both as a component of my linguistic research and as a portfolio sample of my analytical workflow.
The project includes:
- A structured dataset of coded discourse-marker functions
- Clean, reproducible R scripts
- Three publication-style figures
- A documented workflow suitable for academic or applied linguistic settings
The focus is on:
- Comparing DM vs Not DM across age groups
- Identifying functional categories most used by each group
- Producing clear and interpretable visual insights
aging-discourse-markers/
│
├─ data/
│ └─ dm_coding_template.xlsx # template for DM/Not DM coding
│
├─ scripts/
│ └─ dm_notdm_agegroup_plots.R # generates all figures
│
├─ figures/
│ ├─ figure1_dm_vs_notdm.png
│ ├─ figure2_dm_functions_age.png
│ └─ figure3_notdm_functions_age.png
│
└─ Aging Discourse Markers.Rproj # RStudio project file
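Figure 1 (figure1_dm_vs_notdm.png): 100% stacked bar chart comparing the share of DM and Not DM items in each age group.
Figure 2 (figure2_dm_functions_age.png): Top functional categories of discourse markers used by each age group.
Figure 3 (figure3_notdm_functions_age.png): Functional categories of Not DM items across age groups.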
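As a rough illustration of how a 100% stacked bar chart like the first figure can be built with ggplot2, here is a minimal sketch. It is not the code from scripts/dm_notdm_agegroup_plots.R: the data frame `counts`, its column names, and all numbers are invented for the example.

```r
library(dplyr)
library(ggplot2)
library(scales)

# Hypothetical toy counts (invented for illustration, not the study data).
counts <- data.frame(
  age_group   = rep(c("Youth", "Adults", "Seniors"), each = 2),
  marker_type = rep(c("DM", "Not DM"), times = 3),
  n           = c(60, 40, 45, 55, 30, 70)
)

# Fix the display order of the age groups and compute within-group shares.
shares <- counts %>%
  mutate(age_group = factor(age_group, levels = c("Youth", "Adults", "Seniors"))) %>%
  group_by(age_group) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

# position = "fill" rescales each bar to 100%, so raw counts can be passed as y.
p <- ggplot(shares, aes(x = age_group, y = n, fill = marker_type)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = label_percent()) +
  labs(x = "Age group", y = "Share of items", fill = NULL) +
  theme_minimal()
```

The `share` column is not needed for the plot itself, but it is handy for checking the percentages or adding labels to the bars.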
- R (tidyverse, ggplot2, scales)
- Data manipulation: pivoting, factor reordering, category harmonization
- Visual design: custom themes, color palettes, percentage scaling
- Reproducibility: RStudio project structure, clean script separation
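The data-manipulation steps listed above (pivoting, factor reordering, category harmonization) might look roughly like the sketch below. The table `wide`, the category labels, and the counts are all hypothetical, and the actual script may organize these steps differently.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Hypothetical wide-format counts; labels and numbers are invented.
wide <- data.frame(
  dm_function = c("Hedging", "hedge", "Filler", "Topic shift"),
  Youth       = c(12, 3, 20, 5),
  Adults      = c(8, 2, 10, 9),
  Seniors     = c(4, 1, 6, 14)
)

long <- wide %>%
  # Category harmonization: fold a variant spelling into one label.
  mutate(dm_function = if_else(dm_function == "hedge", "Hedging", dm_function)) %>%
  # Pivoting: one row per (category, age group) pair.
  pivot_longer(
    cols = c(Youth, Adults, Seniors),
    names_to = "age_group",
    values_to = "n"
  ) %>%
  # Merge the counts of categories that now share a label.
  group_by(dm_function, age_group) %>%
  summarise(n = sum(n), .groups = "drop") %>%
  # Factor reordering: fixed age order, categories sorted by total count.
  mutate(
    age_group   = factor(age_group, levels = c("Youth", "Adults", "Seniors")),
    dm_function = fct_reorder(dm_function, n, .fun = sum)
  )
```

Ordering the category factor by total count means that bars or facets appear sorted by frequency when `long` is passed to ggplot2, without any extra sorting in the plotting code.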
This project demonstrates my ability to:
- Build small but robust analytical pipelines
- Work with linguistic feature coding
- Produce publication-ready visualizations
- Organize projects cleanly for collaboration or peer review
Open the project:
Aging Discourse Markers.Rproj
Run the script:
scripts/dm_notdm_agegroup_plots.R
The figures will regenerate inside figures/.
This analysis was originally prepared for one of my university term papers on language and aging, but I’ve structured and published it here as a portfolio-friendly demonstration of my:
- Linguistic data analysis abilities
- Visualization design
- R workflow discipline
- Academic research tooling
If you’d like to see more of my work, feel free to reach out or explore other repositories.


