jwitcher3/data-analyst-starter-kit

🎒 Data Analyst Starter Kit

This repo helps Excel-native analysts get started with Python and Databricks for everyday analysis.
It walks through loading data, building pivot-style summaries, calculating KPIs, joining datasets, and saving outputs, all using Python and pandas.


📂 Contents

  • notebooks/: A walkthrough .ipynb notebook for analysts
  • scripts/: Reusable helper functions in .py format
  • data/: Sample retail data in CSV format

🧪 Sample Skills Covered

  • Reading and exploring CSVs
  • Groupby summaries and filtering
  • Calculating metrics like AOV
  • Joining datasets (like Excel VLOOKUP)
  • Exporting clean results
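As a rough illustration of the skills listed above, here is a minimal pandas sketch. The data and the column names (`order_id`, `store_id`, `revenue`, `region`) are made up for this example and are assumptions, not the actual schema of the sample file in `data/`.

```python
import io
import pandas as pd

# Hypothetical sample data; the column names are assumptions, not the repo's schema.
orders_csv = io.StringIO(
    "order_id,store_id,revenue\n"
    "1,A,20.0\n"
    "2,A,40.0\n"
    "3,B,30.0\n"
    "4,B,\n"
)
stores = pd.DataFrame({"store_id": ["A", "B"], "region": ["East", "West"]})

# Reading and exploring: load the CSV and count missing values per column.
orders = pd.read_csv(orders_csv)
missing = orders.isna().sum()

# Filtering: keep only the rows that have a revenue value.
valid = orders[orders["revenue"].notna()]

# Groupby summary (similar to an Excel pivot table): orders and revenue per store.
summary = (
    valid.groupby("store_id")
    .agg(orders=("order_id", "nunique"), revenue=("revenue", "sum"))
    .reset_index()
)

# AOV (average order value) = total revenue / number of orders.
summary["aov"] = summary["revenue"] / summary["orders"]

# Join (like Excel VLOOKUP): look up each store's region with a left merge.
report = summary.merge(stores, on="store_id", how="left")

# Export: with no path, to_csv returns the CSV text; pass a file path to write a file.
csv_text = report.to_csv(index=False)
print(report)
```

A left merge keeps every row of the summary even when a store has no match in the lookup table, which is close to how VLOOKUP behaves when it returns a blank or an error for a missing key.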

🧠 Who it's for

  • Analysts new to Databricks, Python, or SQL
  • Teams transitioning from Excel-based workflows
  • Data folks looking to document best practices

📦 Quick Start

from scripts.analysis_helpers import load_csv, null_summary, aov_summary

df = load_csv('data/retail_sample_data.csv')
print(null_summary(df))
print(aov_summary(df))
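The helpers imported above live in `scripts/`, and their implementations are not shown in this README. The sketch below is only a guess at what functions with these names might do; the signatures, the return types, and the `revenue` and `order_id` column names are all assumptions, so check the actual source before relying on any of it.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the repo's helpers; the real signatures may differ.

def load_csv(path_or_buffer):
    """Read a CSV into a DataFrame (thin wrapper around pandas.read_csv)."""
    return pd.read_csv(path_or_buffer)

def null_summary(df):
    """Return the count and percentage of missing values for each column."""
    counts = df.isna().sum()
    pct = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"null_count": counts, "null_pct": pct})

def aov_summary(df, revenue_col="revenue", order_col="order_id"):
    """Return total revenue, distinct order count, and AOV as a dict."""
    total = float(df[revenue_col].sum())  # sum() skips missing values
    orders = int(df[order_col].nunique())
    aov = total / orders if orders else float("nan")
    return {"total_revenue": total, "orders": orders, "aov": aov}

# Small in-memory example; order 2 has two line items, order 3 has no revenue.
sample = io.StringIO("order_id,revenue\n1,10.0\n2,30.0\n2,20.0\n3,\n")
df = load_csv(sample)
nulls = null_summary(df)
result = aov_summary(df)
print(nulls)
print(result)
```

Note that this version counts an order with missing revenue in the denominator, which lowers the AOV; whether that is the right treatment depends on how the data defines an order.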

📫 Created by

James Witcher
LinkedIn
