🎒 Data Analyst Starter Kit

This repo helps Excel-native analysts get started with Python and Databricks for everyday analysis.
It walks through loading data, building pivot-style summaries, calculating KPIs, joining datasets, and saving outputs, all using Python and pandas.

📂 Contents

notebooks/: A walkthrough .ipynb notebook for analysts
scripts/: Reusable helper functions in .py format
data/: Sample retail data in CSV format

🧪 Sample Skills Covered

Reading and exploring CSVs
Groupby summaries and filtering
Calculating metrics like AOV (average order value)
Joining datasets (like Excel VLOOKUP)
Exporting clean results
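
For a feel of how these skills look in pandas, here is a minimal sketch on a tiny in-memory dataset. The column names (`order_id`, `store_id`, `revenue`, `region`) and the output filename are assumptions made for illustration; they are not taken from the sample data in this repo.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data: these columns are assumptions, not the repo's actual schema.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "store_id": ["A", "A", "B", "B"],
    "revenue": [20.0, 40.0, 10.0, 30.0],
})
stores = pd.DataFrame({
    "store_id": ["A", "B"],
    "region": ["East", "West"],
})

# Pivot-like summary: total revenue and distinct order count per store.
summary = orders.groupby("store_id", as_index=False).agg(
    total_revenue=("revenue", "sum"),
    order_count=("order_id", "nunique"),
)

# AOV = total revenue / number of orders.
summary["aov"] = summary["total_revenue"] / summary["order_count"]

# VLOOKUP-style join: a left merge keeps every summary row and adds the region.
result = summary.merge(stores, on="store_id", how="left")

# Filtering: keep only stores whose AOV is at least 25.
high_aov = result[result["aov"] >= 25]

# Export the clean result (written to the system temp directory here).
out_path = Path(tempfile.gettempdir()) / "store_summary.csv"
result.to_csv(out_path, index=False)
```

A left merge is the closest match to VLOOKUP because rows without a match are kept, with missing values in the looked-up columns, rather than dropped.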

🧠 Who it's for

Analysts new to Databricks, Python, or SQL
Teams transitioning from Excel-based workflows
Data folks looking to document best practices

📦 Quick Start

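# Run from the repository root so that the scripts package is importable.
from scripts.analysis_helpers import load_csv, null_summary, aov_summary

# Load the sample retail data.
df = load_csv('data/retail_sample_data.csv')

# Check for missing values, then compute the AOV summary.
print(null_summary(df))
print(aov_summary(df))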
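
The helper source lives in scripts/ and is not reproduced here. As a rough sketch of what helpers along these lines might do, the functions below are hypothetical stand-ins (note the `_sketch` suffix), and the `order_id` and `revenue` column names are assumptions rather than the actual schema or implementation.

```python
import pandas as pd


def null_summary_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical: count and percentage of missing values per column."""
    counts = df.isna().sum()
    return pd.DataFrame({
        "null_count": counts,
        "null_pct": (counts / len(df) * 100).round(2),
    })


def aov_summary_sketch(df: pd.DataFrame,
                       order_col: str = "order_id",
                       revenue_col: str = "revenue") -> float:
    """Hypothetical: average order value = total revenue / distinct orders."""
    n_orders = df[order_col].nunique()
    if n_orders == 0:
        return float("nan")
    # sum() skips missing revenue values by default.
    return df[revenue_col].sum() / n_orders


# Tiny made-up dataset with one missing revenue value.
demo = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "revenue": [10.0, 20.0, None, 30.0],
})
nulls = null_summary_sketch(demo)
aov = aov_summary_sketch(demo)
```

In this sketch an order with missing revenue still counts toward the number of orders, which lowers the AOV; whether that is the right treatment depends on the dataset.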

📫 Created by

James Witcher
LinkedIn
