This project demonstrates how I build a real-world, terminal-driven data analytics pipeline using Python and Bash.
The goal is to simulate an end-to-end analytics workflow where raw country-level socio-economic data is:
- extracted,
- cleaned and transformed (ETL),
- analyzed using Python,
- and exported as analytics-ready outputs for BI tools such as Tableau or Power BI.
This project focuses on practical data analyst skills, not just theory.
Raw global development datasets often come in messy formats and are not directly usable for analysis or dashboards.
This project solves that problem by:
- automating data cleaning and transformation,
- generating summary statistics and visual insights,
- and producing clean CSV outputs ready for reporting and visualization.
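As a rough illustration of the cleaning and summary steps listed above, here is a minimal Pandas sketch. The column names, sample values, and the `clean` helper are invented for this example and are not taken from `etl_pipeline.py`; the real dataset may need different rules.

```python
import io

import pandas as pd

# Hypothetical raw extract: headers and values are made up for illustration.
RAW_CSV = """Country , GDP per capita ,Life Expectancy,Child Mortality
Germany,"46,200",81.0,3.7
 india ,2100,69.7,34.3
Germany,"46,200",81.0,3.7
Brazil,..,75.9,14.0
"""


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Normalize headers, coerce numeric columns, drop duplicates and unusable rows."""
    df = raw.copy()
    # "GDP per capita " -> "gdp_per_capita"
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    df["country"] = df["country"].str.strip().str.title()
    for col in df.columns.drop("country"):
        # Remove thousands separators; placeholders such as ".." become NaN.
        without_commas = df[col].astype(str).str.replace(",", "", regex=False)
        df[col] = pd.to_numeric(without_commas, errors="coerce")
    df = df.drop_duplicates().dropna(subset=["gdp_per_capita"])
    return df.reset_index(drop=True)


# Read everything as text first so the cleaning rules stay explicit.
raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
cleaned = clean(raw)

summary = cleaned.describe()            # summary statistics per numeric column
csv_text = cleaned.to_csv(index=False)  # analytics-ready CSV as a string

print(cleaned)
print(summary)
```

In a real run, the final step would pass a file path to `to_csv` (for example, somewhere under `output/`) instead of returning a string.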
- Python (Pandas, NumPy, Matplotlib)
- Bash / Shell scripting
- ETL pipeline design (Extract → Transform → Load)
- Data cleaning & preprocessing
- Exploratory Data Analysis (EDA)
- Terminal-based automation
- Analytics-ready data preparation for Tableau
terminal_practice_project1/
│
├── run_analysis.sh      # One-command pipeline execution
├── etl_pipeline.py      # Data extraction, cleaning, transformation
├── analysis.py          # EDA and visualization logic
├── output/              # Cleaned datasets and plots
├── README.md
└── .gitignore
Clone the repository and run the pipeline from the terminal:
git clone https://github.com/aswathappaswetha-tech/terminal_practice_project1.git
cd terminal_practice_project1
bash run_analysis.sh
This single command:
- Loads the raw dataset
- Cleans and transforms the data
- Performs analysis
- Saves cleaned CSV files and visualizations
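The run-in-order behaviour described above can be sketched in Python as a small driver that runs each step and stops at the first failure. This is only an illustration of the idea: the actual contents of `run_analysis.sh` are not shown here, and the `run_steps` helper and the stand-in commands are hypothetical.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, command) pairs in order and stop at the first failing step.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"Step '{name}' failed with exit code {result.returncode}",
                  file=sys.stderr)
            break
        completed.append(name)
    return completed


# Stand-in commands so the sketch runs anywhere. In the repository, the steps
# would presumably call etl_pipeline.py and analysis.py instead (assumption).
demo_steps = [
    ("etl", [sys.executable, "-c", "print('etl step finished')"]),
    ("analysis", [sys.executable, "-c", "print('analysis step finished')"]),
]

completed = run_steps(demo_steps)
print("Completed steps:", completed)
```

Stopping at the first non-zero exit code keeps the analysis step from running against stale or partial output when the ETL step fails.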
- Cleaned, analytics-ready CSV files
- Summary statistics of socio-economic indicators
- Visualizations such as:
  - GDP vs Life Expectancy
  - Country-level comparisons
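A GDP vs Life Expectancy plot of the kind listed above could be produced along these lines with Matplotlib. The data values, column names, and the `plot_gdp_vs_life_expectancy` function are placeholders for illustration, not the logic in `analysis.py`.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, suitable for terminal-only runs
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample values; the real pipeline would read the cleaned CSV instead.
df = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D"],
        "gdp_per_capita": [1200.0, 5400.0, 18000.0, 46000.0],
        "life_expectancy": [61.5, 68.2, 75.1, 81.0],
    }
)


def plot_gdp_vs_life_expectancy(data: pd.DataFrame):
    """Scatter plot of GDP per capita (log scale) against life expectancy."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.scatter(data["gdp_per_capita"], data["life_expectancy"])
    ax.set_xscale("log")  # GDP spans orders of magnitude across countries
    for _, row in data.iterrows():
        ax.annotate(
            row["country"],
            (row["gdp_per_capita"], row["life_expectancy"]),
            textcoords="offset points",
            xytext=(4, 4),
        )
    ax.set_xlabel("GDP per capita (log scale)")
    ax.set_ylabel("Life expectancy (years)")
    ax.set_title("GDP vs Life Expectancy")
    fig.tight_layout()
    return fig


fig = plot_gdp_vs_life_expectancy(df)

# Render to an in-memory PNG; a file path such as one under output/ works too.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)
png_bytes = buffer.getvalue()
```

The log scale on the x-axis is a design choice for this sketch, since GDP per capita tends to vary far more widely than life expectancy.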
An interactive Tableau dashboard was built using the cleaned output from this pipeline.
View the dashboard here:
Global Development & Health Insights Dashboard
- GDP vs Life Expectancy analysis
- Child Mortality comparison across countries
- Health & economic development insights
- Interactive filtering by country and indicators
The dashboard uses CSV outputs generated directly from this ETL pipeline.
This project reflects how data analysts work in real environments:
- using the terminal,
- automating workflows,
- and preparing data for decision-making tools.
It demonstrates my ability to move from raw data → insights → business-ready outputs.
Swetha Gowribidanur Aswathappa
MSc Data Analytics | Python | SQL | ETL | Data Visualization
Berlin, Germany
- Add logging and error handling
- Parameterize input datasets
- Extend analysis with clustering or regression models
- Connect pipeline directly to Tableau extracts
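The first two items above (logging with error handling, and parameterized inputs) might look roughly like the following, using only the standard library. The flag names `--input`, `--output-dir`, and `--log-level` are suggestions for this sketch; the current scripts do not necessarily accept any arguments.

```python
import argparse
import logging
from pathlib import Path

logger = logging.getLogger("etl_pipeline")


def parse_args(argv=None):
    """Parse command-line options; all flag names here are hypothetical."""
    parser = argparse.ArgumentParser(description="Run the ETL pipeline on a dataset.")
    parser.add_argument("--input", type=Path, required=True,
                        help="Path to the raw CSV dataset")
    parser.add_argument("--output-dir", type=Path, default=Path("output"),
                        help="Directory for cleaned CSVs and plots")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"])
    return parser.parse_args(argv)


def main(argv=None) -> int:
    """Return 0 on success and 1 when the input dataset is missing."""
    args = parse_args(argv)
    logging.basicConfig(level=args.log_level,
                        format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    if not args.input.is_file():
        logger.error("Input dataset not found: %s", args.input)
        return 1
    args.output_dir.mkdir(parents=True, exist_ok=True)
    logger.info("Processing %s into %s", args.input, args.output_dir)
    # The extract, transform, and load steps would be called here.
    return 0


# Example call with a path that does not exist, to show the error path.
exit_code = main(["--input", "no_such_dataset.csv"])
print("Exit code:", exit_code)
```

Returning an exit code rather than raising lets a wrapper such as `run_analysis.sh` detect the failure and stop before the analysis step.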