Intern: Omokhoa Oshose Tosayoname
Intern ID: CA/DF1/71570
Duration: 20th May 2026 – 20th June 2026
This project analyses unemployment trends across Indian states from 2019 to 2020, with a focused investigation into the impact of the Covid-19 pandemic and the nationwide lockdown (25 March 2020) on employment. The analysis covers rural vs urban unemployment dynamics, state-level vulnerability, geographic zone comparisons, and the relationship between unemployment and labour participation rates.
Business/Policy Question: How did the Covid-19 lockdown affect unemployment across India's states and regions, and which areas were most vulnerable?
Data Loading & Cleaning --> EDA --> Time Series Analysis --> Covid-19 Impact Analysis --> Regional Analysis --> Policy Insights
CodeAlpha_UnemploymentAnalysis/
├── data/
│ ├── Unemployment_in_India.csv # Dataset 1: 2019-2020, Rural/Urban
│ ├── Unemployment_Rate_upto_11_2020.csv # Dataset 2: 2020 with geo-coordinates
│ └── *.png # All generated visualisations
├── notebooks/
│ └── unemployment_analysis.ipynb # Main notebook (fully executed)
├── requirements.txt
└── README.md
| Dataset | Records | Period | Key Feature |
|---|---|---|---|
| Unemployment_in_India.csv | 740 | 2019–2020 | Rural/Urban area breakdown, 28 states |
| Unemployment_Rate_upto_11_2020.csv | 267 | Jan–Nov 2020 | Geographic coordinates, zone classification |
Features: Region, Date, Unemployment Rate (%), Estimated Employed, Labour Participation Rate (%), Area (Rural/Urban), Geographic Zone, Coordinates.
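A minimal sketch of the loading and cleaning step, shown on a tiny inline CSV rather than the real files. It assumes the raw headers and values may carry stray leading spaces, that the file may contain fully empty rows, and that dates are stored day-first as `DD-MM-YYYY`; the column names and rows below are illustrative and should be adjusted to match the actual CSVs.

```python
import io

import pandas as pd

# Toy stand-in for the raw CSV (illustrative rows, not real data).
raw = io.StringIO(
    "Region, Date, Estimated Unemployment Rate (%),Area\n"
    "State A, 31-01-2020, 5.5,Rural\n"
    "State A, 29-02-2020, 6.0,Rural\n"
    ",,,\n"
)

# skipinitialspace drops the blank that follows each comma.
df = pd.read_csv(raw, skipinitialspace=True)

# Normalise header whitespace and drop rows where every field is empty.
df.columns = df.columns.str.strip()
df = df.dropna(how="all")

# Parse the day-first date strings into proper timestamps.
df["Date"] = pd.to_datetime(df["Date"].str.strip(), format="%d-%m-%Y")

print(df.dtypes)
```

For the real data, the `io.StringIO` object would be replaced with a path such as `data/Unemployment_in_India.csv`.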
| Phase | Avg Unemployment Rate |
|---|---|
| Pre-Covid | 9.23% |
| Lockdown (Mar–Jun 2020) | 16.74% |
| Recovery (Jul–Nov 2020) | 9.22% |
| National Peak | 23.24% (May 2020) |
Unemployment surged 7.5 percentage points from Pre-Covid to Lockdown.
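The phase averages above can be reproduced along these lines. This is a sketch under stated assumptions: the phase boundaries are taken at month level (anything before March 2020 is Pre-Covid, March to June 2020 is Lockdown, later months are Recovery), the helper name `covid_phase` is hypothetical, and the rows are toy values rather than figures from the dataset.

```python
from datetime import date
from statistics import mean


def covid_phase(d: date) -> str:
    """Label a monthly observation with its phase (assumed month-level cut-offs)."""
    if d < date(2020, 3, 1):
        return "Pre-Covid"
    if d <= date(2020, 6, 30):
        return "Lockdown"
    return "Recovery"


# Toy (date, unemployment rate %) rows, illustrative only.
rows = [
    (date(2020, 1, 31), 8.0),
    (date(2020, 2, 29), 10.0),
    (date(2020, 4, 30), 20.0),
    (date(2020, 5, 31), 24.0),
    (date(2020, 8, 31), 9.0),
]

# Group the rates by phase, then average each group.
by_phase = {}
for d, rate in rows:
    by_phase.setdefault(covid_phase(d), []).append(rate)
phase_means = {phase: mean(rates) for phase, rates in by_phase.items()}

# The surge is the difference of two percentages, so it is in percentage points.
surge_pp = phase_means["Lockdown"] - phase_means["Pre-Covid"]
print(phase_means, surge_pp)
```

With the reported averages, the same subtraction gives 16.74 - 9.23 = 7.51, which rounds to the 7.5 pp quoted above.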
Most affected state: Puducherry (+37.4 pp increase)
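One way to rank states by lockdown impact is to average each state's rate per phase and subtract. The sketch below assumes a `Phase` column has already been derived and uses placeholder state names and made-up values; the column names are assumptions, not necessarily those used in the notebook.

```python
import pandas as pd

# Toy data: one pre-Covid and one lockdown average per placeholder state.
df = pd.DataFrame({
    "Region": ["State A", "State A", "State B", "State B", "State C", "State C"],
    "Phase": ["Pre-Covid", "Lockdown"] * 3,
    "Unemployment Rate (%)": [5.0, 25.0, 10.0, 12.0, 8.0, 7.0],
})

# Mean rate per state and phase, with phases as columns.
phase_avg = df.pivot_table(
    index="Region",
    columns="Phase",
    values="Unemployment Rate (%)",
    aggfunc="mean",
)

# Change in percentage points, largest increase first.
change_pp = (phase_avg["Lockdown"] - phase_avg["Pre-Covid"]).sort_values(ascending=False)
most_affected = change_pp.index[0]
print(change_pp)
```

A negative value in `change_pp` marks a state whose average rate fell during the lockdown phase.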
The lockdown phase shows a clearly right-shifted distribution with extreme outlier values representing the worst-affected states in April and May 2020.
Urban unemployment was higher and more volatile than rural unemployment throughout the observation period, reflecting greater exposure to formal sector disruptions during lockdown.
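"Higher and more volatile" can be checked by comparing the mean and standard deviation of the rate per area. The following is a sketch on toy numbers, with assumed column names.

```python
import pandas as pd

# Toy observations, illustrative only.
df = pd.DataFrame({
    "Area": ["Rural"] * 3 + ["Urban"] * 3,
    "Unemployment Rate (%)": [6.0, 8.0, 10.0, 5.0, 15.0, 25.0],
})

# Mean captures the level, sample standard deviation captures the volatility.
summary = df.groupby("Area")["Unemployment Rate (%)"].agg(["mean", "std"])
print(summary)
```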
The unemployment rate spiked sharply after the lockdown began on 25 March 2020, peaking at 23.24% in May 2020. Recovery was rapid after Unlock Phase 1 in June 2020.
Pre-pandemic unemployment was relatively stable around 9%. The Covid-19 shock stands out as a clear structural break in the time series.
Urban areas showed a sharper and more prolonged spike during the lockdown. Rural unemployment recovered faster, likely driven by continued agricultural activity.
The lockdown phase shows both a much higher mean (16.74% against 9.23% pre-Covid) and a much wider spread in unemployment rates, reflecting uneven impact across states.
Most states saw large increases in unemployment. A small number of predominantly rural or agricultural states showed resilience or even slight decreases.
Puducherry, Jharkhand, and Bihar were among the hardest hit. States with strong agricultural or informal economies showed more resilience.
Significant variation exists across states. States like Haryana and Tripura showed the highest average unemployment rates throughout 2020.
All zones spiked sharply during the lockdown, but Central and North zones showed the highest peak values and more erratic recovery patterns.
The heatmap reveals the April–May 2020 lockdown period (deep red columns) as a clear shock across almost all states, with visible recovery by July 2020.
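A state-by-month heatmap of this kind is typically drawn from a pivoted table. The sketch below builds such a table from toy rows; the column names and the `Month` helper column are assumptions, and the plotting call is left as a comment so the snippet runs without a plotting backend.

```python
import pandas as pd

# Toy rows, illustrative only.
df = pd.DataFrame({
    "Region": ["State A", "State A", "State B", "State B"],
    "Date": pd.to_datetime(["2020-04-30", "2020-05-31", "2020-04-30", "2020-05-31"]),
    "Unemployment Rate (%)": [20.0, 22.0, 10.0, 12.0],
})

# Year-month labels sort chronologically as plain strings.
df["Month"] = df["Date"].dt.strftime("%Y-%m")

# Rows are states, columns are months, cells are mean unemployment rates.
heat = df.pivot_table(
    index="Region",
    columns="Month",
    values="Unemployment Rate (%)",
    aggfunc="mean",
)
print(heat)

# Optional plot, if seaborn is installed:
# import seaborn as sns
# sns.heatmap(heat, cmap="Reds")
```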
As unemployment surged, the Labour Participation Rate dropped simultaneously, indicating many workers stopped seeking employment entirely during the lockdown — a discouraged worker effect.
The lockdown phase occupies a distinct cluster with high unemployment and low participation, confirming the structural shock to the labour market.
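The inverse relationship between unemployment and labour participation can be quantified with a Pearson correlation coefficient. This is a minimal pure-Python sketch; the `pearson` helper and the numbers are illustrative and not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy monthly values: unemployment rate (%) and labour participation rate (%).
unemployment = [8.0, 10.0, 20.0, 24.0, 9.0]
participation = [43.0, 42.0, 38.0, 36.0, 42.0]

r = pearson(unemployment, participation)
print(round(r, 3))
```

A coefficient close to -1 would be consistent with the discouraged worker pattern described above, although correlation alone does not establish the mechanism.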
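- India's unemployment rate rose from about 9% before Covid to a peak of 23.24% in May 2020, roughly 14 percentage points above the pre-Covid average of 9.23%. Averaged over the lockdown phase, the rate was 16.74%, a 7.5 percentage point increase driven almost entirely by the lockdown.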
- Urban workers were disproportionately affected compared to rural workers, reflecting greater exposure to the formal and services sectors.
- The discouraged worker effect was significant: as unemployment rose, labour participation fell sharply, so the unemployment rate alone understates the true scale of employment disruption.
- Recovery was rapid once Unlock Phase 1 began in June 2020, with rates returning to near pre-pandemic levels by August 2020.
- Puducherry saw the most extreme lockdown impact (+37.4 pp), while agricultural states showed relative resilience.
- Policy interventions targeting urban informal workers and migrant labour would likely have had the greatest impact during the lockdown period.
- Clone the repository:

      git clone https://github.com/Tosa9/CodeAlpha_UnemploymentAnalysis.git
      cd CodeAlpha_UnemploymentAnalysis

- Install dependencies:

      pip install -r requirements.txt

- Launch the notebook:

      jupyter notebook notebooks/unemployment_analysis.ipynb
Unemployment in India — Kaggle
CodeAlpha Data Science Internship | Task 2
#CodeAlpha #DataScience #UnemploymentAnalysis #Covid19 #Python #EDA













