A comprehensive data analysis project exploring Netflix's content catalog using Python, focusing on various trends and patterns in movies and TV shows.
This project analyzes Netflix's content library using data visualization and statistical analysis to uncover insights about movie durations, content distribution, and production trends. The analysis is presented through an interactive Streamlit dashboard (https://netflixmoviesanalysis.streamlit.app/).
- Interactive filters (year range, genre)
- Distribution plots for movie durations
- Genre distribution bar chart
- Top countries producing content with percentage labels
- Boxplots of duration by genre
- Scatter plot showing the relationship between release year and cast size
- Word cloud of common words in titles
- Data table for exploration
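The year-range and genre filters listed above can be sketched with plain pandas. This is a minimal illustration, not the code in `app.py`: the column names `release_year` and `listed_in` are assumptions based on the commonly distributed Netflix titles dataset, and `filter_catalog` is a hypothetical helper.

```python
import pandas as pd


def filter_catalog(df, year_range, genre=None):
    """Keep rows inside an inclusive year range and, optionally, matching a genre.

    Assumes hypothetical columns `release_year` (int) and `listed_in`
    (comma-separated genre string); adjust to the real schema of netflix.csv.
    """
    start, end = year_range
    mask = df["release_year"].between(start, end)  # inclusive on both ends
    if genre:
        # Literal, case-insensitive substring match; rows with no genre never match.
        mask &= df["listed_in"].str.contains(genre, case=False, na=False, regex=False)
    return df[mask]


# Tiny made-up sample, for illustration only.
sample = pd.DataFrame(
    {
        "title": ["A", "B", "C"],
        "release_year": [2015, 2018, 2021],
        "listed_in": ["Dramas, Comedies", "Documentaries", None],
    }
)

filtered = filter_catalog(sample, (2016, 2021), genre="documentaries")
print(filtered["title"].tolist())
```

In a Streamlit app, the `year_range` tuple would typically come from a range slider and `genre` from a select box, with the filtered frame feeding every chart and the data table.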
- Create and activate a virtual environment (recommended). On Windows (PowerShell):
  `python -m venv .venv; .\.venv\Scripts\Activate.ps1`
- Install dependencies:
  `python -m pip install -r requirements.txt`
- Start the Streamlit app:
  `streamlit run app.py`
- Open the browser at:
  http://localhost:8501
This project uses Python 3.8+ and the following packages (see requirements.txt):
- streamlit
- pandas
- matplotlib
- numpy
- seaborn
- wordcloud
- `app.py`: main Streamlit application
- `netflix.csv`: dataset used by the app (should sit in the repo root)
- `assets/`: contains `netflix_logo.png` and optional Lottie JSON
- If the app fails to start due to missing packages, run the install command above again.
- If images don't load, confirm the `assets/` directory exists and contains `netflix_logo.png`.
- The repository previously contained an unrelated README; this file now documents the Netflix Movies Analysis project.
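A quick way to rule out missing files before launching the app is a small preflight check. The sketch below is not part of the repository. It assumes the layout described in this README (`app.py`, `netflix.csv`, and `assets/netflix_logo.png` in the repo root), and `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# Paths taken from this README's file list; adjust if your checkout differs.
REQUIRED = ["app.py", "netflix.csv", "assets/netflix_logo.png"]


def missing_files(root="."):
    """Return the required paths (relative to root) that are not regular files."""
    base = Path(root)
    return [rel for rel in REQUIRED if not (base / rel).is_file()]


missing = missing_files()
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files are present.")
```

Run it from the repo root; any path it reports is one to restore before running `streamlit run app.py`.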
Contributions are welcome: open an issue or submit a PR.
For questions: hemant0hack@gmail.com