This project focuses on analyzing Netflix's movie and TV show catalog using MySQL. By writing SQL queries, we explore the structure and trends within the platform’s content. The project answers real-world business questions to gain actionable insights.
• Compare the number of Movies vs TV Shows on Netflix.
• Find the most common content ratings.
• Analyze content by release year, country, and duration.
• Filter and classify shows based on keywords, genres, and other custom conditions.
• Generate summaries and trends for content growth, popular actors, and more.
Source: Kaggle
• Dataset Name: Netflix Movies and TV Shows
• Dataset Link: Movie Dataset
CREATE TABLE netflix_data (
show_id VARCHAR(20),
type VARCHAR(20),
title TEXT,
director TEXT,
cast TEXT,
country TEXT,
date_added VARCHAR(50),
release_year INT,
rating VARCHAR(20),
duration VARCHAR(20),
listed_in TEXT,
description TEXT
);select type, count(*) as total_content
from netflix_data
group by type;SELECT rating, COUNT(*) AS total
FROM netflix_data
GROUP BY rating
ORDER BY total DESC
LIMIT 1;SELECT *
FROM netflix_data
WHERE type = 'Movie' AND release_year = 2020;SELECT country, COUNT(*) AS total
FROM netflix_data
GROUP BY country
ORDER BY total DESC
LIMIT 5;select * from netflix_data
where
type = 'Movie'
and
duration = (select max(duration) from netflix_data);SELECT *
FROM netflix_data
WHERE STR_TO_DATE(date_added, '%M %d, %Y') >= CURDATE() - INTERVAL 5 YEAR;SELECT *
FROM netflix_data
WHERE director = 'Rajiv Chilaka'; SELECT *
FROM netflix_data
WHERE type = 'TV Show'
AND CAST(SUBSTRING_INDEX(duration, ' ', 1) AS UNSIGNED) > 5;SELECT listed_in AS genre, COUNT(*) AS total
FROM netflix_data
GROUP BY listed_in
ORDER BY total DESC;SELECT release_year, COUNT(*) AS total
FROM netflix_data
WHERE country LIKE '%India%'
GROUP BY release_year
ORDER BY total DESC
LIMIT 5;SELECT *
FROM netflix_data
WHERE type = 'Movie' AND listed_in LIKE '%Documentary%';SELECT *
FROM netflix_data
WHERE director IS NULL OR director = '';SELECT COUNT(*) AS total
FROM netflix_data
WHERE type = 'Movie'
AND `cast` LIKE '%Salman Khan%'
AND release_year >= YEAR(CURDATE()) - 10;SELECT actor, COUNT(*) AS total_movies
FROM (
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(`cast`, ',', n.n), ',', -1)) AS actor
FROM netflix_data
JOIN (
SELECT a.N + b.N * 10 + 1 AS n
FROM (SELECT 0 AS N UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) a,
(SELECT 0 AS N UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) b
) n
ON n.n <= 1 + LENGTH(`cast`) - LENGTH(REPLACE(`cast`, ',', ''))
WHERE type = 'Movie' AND country LIKE '%India%'
) AS actor_list
GROUP BY actor
ORDER BY total_movies DESC
LIMIT 10;15. Categorize the content based on the presence of the keywords 'kill' and 'violence' in the description field. Label content containing these keywords as 'Bad' and all other content as 'Good'. Count how many items fall into each category.
SELECT
CASE
WHEN description LIKE '%kill%' OR description LIKE '%violence%' THEN 'Bad'
ELSE 'Good'
END AS category,
COUNT(*) AS total
FROM netflix_data
GROUP BY category;SELECT release_year, COUNT(*) AS total_titles
FROM netflix_data
WHERE release_year IS NOT NULL
GROUP BY release_year
ORDER BY release_year ASC;SELECT country, COUNT(*) AS total_titles
FROM netflix_data
WHERE country IS NOT NULL
GROUP BY country
ORDER BY total_titles DESC
LIMIT 10;SELECT release_year, COUNT(*) AS total_titles
FROM netflix_data
GROUP BY release_year
ORDER BY total_titles DESC
LIMIT 10;SELECT title, COUNT(*) AS duplicate_count
FROM netflix_data
GROUP BY title
HAVING COUNT(*) > 1
ORDER BY duplicate_count DESC;SELECT director, COUNT(*) AS total_titles
FROM netflix_data
WHERE director IS NOT NULL
GROUP BY director
ORDER BY total_titles DESC
LIMIT 10;• Content Variety: Netflix offers a wide mix of movies and TV shows, covering many genres and ratings.
• Popular Ratings: The most frequent ratings reveal who the main audience is.
• Regional Trends: India and other top countries show where Netflix focuses its content.
• Content Types: Grouping shows by keywords helps us understand the themes and nature of the library.
Overall, this analysis gives a clear picture of Netflix’s content and can guide future choices about what to add or promote.
This project showcases my MySQL expertise through comprehensive analysis of real-world Netflix data, highlighting my ability to extract meaningful insights.
• Connect with me on LinkedIn
