Welcome to the Data Warehousing and Business Intelligence for IMDB Datasets GitHub repository!
- Objective: The main objective of this project was to design and implement a data warehouse for IMDB datasets, enabling efficient storage, retrieval, and analysis of data for business intelligence purposes.
- Dataset: The project involved working with the IMDB dataset, which contains structured and unstructured data related to movies, TV shows, and people involved in the entertainment industry.
- Tools Used: Alteryx was utilized for extracting structured and unstructured data from different sources and ingesting more than 75 million records into a staging schema. Additionally, Tableau and Power BI were used for creating dashboards and visualizations for business analysis.
- Data Warehouse Design: Designed and implemented a data warehouse architecture tailored to the needs of the IMDB dataset, ensuring efficient storage and retrieval of data.
- Data Extraction and Loading: Utilized Alteryx to extract data from various sources and load it into the staging schema, processing more than 75 million records.
- Data Transformation: Implemented Slowly Changing Dimensions (SCD) and rejection logic to keep the warehouse data consistent and to preserve its integrity.
- Business Intelligence: Developed BI schemas for Movies, TV shows, and People, and created over 15 dashboards using Tableau and Power BI for in-depth business analysis and visualization.
- Documentation: Contains documentation related to the project, including design specifications, data dictionaries, and user manuals.
- Scripts: Includes scripts used for data extraction, transformation, and loading processes.
- Dashboards: Contains files for the dashboards created using Tableau and Power BI for visualizing and analyzing the IMDB dataset.
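
The Slowly Changing Dimensions and rejection logic mentioned under Data Transformation can be illustrated with a small sketch. This is not the project's Alteryx workflow; it is a minimal, standard-library Python illustration of the SCD Type 2 pattern, which is assumed here since the SCD type is not stated above. The `TitleDimRow` class, the field names, the `is_valid` rule, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

HIGH_DATE = date(9999, 12, 31)  # conventional "open-ended" end date


@dataclass
class TitleDimRow:
    """One version of a title in a hypothetical title dimension."""
    title_id: str               # natural key from the source data (hypothetical name)
    primary_title: str
    start_year: Optional[int]
    valid_from: date
    valid_to: date = HIGH_DATE
    is_current: bool = True


def is_valid(record: dict) -> bool:
    """Assumed rejection rule: a record needs a non-empty key and title."""
    return bool(record.get("title_id")) and bool(record.get("primary_title"))


def apply_scd2(dim: list, staged: list, load_date: date) -> list:
    """Apply staged records to the dimension in place (SCD Type 2).

    Returns the records that failed validation, so they can be written
    to a reject table instead of being silently dropped.
    """
    rejected = []
    current = {row.title_id: row for row in dim if row.is_current}

    for record in staged:
        if not is_valid(record):
            rejected.append(record)
            continue

        existing = current.get(record["title_id"])
        if existing is not None:
            unchanged = (
                existing.primary_title == record["primary_title"]
                and existing.start_year == record.get("start_year")
            )
            if unchanged:
                continue  # nothing to version
            # Close the old version rather than overwriting it.
            existing.valid_to = load_date
            existing.is_current = False

        new_row = TitleDimRow(
            title_id=record["title_id"],
            primary_title=record["primary_title"],
            start_year=record.get("start_year"),
            valid_from=load_date,
        )
        dim.append(new_row)
        current[new_row.title_id] = new_row

    return rejected
```

Calling `apply_scd2` once per load keeps every historical version of a row, with exactly one row per key flagged as current. Records that fail `is_valid` are returned to the caller, which mirrors the idea of routing bad rows to a reject table for later inspection.
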
To utilize the resources and materials provided in this repository:
- Clone this repository to your local machine.
- Explore the documentation, scripts, and dashboards to understand the project's architecture, processes, and insights derived from the IMDB dataset.
If you have any questions, suggestions, or feedback regarding this project, feel free to contact the project lead or contributors.
Thank you for your interest in our Data Warehousing and Business Intelligence project for IMDB Datasets! We hope you find the resources valuable for your own data analysis work.