Click here to view the interactive Power BI dashboard
A comprehensive data analysis and interactive dashboard project examining 119,390 hotel booking records from two Portuguese hotels (Resort and City) over a 26-month period (July 2015 - August 2017). This project validates three key business hypotheses and provides actionable recommendations to improve revenue and operational efficiency.
This project analyses hotel booking data containting guests demographics, reservation details and cancellation details from two portugues hotels during July 2015 to August 2017. The dataset has been cleaned, transformed with new calculated columns that have been created to specifically analyse booking lead times, guest type groups and seasonal patterns. This analysis focuses on three key business hypotheses:
Early booking problem: Bookings made far in advance cancel 2.4 more times more often than last minute bookings.
Hidden High value Group: Family bookings generate 35% more revenue per stay than single couple bookings.
The seasonal imbalance: We make 72% of our money in just three months.
The purpose of this engagment is to use data analysis to support two leadinh hotel companies in making informed, evidence based desicions that improve revenue performance and operational efficiency. As a Data Analyst my role is to identify patterns, trends and actionable insights that will help the business determine the most effective approach to meeting the requirements:
- Reduce revenue loss resulting from booking cancellations by a minimum of 10%.Analyse customer behviour across booking channels,customer group and lead times to identify key drivers of cancellations and recommend appropriate stratergies.
- Increase revenue generated by family and group bookings.Examine customer groups,booking behaviour, length of stay and price sensitivity to identify oppertunities to attract, retain and grow the family and group customer segments.
- Optimise seasonal staff levels. Assess historical occupency rates, demand patterns and staffing utalisation to ensure levels are align with seasonal demands, improving costs efficiency while maintaining service quality.
Phase 1 :
- Extract Data
- Clean Data
- Create calculated columns
- Prepare Data for Power BI importation and transformation
Phase 2:
- Set up Power BI
- Create measuring tools.
- Build 4 core visualisations (Monthly booking trends, cancellation anaylsis by time, Revenue by guest type, Hotel comparisons)
- Build a score card for key reults.
Phase 3:
- Test Hypotheses
- Document findings.
- Forumlate actionable recomendations.
- Add colour scehme for formatting.
- Create dashboard.
- Export Dashboard for documentation.
Dataset: Hotel Booking Demand Dataset from Kaggle. https://www.kaggle.com/datasets/jessemostipak/hotel-booking-demand/discussion?sort=hotness
Scope: 119,390 booking records for a Resort Hotel and a City Hotel.
Period: July 2015 to August 2017.
Contents: Guest demographics, reservation details, booking status, and pricing.
- 90+ day bookings: 44% cancellation rate
- 0-7 day bookings: 18% cancellation rate
- Risk multiplier: 2.4x higher for early bookings
- Business implication: Early bookings cancel significantly more, requiring policy intervention
- Families spend: $420 average per stay
- Couples spend: $310 average per stay
- Revenue premium: $110 (35% higher)
- Market opportunity: Families represent only 15% of current bookings
- Resort Hotel: 72% of revenue in summer (June-August)
- City Hotel: 48% of revenue in same period
- Seasonal gap: 24 percentage point difference
- Operational impact: Extreme peak-trough cycles for Resort hotels
- Sampling: 5,000 random bookings (4.2% of 119,390 total)
- Feature Engineering: Created calculated columns:
lead_time_category(0-7, 8-30, 31-90, 90+ days)guest_type(Family, Couple, Solo, Group)estimated_revenue(ADR × total nights)
- Tools Used: Python (pandas, numpy), Jupyter Notebook
- Used descriptive statistics for hypothesis testing
- Calculated confidence intervals for key metrics
- Verified sample representativeness against full dataset
- Cancellation Analysis:
- Bar chart: Cancellation rate by lead time category
- Trend line: Increasing risk with longer lead times
- Revenue Segmentation:
- Bar chart: Average revenue by guest type
- Pie chart: Booking share distribution
- Seasonal Patterns:
- Line chart: Monthly bookings comparison (City vs Resort)
- Bar chart: Summer concentration percentage
(See analysis_results.png for visualization outputs)
Page 1: Strategy Dashboard (Executive View)
- KPI Cards: Total Bookings (5969k), Total Revenue (€XXM), Cancellation Rate (XX%)
- Trend Visualization: Monthly booking volume comparison - City vs Resort Hotel
- Hypothesis Validation Cards: Three visual confirmations with direct business actions
- Purpose: 30-second leadership overview.
Page 2: Cancellation Risk Deep Dive
- Primary Visual: Cancellation rate by lead time category (44% at 90+ days vs 18% at 0-7 days)
- Secondary Visual: Cancellation rate by market segment (Groups: 64%, Online TA: 37%)
- Geographic Context: Top 10 booking countries (Portugal: 41%)
- Insight Layer: Critical findings with quantified recommendations
- Purpose: Identify specific intervention points
Page 3: Revenue Strategy & Segmentation
- Value Proof: Average revenue by guest type (Families: €420, Couples: €310 - 35% premium)
- Market Opportunity: Current booking mix (Families: 15% of bookings)
- Seasonality Analysis: Summer revenue concentration (Resort: 72%, City: 48%)
- Strategic Recommendations: Quantified targets with projected impact (+€5.0M opportunity)
- Purpose: Growth strategy formulation
Data Modeling:
- Created separate measures table for cleaner organization
- Established calendar table for time intelligence (not fully utilized in v1.0)
- Set default summarization to "None" for all ID/lookup fields
- Performance Optimization:
- Sampled 5,000 rows (4.2%) during development for faster iteration
- Verified sample representativeness against full dataset
- Removed unused columns from report view
UI/UX Choices:
- Three-page structure matches three business hypotheses
- Action-oriented insight cards instead of raw metrics
- Consistent typography (Segoe UI, 12pt body, 20pt headers)
- Removed pie charts entirely (hard to read, replaced with bar/column charts)
- Strategic color coding guides attention to what matters
Validation & Limitations
- Sample Validation:
- 5,000 row sample (4.2% of 119,390 total)
- Verified distribution across hotels, guest types, and seasons
- Confidence intervals calculated for key metrics
- Limitations:
- 2015-2017 data; post-pandemic booking behavior may differ
- Revenue calculated from ADR × nights; actual transaction data not available
- Staffing optimization based on occupancy patterns; internal cost data not available Next Analysis Recommendations:
- Integrate competitor pricing data for family segment price sensitivity analysis
- A/B test deposit policies on 90+ day bookings
- Build predictive model for cancellation risk at booking moment
- Data Processing: Python 3.8 (pandas, numpy) in Jupyter Notebook
- Data Visualization: Power BI Desktop
- Version Control: Manual backup system (v1, v2, FINAL) - Git planned for v2.0
- Documentation: Markdown / README.
- DeepSeek AI Assistant Real-time consultation on dashboard layout, DAX optimisation, color strategy, and README documentation.
KPI cards, monthly booking trends, and hypothesis validation cards
Primary visual displays cancellation rate by lead time category, secondary visuals show cancellation rate by market segment and top booking countries, providing actionable intervention points
44% cancellation rate at 90+ days, Groups segment at 64% cancellation
Family €420 vs Couple €310 (35% premium), Summer revenue concentration (72% vs 48%)
This project transformed raw hotel booking data into actionable business intelligence through:
- Hypothesis-driven analysis: Starting with clear business questions, not just "what's interesting?"
- Iterative design: Three major dashboard revisions based on clarity, not complexity
- Action-first mindset: Every insight paired with "what do we do about it?"
- User empathy: Designing for an executive with 30 seconds, not an analyst with 30 minutes
- Biggest lesson: A great insight hidden in a cluttered dashboard is the same as no insight at all. Visualization is not decoration, it's communication.
- Biggest lesson: A great insight hidden in a cluttered dashboard is the same as no insight at all. Visualization is not decoration, it's communication!
Technical:
- Power BI: Dashboard design, DAX measures, data modeling, visual formatting, mobile layout
- Data Cleaning: Python (pandas), handling nulls, feature engineering, sampling validation
- Data Analysis: Hypothesis testing, descriptive statistics, pattern identification
- Git: Version control basics, README documentation, project publishing
Business:
- Requirements Translation: From business questions → technical analysis → actionable insights
- Executive Communication: Designing for 30-second decision-making, not data exploration
- Storytelling: Structuring insights around narrative (problem → evidence → recommendation)
- ROI Quantification: Translating data findings into revenue impact (£5.0M opportunity identified)
Soft:
- Iterative Design: Three major dashboard revisions based on feedback
- Self-Management: Project planning, deadline adherence, independent problem-solving
- Resilience: Transformed frustration and clutter into clarity and confidence
- Attention to Detail: Consistent formatting, color strategy, typography Portfolio Ready: This project is now interview-ready and demonstrates end-to-end analytics capability.
Dataset:
- Source: Hotel Booking Demand Dataset on Kaggle
- Creator: Nuno Antonio, Ana Almeida, Luis Nunes
- Citation: Antonio, N., de Almeida, A., & Nunes, L. (2019). "Hotel booking demand datasets." Data in Brief, 22, 41-49.
- License: CC0: Public Domain
Inspiration & Learning:
- Power BI community forums (DAX patterns)
- SQLBI.com (measure technique references)
- Storytelling with Data (Cole Nussbaumer Knaflic) - dashboard layout principles
Tools:
- Analysis completed using Power BI Desktop (free version)
- Data cleaning: Python (pandas, numpy) in Jupyter Notebook
- Documentation: Markdown / GitHub
Technical Guidance & Support:
- Special thanks to an AI assistant from DeepSeek for real-time dashboard design consultation and DAX troubleshooting.
- This included: color strategy recommendations, layout restructuring, README documentation refinement, and technical deployment support.
This project was developed as a portfolio piece to demonstrate data analysis, visualization, and business storytelling capabilities.
I extend my sincere gratitude to Code Institute for providing comprehensive data analytics education that formed the foundation of this project. The skills and methodologies learned throughout the course were instrumental in developing this data visualization analysis.
Special thanks to my tutors, Vasilica Pavaloi and Mark Briscoe, for their patient guidance and invaluable support in addressing numerous technical questions and challenges encountered during development.
I also wish to acknowledge my peers for their collaborative spirit, motivation, and shared insights that significantly contributed to the successful completion of this work.
