nabilasherif/Airline-Booking

Airline-Booking

This project is a Data Science and AI pipeline developed in three milestones: traditional Machine Learning prediction, then Knowledge Graph construction, and finally an end-to-end Graph-RAG (Retrieval-Augmented Generation) application for an Airline Travel Assistant.

Prerequisites

  • Neo4j: Install Neo4j Desktop or Community Server locally.
  • Python Libraries:
    pip install neo4j

MS 1: Airline Customer Holiday Booking (Predictive Modeling)

Goal: Develop a machine learning pipeline to predict passenger satisfaction based on customer feedback and flight data.

Key Features:

  • Data Engineering: Cleaning the datasets and running sentiment analysis on review text with the VADER library.
  • Exploratory Data Analysis (EDA): Visualizing top flight routes, booking distributions, and rating patterns by traveler type.
  • Predictive Modeling: Training a statistical ML model or a shallow Feed-Forward Neural Network (FFNN) to classify passengers as "Satisfied" (rating ≥ 5) or "Dissatisfied" (rating < 5).
  • Model Explainability (XAI): Using SHAP and LIME to interpret model predictions and identify influential features.
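
The labeling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the notebook's code: the function names `label_satisfaction` and `label_all` are hypothetical, and only the threshold of 5 comes from this README.

```python
from typing import Iterable, List

# From the README: a passenger is "Satisfied" when the rating is >= 5.
SATISFIED_THRESHOLD = 5


def label_satisfaction(rating: float, threshold: float = SATISFIED_THRESHOLD) -> str:
    """Map a numeric rating to the binary target class (hypothetical helper)."""
    return "Satisfied" if rating >= threshold else "Dissatisfied"


def label_all(ratings: Iterable[float]) -> List[str]:
    """Label a whole column of ratings, preserving order."""
    return [label_satisfaction(r) for r in ratings]
```

The resulting labels would serve as the target column for whichever classifier is trained.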

Deliverables:

  • Jupyter Notebook with reproducible workflow.
  • Analytical Report & Visualizations.
  • Predictive Model & XAI plots.

MS 2: Knowledge Graph Construction (Neo4j)

Goal: Transition from tabular data to a structured Neo4j Knowledge Graph (KG) that models relationships between passengers, flights, airports, and journeys.

Schema:

  • Nodes: Passenger, Journey, Flight, Airport.
  • Relationships: (:Passenger)-[:TOOK]->(:Journey), (:Journey)-[:ON]->(:Flight), (:Flight)-[:DEPARTS_FROM]->(:Airport), (:Flight)-[:ARRIVES_AT]->(:Airport).
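
One way to express this schema for a single CSV row is a parameterised Cypher statement built from MERGE clauses. The sketch below is illustrative only: the node labels and relationship types come from the schema above, while the property names (`id`, `number`, `code`), the CSV column names, and the helper `row_to_query` are assumptions, not taken from Create_kg.py.

```python
from typing import Dict, Tuple

# Labels and relationship types follow the README schema.
# Property names (id, number, code) are assumed for illustration.
INGEST_ROW_CYPHER = """
MERGE (p:Passenger {id: $passenger_id})
MERGE (j:Journey {id: $journey_id})
MERGE (f:Flight {number: $flight_number})
MERGE (o:Airport {code: $origin})
MERGE (d:Airport {code: $destination})
MERGE (p)-[:TOOK]->(j)
MERGE (j)-[:ON]->(f)
MERGE (f)-[:DEPARTS_FROM]->(o)
MERGE (f)-[:ARRIVES_AT]->(d)
""".strip()

# Hypothetical CSV column names; the real dataset may differ.
REQUIRED_COLUMNS = ("passenger_id", "journey_id", "flight_number", "origin", "destination")


def row_to_query(row: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return the Cypher text and its parameters for one CSV row."""
    missing = [c for c in REQUIRED_COLUMNS if not row.get(c)]
    if missing:
        raise ValueError(f"row is missing required columns: {missing}")
    return INGEST_ROW_CYPHER, {c: row[c] for c in REQUIRED_COLUMNS}
```

With the official `neo4j` Python driver, the returned pair could be passed to `session.run(query, params)`. Using MERGE rather than CREATE keeps repeated ingestion from duplicating nodes or relationships.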

Key Tasks:

  • KG Construction: A Python script (Create_kg.py) that ingests the CSV data and builds the graph according to the schema above.
  • Scoring Rule Implementation: Calculating a weighted overall_satisfaction_score based on food, delay, legs, and miles to identify passengers above a specific threshold.
  • Cypher Analytics: Developing queries to answer business questions (e.g., busiest routes, average delays, food satisfaction by generation).
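
The scoring rule can be pictured as a weighted sum followed by a threshold filter. The README names the inputs (food, delay, legs, miles) but not the weights or the threshold, so the values in this sketch are hypothetical placeholders, as are the function names.

```python
from typing import List, Mapping

# Hypothetical weights: the README does not state the real ones.
EXAMPLE_WEIGHTS = {"food": 0.4, "delay": -0.3, "legs": -0.2, "miles": 0.1}


def overall_satisfaction_score(
    features: Mapping[str, float],
    weights: Mapping[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Weighted sum over the features named in the weights mapping."""
    return sum(weights[name] * features[name] for name in weights)


def passengers_above(scores: Mapping[str, float], threshold: float) -> List[str]:
    """Return passenger ids whose score is strictly above the threshold, sorted."""
    return sorted(pid for pid, score in scores.items() if score > threshold)
```

In practice the inputs would likely need to be put on a comparable scale first, since miles and leg counts differ by orders of magnitude; that step is omitted here.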

MS 3: End-to-End Graph-RAG Travel Assistant

Goal: Build a Graph Retrieval-Augmented Generation (Graph-RAG) system that uses the Neo4j Knowledge Graph from Milestone 2 as a grounding mechanism for an LLM-based assistant.

System Architecture:

  1. Input Preprocessing:
    • Intent Classification: Routing queries (e.g., "Search" vs. "Recommend").
    • Entity Extraction: Using NER to identify airports, flights, and dates from user input.
  2. Graph Retrieval Layer:
    • Baseline: Structured Cypher templates populated with extracted entities.
    • Embeddings: Semantic similarity search using vector embeddings (Node or Feature embeddings).
  3. LLM Layer:
    • Combines retrieved graph context with a structured prompt (Context, Persona, Task) to generate accurate answers.
    • Comparison of at least 3 different LLMs (e.g., Llama, Gemma, Mistral).
  4. User Interface:
    • A Streamlit UI to visualize the retrieved KG context, executed Cypher queries, and the final LLM response.
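
The first three layers can be sketched with the standard library alone. This is a simplified stand-in, not the project's implementation: keyword matching replaces a trained intent classifier, a regular expression for three-letter uppercase codes replaces NER, and the keyword lists, the `code` property, and all function names are assumptions.

```python
import math
import re
from typing import Dict, List, Optional, Sequence

# Hypothetical keyword lists for routing queries.
INTENT_KEYWORDS = {
    "recommend": ("recommend", "suggest", "best"),
    "search": ("find", "search", "show", "list"),
}

# Baseline Cypher template; the `code` property on Airport is assumed.
SEARCH_TEMPLATE = (
    "MATCH (f:Flight)-[:DEPARTS_FROM]->(:Airport {code: $origin}), "
    "(f)-[:ARRIVES_AT]->(:Airport {code: $destination}) "
    "RETURN f LIMIT 10"
)


def classify_intent(query: str) -> str:
    """Return the first intent whose keywords appear in the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words.intersection(keywords):
            return intent
    return "search"  # default route when nothing matches


def extract_airports(query: str) -> List[str]:
    """Treat standalone three-letter uppercase tokens as airport codes."""
    return re.findall(r"\b[A-Z]{3}\b", query)


def baseline_retrieval(query: str) -> Optional[Dict[str, object]]:
    """Fill the Cypher template, or return None if two codes were not found."""
    codes = extract_airports(query)
    if len(codes) < 2:
        return None
    params = {"origin": codes[0], "destination": codes[1]}
    return {"cypher": SEARCH_TEMPLATE, "params": params}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity measure used to rank embedded nodes against a query vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_prompt(context: Sequence[str], question: str) -> str:
    """Assemble the Persona / Context / Task prompt sent to the LLM."""
    context_block = "\n".join(f"- {row}" for row in context) or "- (no graph results)"
    return (
        "Persona: You are an airline travel assistant.\n"
        f"Context:\n{context_block}\n"
        "Task: Answer the question using only the context above.\n"
        f"Question: {question}"
    )
```

Keeping each layer as a separate function makes it straightforward to swap in a real classifier, an NER model, or a different LLM when comparing models, and the Cypher text and context rows are available for display in the UI.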

Install requirements

cd ms3
pip install -r requirements.txt
