GlassBoxDriver: Post-Hoc XAI for Autonomous Vehicle Actions (Explainable End-to-End Autonomous Driving)


Project Overview

GlassBoxDriver is a post-hoc XAI system for autonomous vehicle decision making, built on EfficientNet-B0 and nuScenes Mini. It analyses driving footage frame by frame, predicts actions in real time, and explains each decision via Grad-CAM heatmaps and a steering arc overlay showing the predicted direction and turn angle.

GlassBoxDriver features a closed-loop human feedback pipeline: uncertain frames are auto-flagged, a human corrects them via the Streamlit UI, and the model retrains on those corrections, continuously improving from real-world errors.

The full pipeline:

  • Audit: AI analyses every frame of a driving video or image set
  • Flag: Low-confidence frames are automatically flagged
  • Review: Human corrects AI mistakes through the UI
  • Retrain: Model learns from human corrections
  • Repeat: System continuously improves with use
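The Flag step above reduces to a threshold over per-frame trust scores. A minimal sketch (`flag_uncertain` is a hypothetical name, not the repo's API; the 0.5 cutoff follows the Trust Score section later in this README):

```python
def flag_uncertain(frame_scores, threshold=0.5):
    """Return indices of frames whose trust score falls below the
    review threshold; these are the frames queued for human correction.
    Hypothetical helper, not the repo's exact interface."""
    return [i for i, score in enumerate(frame_scores) if score < threshold]
```

Flagged indices map back to frames saved under data/flagged/ for the Review step.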

Streamlit UI Pages


Page             | Description
-----------------|------------
Home             | Project overview and system architecture
Run Audit        | Upload dashcam video/images; get annotated output with Grad-CAM and steering arc
Review Flags     | View flagged uncertain frames and correct AI mistakes
Feedback Retrain | Merge human corrections into training data and retrain
Session Logs     | View past audit sessions, action distribution charts, and trust over time

Live Screen Inference (screen_ai.py)

Real-time predictions overlaid on game footage with a steering arc and probability bars (note: the left/right labels on the overlay are currently swapped; see Known Issues):


5 Predicted Actions:

Action      | Trigger Condition
------------|------------------
Go Straight | Default; no strong signal
Brake       | brake_switch active OR brake > 5
Accelerate  | throttle > 200 AND speed > 5
Turn Left   | steering > 0.3 rad
Turn Right  | steering < -0.3 rad
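The trigger rules above can be sketched as a single rule-based function. `get_label()` is the name the repo itself uses, but this body is a reconstruction from the documented rules, assuming each CAN bus record arrives as a dict with the field names listed in the labelling section:

```python
def get_label(rec):
    """Map one CAN bus record to one of the 5 driving actions,
    following the documented rule order (brake checked first)."""
    if rec.get("brake_switch") in (2, 3) or rec.get("brake", 0) > 5:
        return "Brake"
    if rec.get("steering_rad", 0.0) > 0.3:
        return "Turn Left"
    if rec.get("steering_rad", 0.0) < -0.3:
        return "Turn Right"
    if rec.get("throttle", 0) > 200 and rec.get("speed", 0) > 5:
        return "Accelerate"
    return "Go Straight"   # default: no strong signal
```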

Installation

1. Clone the Repository

git clone https://github.com/Ravevx/GlassBoxDriver-Post-Hoc-XAI-for-Autonomous-Vehicle-Actions.git
cd GlassBoxDriver-Post-Hoc-XAI-for-Autonomous-Vehicle-Actions

2. Create Conda Environment

conda create -n agent-local python=3.10
conda activate agent-local

3. Install Dependencies

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install streamlit opencv-python timm matplotlib tqdm mss pillow

Dataset Setup

Step 1 - Download nuScenes Mini Dataset

  1. Go to: https://www.nuscenes.org/nuscenes
  2. Go to the Download page
  3. Download nuScenes Mini (approx 4GB)
    • File: v1.0-mini.tgz

Step 2 - Download CAN Bus Data

  1. On the same download page
  2. Download CAN bus expansion for mini
    • File: can_bus.zip

Step 3 - Extract and Arrange Files

Extract both downloads and arrange exactly like this:

data/
└── nuscenes/
    ├── can_bus/
    │   ├── scene-0061_steeranglefeedback.json
    │   ├── scene-0061_vehicle_monitor.json
    │   ├── scene-0553_steeranglefeedback.json
    │   └── ... (all scene JSON files)
    ├── sweeps/
    │   ├── CAM_FRONT/
    │   │   └── *.jpg
    │   ├── CAM_FRONT_LEFT/
    │   │   └── *.jpg
    │   └── CAM_FRONT_RIGHT/
    │       └── *.jpg
    └── samples/
        ├── CAM_FRONT/
        │   └── *.jpg
        ├── CAM_FRONT_LEFT/
        │   └── *.jpg
        └── CAM_FRONT_RIGHT/
            └── *.jpg
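A quick sanity check that the top-level layout is in place before running anything (a hypothetical helper, not part of the repo):

```python
from pathlib import Path

# Top-level folders dataset.py expects under the nuScenes root
REQUIRED = ("can_bus", "samples", "sweeps")

def missing_dirs(root):
    """Return which required nuScenes subdirectories are absent under root."""
    root = Path(root)
    return [d for d in REQUIRED if not (root / d).is_dir()]
```

An empty list means the three top-level folders exist; it does not verify the per-camera subfolders.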

Step 4 - Update Data Path

Open dataset.py and update line 6 to your local nuScenes path:

NUSCENES_ROOT = r"C:\your\path\to\data\nuscenes"

Project Structure

GlassBoxDriver/
│
├── app.py                  # Streamlit UI - main entry point
├── analyse.py              # XAI video audit engine
├── dataset.py              # nuScenes data extractor + labeller
├── train.py                # Model training script
├── screen_ai.py            # Live screen capture inference
├── balance_dataset.py      # Undersample classes to equal size
│
├── src/
│   ├── decision.py         # EfficientNet-B0 model definition + ACTIONS
│   ├── gradcam.py          # Grad-CAM heatmap generator
│   ├── flagging.py         # Uncertain frame flagging logic
│   └── feedback.py         # Human-in-the-loop retraining
│
├── data/
│   ├── nuscenes/           # put downloaded dataset here
│   ├── train/              # Auto-generated by dataset.py
│   │   ├── Go Straight/
│   │   ├── Brake/
│   │   ├── Accelerate/
│   │   ├── Turn Left/
│   │   └── Turn Right/
│   ├── flagged/            # Auto-generated during audit
│   └── video/              # Uploaded videos via Streamlit
│
├── models/
│   └── driving_cnn.pth     # Auto-saved after training
│
├── output/                 # Annotated audit videos saved here
├── logs/                   # Session CSV logs saved here
├── utils/
│   ├── balance_dataset.py  # Undersample all classes to equal size
│   ├── check_canbus.py     # Inspect CAN bus data structure
│   ├── check_dataset.py    # Verify images + labels are correctly paired
│   ├── fix_cleanup.py      # Delete all augmented files (keep only originals)
│   ├── aug_data.py         # Flip images to augment
│   └── review_app.py       # Human review UI for flagged frames
└── README.md

Running the Project

Follow these steps in order:

Step 1 - Extract Training Data

python dataset.py

Expected output:

Total can_bus records: 19722
Processing sweeps/CAM_FRONT: 1938 images
Processing sweeps/CAM_FRONT_LEFT: 1940 images
Processing sweeps/CAM_FRONT_RIGHT: 1934 images
Processing samples/CAM_FRONT: 404 images
...
Total images extracted : ~6400
Class distribution:
  Go Straight : 2200
  Brake       : 1900
  Accelerate  :  650
  Turn Left   :  750
  Turn Right  :  620

Step 2 - (Optional) Balance Classes

python balance_dataset.py

Undersamples all classes to match the smallest class count for perfectly balanced training.
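The undersampling step amounts to the following (a minimal sketch; the `balance` function and its dict-of-file-lists interface are assumptions for illustration, not balance_dataset.py's actual API):

```python
import random

def balance(class_to_files, seed=0):
    """Undersample every class to the size of the smallest one."""
    rng = random.Random(seed)  # fixed seed for reproducible subsets
    n = min(len(files) for files in class_to_files.values())
    return {cls: rng.sample(files, n) for cls, files in class_to_files.items()}
```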

Step 3 - Train the Model

python train.py

Step 4.1 - Run on a hardcoded video path (without the UI)

python analyse.py

Step 4.2 - Launch the Streamlit UI (upload, audit, XAI, human feedback, retrain, full logs)

streamlit run app.py

Step 4.3 - Live Screen Inference

python screen_ai.py

Captures your full screen in real-time and predicts driving actions live. Press Q to quit.


Model Architecture

Input Image (224x224x3)
        |
EfficientNet-B0 Backbone (pretrained ImageNet)
        |
       1280 features
    /              \
action_head          steering_head
Linear(1280->256)     Linear(1280->1)
ReLU + Dropout(0.3)
Linear(256->5)
        |                 |
5 class probs        steering angle
(softmax)            (tanh * 30 deg)

How It Works

dataset.py   -->  Extracts frames from nuScenes cameras
                  Labels each frame using CAN bus sensor timestamp matching
                  Saves labelled images to data/train/<action>/

train.py     -->  Loads labelled images
                  Fine-tunes EfficientNet-B0 on 5 driving action classes
                  Uses WeightedRandomSampler to handle class imbalance
                  Saves best model weights to models/driving_cnn.pth

analyse.py   -->  Loads trained model
                  Reads video frame by frame
                  Runs inference on each frame
                  Generates Grad-CAM heatmap every 5 frames
                  Computes Trust Score
                  Draws steering arc overlay on frame
                  Saves annotated video + CSV log
                  Flags uncertain frames for human review

app.py       -->  Streamlit UI that ties all above together
                  Allows video upload, audit, review, and retraining
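The WeightedRandomSampler step mentioned for train.py can be sketched as below (`make_sampler` is a hypothetical name; the torch API is real):

```python
from collections import Counter

import torch
from torch.utils.data import WeightedRandomSampler

def make_sampler(labels):
    """Weight each sample inversely to its class frequency so rare
    actions (e.g. Turn Right) are drawn as often as common ones."""
    counts = Counter(labels)
    weights = torch.tensor([1.0 / counts[y] for y in labels],
                           dtype=torch.double)
    return WeightedRandomSampler(weights, num_samples=len(labels),
                                 replacement=True)
```

Pass the result as the `sampler` argument of a `DataLoader` (which then must not also set `shuffle=True`).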

Trust Score Formula

Trust = (Confidence + Heatmap Concentration + (1 - Entropy)) / 3

Where:
  Confidence           = max class probability
  Heatmap Concentration = max(heatmap) - mean(heatmap)
  Entropy              = -sum(p * log(p)) / log(num_classes)

Score > 0.5  -->  High trust, model is confident and focused
Score < 0.5  -->  Low trust, frame flagged for human review
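In plain Python, the formula above reads as follows (a sketch; the real implementation operates on arrays/tensors, and the heatmap is assumed normalized to [0, 1] so the concentration term stays in range):

```python
import math

def trust_score(probs, heatmap):
    """Trust = (confidence + heatmap concentration + (1 - entropy)) / 3."""
    confidence = max(probs)                                # max class probability
    concentration = max(heatmap) - sum(heatmap) / len(heatmap)
    entropy = (-sum(p * math.log(p) for p in probs if p > 0)
               / math.log(len(probs)))                     # normalized to [0, 1]
    return (confidence + concentration + (1 - entropy)) / 3
```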

How Frames Get Labelled (dataset.py)

Image filename contains timestamp:
n015-2018-07-24__CAM_FRONT__1532402927612460.jpg
                              ^
                              Extract this number

Find nearest CAN bus record within 2 seconds of timestamp

CAN bus record contains:
  steering_rad, brake, brake_switch, throttle, speed

Apply get_label() rules:
  brake_switch in (2,3) OR brake > 5  -->  Brake
  steering > 0.3 rad                  -->  Turn Left
  steering < -0.3 rad                 -->  Turn Right
  throttle > 200 AND speed > 5        -->  Accelerate
  else                                -->  Go Straight
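The timestamp-matching half of this process can be sketched as follows (assumptions: records carry a 'utime' field in microseconds and are sorted by it; both function names are hypothetical):

```python
import bisect

def timestamp_from_filename(name):
    """n015-...__CAM_FRONT__1532402927612460.jpg -> 1532402927612460"""
    return int(name.rsplit("__", 1)[1].split(".")[0])

def nearest_record(ts_us, records, max_gap_us=2_000_000):
    """Return the CAN bus record closest to the image timestamp,
    or None if the nearest one is more than 2 seconds away."""
    times = [r["utime"] for r in records]
    i = bisect.bisect_left(times, ts_us)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(records)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(times[j] - ts_us))
    if abs(times[best] - ts_us) > max_gap_us:
        return None
    return records[best]
```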

🤝 Open for Collaboration

GlassBoxDriver is an ongoing project and still has known limitations we are actively trying to solve. If you are interested in improving this system, collaborating, or building on top of it, contributions are very welcome!

Known Issues & Open Problems

Problem | Current State | What's Needed
--------|---------------|--------------
Model predicts Go Straight too often | Domain shift from nuScenes to real/game footage | More diverse training data
Only 10 nuScenes Mini scenes used | ~6400 images, too small for generalization | Full nuScenes dataset (1000 scenes)
Labels come from CAN bus rules | Rigid threshold-based labelling | Learned or smoother label generation
Left/Right arc overlay is flipped | Known visual bug in steering arc | Fix in draw_steering_overlay()
No temporal context | Each frame predicted independently | Add LSTM or temporal attention
Game footage not in training data | Model never saw rendered graphics | Add GTA VC / BeamNG / sim data

Areas to Contribute

  1. Larger Dataset: Integrate full nuScenes (1000 scenes) or BDD100K for broader driving coverage and better generalization
  2. Model Architecture: Experiment with temporal models (LSTM, Transformer) that use sequences of frames instead of single frames for richer context
  3. Better Labelling: Replace hard threshold rules in get_label() with smoother or learned labels from steering angle regression
  4. Sim-to-Real Transfer: Add synthetic driving data from simulators like CARLA, BeamNG, or GTA V to improve game footage predictions
  5. Grad-CAM Improvements: Replace Grad-CAM with GradCAM++, EigenCAM or SHAP for more accurate and stable heatmaps
  6. Active Learning: Smarter flagging strategy that selects the most informative uncertain frames for human review
  7. Steering Arc Bug Fix: Left and Right labels on the overlay arc are currently inverted; this needs a sign correction in draw_steering_overlay()
  8. Also open to any other contributions

Get in Touch

If you want to collaborate, raise an issue or start a discussion on GitHub. ⭐ Star the repo if you find it useful; it helps others discover the project!

Built with curiosity and a lot of debugging. —@Ravevx
