CS781: Formal Methods in Machine Learning — Autumn 2025
Instructor: Prof. Supratik Chakraborty
Authors:
- Aman Moon (22B1216)
- Arin Weling (22B1230)
This project explores iterative re-training of a Convolutional Neural Network (CNN) for robust car detection using:
- CARLA Simulator (for generating semantic scene variations)
- VerifAI Toolkit (probabilistic scene specification & falsification-driven sampling)
- Alpha-CROWN, via the auto_LiRPA library (formal verification of neural network robustness)
The central goal is to show that data generated from adversarial or low-confidence regions, discovered via semantic-level falsification, improves local robustness of the CNN over iterations.
- Train a baseline CNN to classify the number of cars in road scenes (no-car, one-car, or two-cars).
- Use VerifAI to sample scenes with semantic variations:
- Lighting
- Weather
- Car distance
- Orientation
- Number of Cars
- Identify samples on which the model performs poorly (semantic counterexamples).
- Retrain the CNN on these samples.
- After each iteration, use Alpha-CROWN to compute local robustness: the radius of the largest Lₚ-ball around a given input image within which the predicted class remains unchanged.
- Observe robustness improvement over iterations.
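A minimal sketch of this loop, under stated assumptions: `sample_scene`, `render`, `retrain`, and `certify` are hypothetical wrappers around VerifAI sampling, CARLA rendering, supervised fine-tuning, and the auto_LiRPA radius computation sketched later in this report.

```python
import torch.nn.functional as F

def falsification_loop(model, sample_scene, render, retrain, certify,
                       probe_images, n_iters=5, n_samples=200, conf_thresh=0.7):
    """Iteratively mine semantic counterexamples and retrain on them.

    sample_scene / render / retrain / certify are hypothetical wrappers
    around VerifAI, CARLA, fine-tuning, and alpha-CROWN certification.
    """
    dataset = []
    for it in range(n_iters):
        counterexamples = []
        for _ in range(n_samples):
            scene = sample_scene()                 # semantic scene parameters
            image, label = render(scene)           # rendered image + ground truth
            logits = model(image.unsqueeze(0))
            conf, pred = F.softmax(logits, dim=1).max(dim=1)
            if pred.item() != label or conf.item() < conf_thresh:
                counterexamples.append((image, label))
        dataset += counterexamples
        retrain(model, dataset)                    # fine-tune on augmented data
        radii = [certify(model, x) for x in probe_images]
        print(f"iter {it}: mean certified radius = {sum(radii) / len(radii):.4f}")
    return model
```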
We initially attempted verification using alpha-beta-CROWN (branch-and-bound), but:
- It requires very high VRAM for CNN verification
- Even small models ran out of GPU memory on the hardware available to us
Thus, we switched to:
- Alpha-CROWN
- Implemented using the auto_LiRPA library
- Produces looser bounds, but scales better
- Still sufficient for local robustness metrics
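A minimal bound-computation sketch with auto_LiRPA; the function name and `eps` are our placeholders, while `BoundedModule`, `BoundedTensor`, and `PerturbationLpNorm` are the library's standard API.

```python
import torch
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

def logit_bounds(cnn, x, eps):
    """Lower/upper bounds on each logit over the L2 ball of radius eps
    around x. For clarity we rebuild the bounded model per call; in
    practice it should be constructed once and reused."""
    lirpa_model = BoundedModule(cnn, torch.empty_like(x))
    ptb = PerturbationLpNorm(norm=2, eps=eps)      # L2 perturbation set
    x_bounded = BoundedTensor(x, ptb)
    lb, ub = lirpa_model.compute_bounds(
        x=(x_bounded,), method='CROWN-Optimized')  # i.e., alpha-CROWN
    return lb, ub                                  # shape: (batch, num_classes)
```

The prediction on x is certified at radius eps when the lower bound of the predicted logit exceeds the upper bounds of all other logits.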
VerifAI samples scenes and retrieves rendered images from CARLA.
Each sampled image is passed through the CNN, and we flag:
- Misclassified samples
- Low-confidence predictions
- Samples with small certified robustness radii
These flagged samples are collected for retraining.
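As a sketch, the three criteria combine into one selection predicate; the thresholds here are illustrative, not the values used in our runs, and `certify` is the hypothetical radius routine sketched later.

```python
import torch

def is_counterexample(model, certify, image, label,
                      conf_thresh=0.7, radius_thresh=0.05):
    """Flag a rendered sample for retraining if it is misclassified,
    predicted with low confidence, or certifiably robust only within
    a small L2 radius. Thresholds are placeholders."""
    logits = model(image.unsqueeze(0))
    conf, pred = torch.softmax(logits, dim=1).max(dim=1)
    if pred.item() != label:
        return True                                  # misclassified
    if conf.item() < conf_thresh:
        return True                                  # low-confidence prediction
    return certify(model, image) < radius_thresh     # small certified radius
```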
Each iteration adds the newly collected counterexamples to the training set.
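Concretely, this can be a plain dataset concatenation; the variable names below are assumptions carried over from the sketches above.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

def augment_training_set(base_train_set, counterexamples, batch_size=64):
    """Append this iteration's (image, label) counterexamples to the
    training set and return a fresh loader for fine-tuning."""
    ce_images = torch.stack([img for img, _ in counterexamples])
    ce_labels = torch.tensor([lbl for _, lbl in counterexamples])
    augmented = ConcatDataset([base_train_set,
                               TensorDataset(ce_images, ce_labels)])
    return DataLoader(augmented, batch_size=batch_size, shuffle=True)
```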
For a chosen norm (we used L2) and selected images:
- Compute the local robustness radius for each image
- Track how the radius changes over iterations (see the binary-search sketch after this list)
- The CNN should become more robust to semantic perturbations.
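To obtain the radius itself, we can binary-search over eps, reusing the `logit_bounds` sketch from the Alpha-CROWN section; the search bound and tolerance below are illustrative. Binary search is valid because CROWN-style bounds only loosen as eps grows, so certification is monotone in the radius.

```python
import torch

def certified_radius(cnn, x, eps_hi=1.0, tol=1e-3):
    """Largest L2 radius (up to tol) at which alpha-CROWN still proves
    the predicted class unchanged, found by binary search over eps."""
    pred = cnn(x).argmax(dim=1).item()
    lo, hi = 0.0, eps_hi
    while hi - lo > tol:
        eps = (lo + hi) / 2
        lb, ub = logit_bounds(cnn, x, eps)
        rival = torch.cat([ub[0, :pred], ub[0, pred + 1:]]).max()
        if lb[0, pred] > rival:       # certified at this radius
            lo = eps
        else:                         # cannot certify; shrink the ball
            hi = eps
    return lo
```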
- The local robustness radii certified by Alpha-CROWN increased monotonically across iterations.
- Falsification-driven data augmentation yields better generalization than random sampling.
- Chosen Lp norm: L2, for simpler bound formulation and stable convergence in Alpha-CROWN.
- Model architecture: a lightweight CNN, chosen specifically to keep verification computationally feasible (an illustrative sketch follows this list).
- Iterative robustness improvement: After 4-5 re-training iterations, we observed a clear increase in local robustness radii computed by Alpha-CROWN.
- Effectiveness of semantic sampling: VerifAI’s semantic-level sampler successfully exposed rare and challenging corner-case scenes (e.g., occluded vehicles, extreme lighting, odd car poses). These counterexamples notably improved generalization and robustness when added to the training set.
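For concreteness, a network of roughly this scale would look as follows; the layer sizes are illustrative placeholders, not the exact architecture we trained.

```python
import torch.nn as nn

class SmallCarCNN(nn.Module):
    """Illustrative lightweight classifier for {no-car, one-car, two-cars}.
    Sizes are placeholders chosen to keep bound propagation cheap."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
        )
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```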
- VerifAI Toolkit
  Berkeley LearnVerify Group. VerifAI: A Toolkit for the Design and Analysis of Autonomous Systems.
  GitHub: https://github.com/BerkeleyLearnVerify/VerifAI
- α/β-CROWN (Alpha-Beta CROWN)
  General and Efficient Robust Neural Network Verification via α, β-CROWN.
  GitHub: https://github.com/Verified-Intelligence/alpha-beta-CROWN
- auto_LiRPA (Alpha-CROWN / LiRPA Library)
  auto_LiRPA: A Unified Library for Training and Verifying Neural Networks with Certified Robustness.
  GitHub: https://github.com/KaidiXu/auto_LiRPA
- CARLA Autonomous Driving Simulator
  CARLA: An Open Urban Driving Simulator.
  Website: https://carla.org/
We thank Prof. Supratik Chakraborty for his continuous support and guidance throughout CS781 — Formal Methods in Machine Learning (Autumn 2025).
We also acknowledge the creators and maintainers of VerifAI, CARLA, alpha-beta-CROWN, and auto_LiRPA, whose open-source tools made this project possible.