Early occupant chest-response prediction from public NHTSA crash-test data
This project is a public-data passive-safety machine-learning demonstrator.
It uses NHTSA frontal crash-test data to extract driver chest displacement and vehicle crash-pulse features, then builds models for occupant chest-response prediction and early high-response flagging.
The project was developed as a portfolio/research demonstrator for vehicle safety, crash-pulse analysis, occupant response prediction, and AI-based decision support.
The main question is:
Can early vehicle crash-pulse information help identify frontal crash tests with elevated driver chest response?
The final early classifier uses only vehicle metadata and crash-pulse information from the first 30 ms after impact (the 0–30 ms window).
This project does not claim to predict real medical injury risk.
The target used here is:
peak_abs_driver_chest_displacement_mm
This is a crash-test dummy response metric, not a direct injury label.
The project is:
- not a validated injury-prediction model,
- not a validated restraint-control algorithm,
- not based on proprietary OEM data,
- an open-data demonstrator for signal extraction, feature engineering, ML modeling, and conceptual decision support.
NHTSA frontal crash-test data
→ TDMS signal extraction
→ driver chest displacement target
→ vehicle crash-pulse feature engineering
→ regression modeling
→ high chest-response classification
→ early 0–30 ms conceptual decision-support layer
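The feature-engineering step in the pipeline above can be pictured with a minimal sketch. The function name `early_pulse_features` and the three features it returns are hypothetical illustrations, not the project's actual feature set. The sketch assumes time in milliseconds with t = 0 at impact and vehicle acceleration in g.

```python
from typing import Dict, Sequence

G_MPS2 = 9.80665  # standard gravity, m/s^2 per g


def early_pulse_features(
    time_ms: Sequence[float],
    accel_g: Sequence[float],
    window_ms: float = 30.0,
) -> Dict[str, float]:
    """Summarize the 0..window_ms part of a vehicle acceleration pulse.

    Hypothetical feature set for illustration only.
    Assumes time in ms (t = 0 at impact) and acceleration in g.
    """
    # Keep only the samples that fall inside the early window.
    samples = [(t, a) for t, a in zip(time_ms, accel_g) if 0.0 <= t <= window_ms]
    if len(samples) < 2:
        raise ValueError("need at least two samples inside the early window")

    # Trapezoidal integration of acceleration gives the velocity change.
    delta_v_mps = 0.0
    for (t0, a0), (t1, a1) in zip(samples, samples[1:]):
        delta_v_mps += 0.5 * (a0 + a1) * G_MPS2 * (t1 - t0) / 1000.0

    accels = [a for _, a in samples]
    return {
        "peak_abs_accel_g": max(abs(a) for a in accels),
        "mean_accel_g": sum(accels) / len(accels),
        "delta_v_mps": delta_v_mps,
    }


# Synthetic example: a constant -10 g pulse sampled every 10 ms.
features = early_pulse_features([0, 10, 20, 30, 40], [-10.0] * 5)
```

Restricting the window before computing any statistic keeps later samples from leaking into the early features, which matters when the model is meant to act on the first 30 ms only.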
| Task | Final model | Feature set | Main result |
|---|---|---|---|
| Chest displacement regression | Random Forest | Advanced vehicle metadata + 0–30 ms pulse features | MAE ≈ 3.72 mm |
| Early high-response classification | Tuned Logistic Regression | Vehicle metadata + simple 0–30 ms pulse features | Recall ≈ 0.71, precision ≈ 0.22 |
| Decision support | Tuned early 0–30 ms decision layer | Classifier probability, flag threshold = 0.45 | High-response rate in the flagged group ≈ 6.1× that of the standard-monitoring group |
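The decision layer in the last table row can be sketched as a mapping from the classifier's predicted probability to a monitoring level. Only the 0.45 flag threshold comes from the table. The lower cutoff `ELEVATED_THRESHOLD = 0.30`, the function name `decision_level`, and the choice of `>=` rather than `>` are assumptions made for illustration.

```python
HIGH_FLAG_THRESHOLD = 0.45  # stated in the results table
ELEVATED_THRESHOLD = 0.30   # hypothetical placeholder, not taken from the project


def decision_level(p_high: float) -> str:
    """Map a predicted high-response probability to a monitoring level."""
    if not 0.0 <= p_high <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_high >= HIGH_FLAG_THRESHOLD:
        return "early high-response flag"
    if p_high >= ELEVATED_THRESHOLD:
        return "elevated monitoring"
    return "standard monitoring"


levels = [decision_level(p) for p in (0.10, 0.35, 0.45, 0.80)]
```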
High chest response was defined as:
peak driver chest displacement >= 30 mm
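As a minimal sketch, the target and the high-response label could be derived from a displacement trace as follows. It assumes the signal is already filtered and expressed in millimetres; the absolute value is taken because chest compression may be recorded with a negative sign. The function names are hypothetical.

```python
from typing import Sequence

HIGH_RESPONSE_THRESHOLD_MM = 30.0  # threshold stated in this README


def peak_abs_displacement_mm(displacement_mm: Sequence[float]) -> float:
    """Peak absolute driver chest displacement over the whole trace."""
    return max(abs(x) for x in displacement_mm)


def is_high_response(displacement_mm: Sequence[float]) -> bool:
    """True when the peak absolute displacement reaches the 30 mm threshold."""
    return peak_abs_displacement_mm(displacement_mm) >= HIGH_RESPONSE_THRESHOLD_MM


# Synthetic trace for illustration only.
trace_mm = [0.0, -12.5, -31.2, -8.0]
label = is_high_response(trace_mm)
```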
The final tuned early model detected 35 of the 49 high chest-response cases (recall ≈ 0.71) using only vehicle metadata and the first 30 ms of crash-pulse information.
The final decision-level rates were:
| Decision level | Tests | Actual high cases | High-response rate |
|---|---|---|---|
| Standard monitoring | 255 | 9 | 3.53% |
| Elevated monitoring | 60 | 5 | 8.33% |
| Early high-response flag | 162 | 35 | 21.60% |
The early high-response flag group therefore had a high-response rate about 6.1× that of the standard-monitoring group (21.60% vs. 3.53%).
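The headline figures follow directly from the counts in the decision-level table. The short check below recomputes recall, precision, and the rate ratio from those counts; the variable names are illustrative.

```python
# (tests, actual high cases) per decision level, copied from the table above.
counts = {
    "standard monitoring": (255, 9),
    "elevated monitoring": (60, 5),
    "early high-response flag": (162, 35),
}

total_tests = sum(n for n, _ in counts.values())
total_high = sum(h for _, h in counts.values())

flagged, flagged_high = counts["early high-response flag"]
standard, standard_high = counts["standard monitoring"]

recall = flagged_high / total_high    # share of high cases that were flagged
precision = flagged_high / flagged    # high-response rate inside the flag group
rate_ratio = precision / (standard_high / standard)

print(f"tests={total_tests}, high cases={total_high}")  # tests=477, high cases=49
print(f"recall={recall:.3f}, precision={precision:.3f}")  # recall=0.714, precision=0.216
print(f"rate ratio={rate_ratio:.2f}")  # rate ratio=6.12
```

Precision of the flag and the high-response rate of the flag group are the same quantity, which is why a modest precision of about 0.22 still corresponds to a roughly sixfold enrichment over the standard-monitoring group.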
Suggested structure:
CrashPulse-AI/
├── data/
│ └── processed/
├── notebooks/
│ └── CrashPulse_AI_GitHub_Clean_Notebook.ipynb
├── results/
│ ├── tables/
│ └── figures/
├── README.md
├── requirements.txt
└── .gitignore
Large raw TDMS/zip files should not be uploaded to GitHub.
The cleaned notebook is:
notebooks/CrashPulse_AI_GitHub_Clean_Notebook.ipynb
It focuses on the final cleaned project story:
- project idea,
- dataset overview,
- chest displacement signal-quality check,
- regression result,
- high-response classification,
- final tuned early decision-support layer,
- limitations and future work.
Possible next steps:
- Scale from 500 attempted tests to a larger NHTSA frontal-test dataset.
- Use grouped validation by vehicle model/family.
- Extract richer restraint-system, airbag, belt, dummy, seat, and occupant-position metadata.
- Test raw-pulse time-series models once the dataset is larger.
- Explore deep learning models such as 1D CNNs, TCNs, or tabular foundation models when enough data is available.
- Apply the same workflow to HBM simulation data if such data becomes available.
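The grouped-validation step listed above means that all tests of one vehicle model or family stay on the same side of every train/test split. The sketch below shows the idea in plain Python; the function `grouped_folds` and the example group labels are hypothetical. In practice, scikit-learn's `GroupKFold` offers the same kind of split.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def grouped_folds(
    groups: Sequence[Hashable], n_folds: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, test_idx) pairs in which no group spans both sides."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    if n_folds < 2 or n_folds > len(by_group):
        raise ValueError("n_folds must be between 2 and the number of groups")

    # Greedy balancing: place the largest groups first into the smallest fold.
    fold_members: List[List[int]] = [[] for _ in range(n_folds)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_folds), key=lambda k: len(fold_members[k]))
        fold_members[smallest].extend(by_group[g])

    splits = []
    for k in range(n_folds):
        test_idx = sorted(fold_members[k])
        train_idx = sorted(
            i for j in range(n_folds) if j != k for i in fold_members[j]
        )
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical vehicle-family label per crash test.
families = ["civic", "civic", "camry", "camry", "f150", "civic"]
splits = grouped_folds(families, n_folds=2)
```

Splitting by group rather than by row avoids the optimistic bias that arises when near-identical vehicles appear in both the training and the test set.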
Skills and topics covered:
- Crash-test data handling
- NHTSA API/TDMS workflow
- Signal extraction and filtering sensitivity checks
- Crash-pulse feature engineering
- Regression modeling
- Imbalanced classification
- Threshold tuning
- Early decision-support logic
- Passive-safety interpretation
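For the threshold-tuning item in the list above, a minimal sketch of a threshold sweep is given here. With an imbalanced target, lowering the probability threshold trades precision for recall, and the sweep makes that trade-off visible. The function `sweep_thresholds` and the toy labels and probabilities are made up for illustration and are not project results.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    y_true: Sequence[int],
    p_high: Sequence[float],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, recall, precision) for each candidate threshold."""
    rows = []
    for thr in thresholds:
        tp = sum(1 for y, p in zip(y_true, p_high) if p >= thr and y == 1)
        fp = sum(1 for y, p in zip(y_true, p_high) if p >= thr and y == 0)
        fn = sum(1 for y, p in zip(y_true, p_high) if p < thr and y == 1)
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        rows.append((thr, recall, precision))
    return rows


# Toy data: 1 = high chest response, 0 = not high.
y_true = [1, 0, 1, 0, 0]
p_high = [0.9, 0.6, 0.5, 0.4, 0.1]
rows = sweep_thresholds(y_true, p_high, thresholds=[0.45, 0.70])
```

On the toy data, the lower threshold catches both positive cases at the cost of one false alarm, while the higher threshold removes the false alarm but misses one positive case.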