Data scientist with 20+ years of experience modelling complex, noisy systems, originally in astrophysics and now applied to real-world data problems.
My work focuses on machine learning, statistical inference, and time series analysis, with an emphasis on extracting signal from difficult datasets and supporting decision-making under uncertainty.

- Fraud Detection (Machine Learning): Developed supervised models to identify rare events in highly imbalanced data, achieving strong precision while maintaining a low alert rate. Focused on threshold optimisation, uncertainty, and real-world trade-offs between detection and operational cost.
- A/B Testing Toolkit (Statistical Inference): Built reusable Python tools for comparing groups using confidence intervals and hypothesis testing. Designed to support practical decision-making in experimentation workflows.
- Time Series Forecasting Toolkit: Created an interactive framework for comparing forecasting models (ARIMA, Holt-Winters, Prophet) with built-in backtesting and error analysis to evaluate real-world performance.
- Deep Learning for Regression: Implemented neural network models for continuous parameter estimation from high-dimensional data, including full pipelines for preprocessing, training, validation, and uncertainty assessment.

Python (pandas, NumPy, scikit-learn, TensorFlow), statistical modelling, machine learning, time series forecasting, hypothesis testing, data visualisation.

| Technical Skills | Soft Skills | Python Libraries | Other Languages & Tools | Documentation |
|---|---|---|---|---|
| Data Analysis | Team Leadership | Dash | C | HTML |
| Machine Learning | Project Management | Jupyter | IDL | LaTeX |
| Neural Networks | Teaching & Supervision | Matplotlib | PHP | Markdown |
| Data Visualisation | Science Communication | NumPy | SQL | Dashboards |
| Statistical Analysis | Public Speaking | pandas | Shell scripting | Office |
| Scientific Research | TV and Radio | scikit-learn | PGPLOT | |
| Simulations | International Collaboration | TensorFlow | Gnuplot | |
