Nadia Rozman (NadiaRozman)

Hi, I'm Nadia! ☺️

I'm a data enthusiast with hands-on experience in Machine Learning, Deep Learning, Natural Language Processing (NLP), Data Analysis, and Data Visualization.

My background is in clinical research, and I bring that domain knowledge directly into my data science projects — bridging healthcare expertise with rigorous analytical methods using Python, SQL, TensorFlow/Keras, NLP libraries, and Tableau.

🌟 Featured Project

🧬 Clinical Trials Data Analysis — End-to-End Pipeline

End-to-end data science pipeline on 5,000 ClinicalTrials.gov records — EDA, NLP, XGBoost, SHAP, and SQL analytics.

This is my most comprehensive project, combining my clinical research background with a full data science workflow:

Notebook	Focus
📊 EDA & Data Acquisition	HuggingFace streaming, XML parsing, feature engineering
📝 NLP & Text Analytics	TF-IDF, Sentence Transformers, BART zero-shot, NER
🤖 Machine Learning	XGBoost + hyperparameter tuning (ROC-AUC: 0.68)
🗄️ SQL Analytics	SQLite, window functions, multi-CTE sponsor scorecard

Key results: The model predicted clinical trial completion from registration metadata alone, and SHAP explainability identified phase and collaborator presence as the strongest completion signals. The NLP baseline (TF-IDF) achieved a ROC-AUC of 0.69 from free-text summaries alone.

Tech: Python · XGBoost · SHAP · HuggingFace Transformers · Sentence Transformers · SQLite · Plotly · scikit-learn
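
As a rough illustration of the kind of query the SQL Analytics notebook describes (a multi-CTE sponsor scorecard ranked with a window function), here is a minimal sketch using Python's built-in `sqlite3` module. The `trials` table, its `sponsor` and `status` columns, the `sponsor_scorecard` helper, and the toy rows are all hypothetical stand-ins, not the project's actual schema, code, or data.

```python
import sqlite3

# Hypothetical toy rows: (sponsor, status) pairs standing in for trial records.
ROWS = [
    ("Sponsor A", "Completed"),
    ("Sponsor A", "Completed"),
    ("Sponsor A", "Terminated"),
    ("Sponsor B", "Completed"),
    ("Sponsor B", "Terminated"),
    ("Sponsor C", "Completed"),
]

# Two CTEs build the scorecard step by step; RANK() is the window function.
# Window functions require SQLite 3.25 or newer.
QUERY = """
WITH sponsor_counts AS (
    SELECT sponsor,
           COUNT(*) AS n_trials,
           SUM(status = 'Completed') AS n_completed
    FROM trials
    GROUP BY sponsor
),
sponsor_rates AS (
    SELECT sponsor,
           n_trials,
           ROUND(1.0 * n_completed / n_trials, 2) AS completion_rate
    FROM sponsor_counts
)
SELECT sponsor,
       n_trials,
       completion_rate,
       RANK() OVER (ORDER BY completion_rate DESC) AS rate_rank
FROM sponsor_rates
ORDER BY rate_rank, sponsor;
"""


def sponsor_scorecard(rows):
    """Load rows into an in-memory database and return the ranked scorecard."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE trials (sponsor TEXT, status TEXT)")
        conn.executemany("INSERT INTO trials VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in sponsor_scorecard(ROWS):
        print(row)
```

Splitting the logic into one CTE per step (counts, then rates, then ranking) keeps each stage readable and easy to check on its own, which is the main appeal of the multi-CTE style.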

🔹 Other Projects

Machine Learning

Cat vs Dog CNN Image Classifier — End-to-end CNN with TensorFlow/Keras for binary image classification, including data augmentation and evaluation.
Bank Customer Churn Prediction (ANN) — Artificial Neural Network predicting customer churn for retention strategy insights.
Vehicle Market Segmentation — K-Means & Hierarchical Clustering to segment vehicles by specification.
Startup Profit Prediction — Compared multiple regression models to predict startup profitability.
Drug Classification — Classification models for predicting drug effectiveness.

Analytics & NLP

Employee Attrition & Retention Analysis — End-to-end HR analytics with EDA, statistical analysis, Tableau dashboards, and predictive modelling.
Customer Sentiment Analysis — NLP project on hotel reviews using VADER, TF-IDF, and neural networks.
Python Data Analysis Project — Exploring datasets with Python to uncover patterns and trends.
SQL Data Analysis Project — SQL-based analysis on real-world datasets for actionable insights.

Visualization

Tableau Dashboards — Interactive Tableau dashboards visualizing sales performance.
Seattle Airbnb Analysis — Analyzing Airbnb listings and pricing trends.

🔹 Skills

Languages & Tools: Python · SQL · TensorFlow · Keras · Tableau · NLTK · Pandas · NumPy · Scikit-learn · XGBoost · SHAP · HuggingFace Transformers · Matplotlib · Seaborn · Plotly · SQLite · WordCloud

Techniques: Regression · Classification · Clustering · Artificial Neural Networks · Convolutional Neural Networks · NLP · Sentiment Analysis · SHAP Explainability · Data Cleaning · Data Visualization · ETL Pipelines

📌 All projects are fully reproducible with notebooks and environment files included. Explore my repositories for the full workflow.

🔗 Connect: LinkedIn · GitHub
