This repository is a comprehensive showcase of machine learning algorithms, data engineering workflows, and deep learning implementations. It transitions from fundamental Pythonic data structures to complex neural network architectures using TensorFlow and Keras.
- MLP Classification: Building and tuning Multi-Layer Perceptrons to predict health outcomes (Diabetes dataset).
- Deep Regression: Implementing neural networks for price prediction using the Boston Housing dataset, focusing on Root Mean Squared Error (RMSE) optimization.
- Architecture Design: Experimenting with layer depth, dropout, and activation functions for optimized convergence.
- Support Vector Machines (SVM): A mathematical deep-dive into SVMs. Includes custom optimization using `scipy.optimize.minimize` and visualization of decision boundaries on the Iris dataset.
- Linear Regression: Statistical modeling of relationships (e.g., Salary vs. Experience) using both `statsmodels` and `sklearn`.
- Scaling & Standardization: Detailed comparisons between `StandardScaler`, `MinMaxScaler`, and robust scaling techniques to handle variance and outliers.
- Advanced Pythonics: High-performance data manipulation using List Comprehensions and optimized NumPy operations.
- Outlier Management: Systematic detection and replacement strategies to ensure model stability.
- Computer Vision: Basic image processing and edge detection using OpenCV (cv2), specifically implementing Canny Edge Detection.
- Information Retrieval: Implementation of an Inverted Index—the foundational logic behind search engines—mapping terms to document IDs.
- Core: Python 3.x, NumPy, Pandas
- Modeling: Scikit-Learn, TensorFlow, Keras, Statsmodels
- Visualization: Matplotlib, Seaborn
- Utilities: OpenCV, Scipy (Optimization)
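
As a rough illustration of the scaling comparison listed above, the sketch below reimplements the three formulas in plain NumPy rather than calling the scikit-learn classes. The function names and the sample array are hypothetical and are not taken from the notebooks; the robust variant assumes the common median and interquartile-range definition.

```python
import numpy as np

def standard_scale(x):
    """Z-score scaling: subtract the mean, divide by the population standard deviation."""
    return (x - x.mean()) / x.std()

def min_max_scale(x):
    """Rescale linearly so the minimum maps to 0 and the maximum maps to 1."""
    return (x - x.min()) / (x.max() - x.min())

def robust_scale(x):
    """Center on the median and divide by the interquartile range (Q3 - Q1)."""
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return (x - median) / (q3 - q1)

# Made-up sample with a single large outlier (assumption, for illustration only).
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

standard = standard_scale(values)
min_max = min_max_scale(values)
robust = robust_scale(values)

for name, scaled in [("standard", standard), ("min-max", min_max), ("robust", robust)]:
    print(f"{name:>8}: {np.round(scaled, 3)}")
```

With an outlier present, min-max scaling squeezes the four ordinary values into a narrow band near 0, while the robust variant keeps them evenly spaced because the median and interquartile range are barely affected by the extreme point.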
| Component | Technique | Dataset |
|---|---|---|
| Classification | MLP / SVM | Iris, Diabetes |
| Regression | Neural Networks / OLS | Boston Housing, Salary Data |
| Computer Vision | Canny Edge Detection | Image Buffers |
| Indexing | Inverted Indexing | Text Corpora |
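
The SVM entry in the table refers to a maximal margin classifier fitted with a custom optimizer. A minimal sketch of that idea follows, assuming a hard-margin linear SVM solved with `scipy.optimize.minimize` and the SLSQP method. The toy data, variable names, and choice of solver are assumptions for illustration; the notebooks use the Iris dataset and may formulate the problem differently.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up, linearly separable toy data (assumption; not the Iris dataset).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def objective(params):
    """Half the squared norm of the weights; minimizing it maximizes the margin."""
    w = params[:-1]
    return 0.5 * np.dot(w, w)

def margin_constraints(params):
    """Each entry must be >= 0, i.e. y_i * (w . x_i + b) >= 1 for every sample."""
    w, b = params[:-1], params[-1]
    return y * (X @ w + b) - 1.0

# Parameters are packed as [w_1, ..., w_d, b] and start at zero.
result = minimize(
    objective,
    x0=np.zeros(X.shape[1] + 1),
    method="SLSQP",
    constraints=[{"type": "ineq", "fun": margin_constraints}],
)

w, b = result.x[:-1], result.x[-1]
predictions = np.sign(X @ w + b)
print("weights:", np.round(w, 3), "bias:", round(float(b), 3))
```

The hard-margin formulation only has a solution when the classes are linearly separable; for overlapping classes a soft-margin variant with slack variables or a hinge-loss penalty would be needed.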
- MLOM_Lab02 (A/B): Data cleaning, handling missing values, and OpenCV basics.
- MLOM_Lab03/04: Deep dive into Neural Network classification and regression with Keras.
- SVM Lab: Mathematical implementation of maximal margin classifiers.
- Indexing Guide: Logic for term mapping and text processing.
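
The term-mapping logic behind an inverted index can be sketched in a few lines of standard-library Python. The function names, the tokenization rule (lowercase alphanumeric runs), and the sample documents below are hypothetical and only illustrate the general technique, not the exact code in the Indexing Guide.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric terms (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(documents):
    """Map each term to the sorted list of IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set(index.get(terms[0], []))
    for term in terms[1:]:
        matches &= set(index.get(term, []))
    return sorted(matches)

# Made-up corpus for illustration.
docs = {
    1: "Neural networks learn features",
    2: "Support vector machines maximize the margin",
    3: "Neural networks and support vector machines",
}

index = build_inverted_index(docs)
print(index["neural"])
print(search(index, "support vector"))
```

Looking up a term is a single dictionary access, and a multi-term query reduces to intersecting the posting lists, which is why this structure underlies most text search systems.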
Each folder/notebook is self-contained. To replicate the results:
- Ensure `tensorflow` and `scikit-learn` are installed.
- Follow the sequence from Lab_001 (Fundamentals) to MLOM_Lab04 (Deep Learning).
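
One way to confirm the required packages are present before opening the notebooks is a small standard-library check like the sketch below. The helper name `missing_packages` is hypothetical and the script is not part of the repository; note that `scikit-learn` is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# pip package name -> import name (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = {"tensorflow": "tensorflow", "scikit-learn": "sklearn"}

def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose import name cannot be located."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only locates the modules without importing them, so it runs quickly even when TensorFlow is installed.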