MLFlow_demo

MLOps Pipeline Project

Overview

This project implements an MLOps pipeline for a scikit-learn model in Python. The pipeline covers the following steps; those marked TODO are not implemented yet:

Data Gathering: Fetches data from the UCI ML repository.
Data Analysis: Performs exploratory data analysis (EDA) to understand the dataset.
Data Versioning: Saves the dataset for version control. (TODO with DVC)
Data Preparation: Prepares the data for modeling through feature engineering and data splitting.
Model Training & Development: Trains a RandomForestClassifier on the prepared data.
Model Validation: Validates the model using accuracy metrics and other evaluation tools.
Model Serving: Saves the trained model for deployment. (TODO)
Model Monitoring: Logs predictions and tracks model performance over time. (TODO)

Project Structure

ml_pipeline/
├── data/
│   ├── raw/
│   └── prepared/
├── src/
│   ├── data/
│   └── model/
├── artifacts/
├── requirements.txt
└── README.md
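
If the directories are missing after cloning (empty folders are not tracked by git), the layout above can be recreated with a few lines of Python. This is a minimal sketch: the helper name `create_layout` is hypothetical and not part of the repository; only the directory names come from the tree above.

```python
from pathlib import Path

# Directory names are taken from the tree above.
PROJECT_DIRS = [
    "data/raw",
    "data/prepared",
    "src/data",
    "src/model",
    "artifacts",
]


def create_layout(root):
    """Create the project directories under `root` and return their paths.

    Hypothetical helper, not part of the repository.
    """
    root = Path(root)
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        # parents=True builds intermediate folders; exist_ok=True makes re-runs safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_layout("ml_pipeline")` from the parent folder would build the tree; running it twice does no harm.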

Installation

Clone the repository:

git clone https://github.com/burna680/MLFlow_demo.git
cd MLFlow_demo

Create a virtual environment (optional but recommended):
```
python3 -m venv venv
source venv/bin/activate
```
Install the dependencies:
```
pip install -r requirements.txt
```
Start the MLflow tracking UI if it is not already running (it listens on http://127.0.0.1:5000 by default):
```
mlflow ui
```
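To find out whether the UI is already running before starting it again, a small check like the one below may help. It is only a sketch: the function name is hypothetical, the URL assumes the default `mlflow ui` host and port, and an HTTP 200 only shows that something answers at that address, not that it is MLflow.

```python
import urllib.error
import urllib.request


def tracking_ui_is_up(url="http://127.0.0.1:5000", timeout=2):
    """Return True if an HTTP 200 response comes back from `url`.

    Hypothetical helper; the default URL assumes `mlflow ui` defaults.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: treat as "not up".
        return False
```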

Usage

Running the Pipeline

Run the main script to execute the entire pipeline:

python main.py

This performs all steps of the pipeline, from data gathering to model training. Outputs, logs, and saved models are stored in the appropriate directories under model/ and artifacts/.
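
For orientation, a script that chains the pipeline steps in order could be organized roughly as follows. This is an illustrative sketch, not the repository's `main.py`: every function name is hypothetical, the data is synthetic instead of a UCI dataset, and a majority-class rule stands in for the RandomForestClassifier so the example runs with the standard library alone.

```python
import random


def gather_data(ctx):
    # Stand-in for fetching a UCI dataset: 100 synthetic rows (x1, x2, label).
    rng = random.Random(ctx["seed"])
    points = [(rng.random(), rng.random()) for _ in range(100)]
    ctx["rows"] = [(x1, x2, int(x1 + x2 > 1.0)) for x1, x2 in points]
    return ctx


def prepare_data(ctx):
    # Hold out the last 20% of the rows for validation.
    cut = int(len(ctx["rows"]) * 0.8)
    ctx["train"], ctx["test"] = ctx["rows"][:cut], ctx["rows"][cut:]
    return ctx


def train_model(ctx):
    # Stand-in for RandomForestClassifier: always predict the majority label.
    labels = [row[-1] for row in ctx["train"]]
    majority = int(sum(labels) * 2 >= len(labels))
    ctx["model"] = lambda features: majority
    return ctx


def validate_model(ctx):
    # Accuracy = fraction of held-out rows predicted correctly.
    hits = sum(ctx["model"](row[:-1]) == row[-1] for row in ctx["test"])
    ctx["accuracy"] = hits / len(ctx["test"])
    return ctx


STAGES = [gather_data, prepare_data, train_model, validate_model]


def run_pipeline(stages, ctx):
    """Run each stage in order, passing a shared context dict along."""
    for stage in stages:
        print(f"running {stage.__name__}")
        ctx = stage(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline(STAGES, {"seed": 0})
    print(f"accuracy: {result['accuracy']:.2f}")
```

Keeping each step as a function that takes and returns the same context makes it easy to add the TODO stages (versioning, serving, monitoring) later by appending to the stage list.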

Start the model server

To use the MLflow model, start a model server with the serve command. The port and model URI depend on which ports are free on your local machine and which model run you want to serve. Newer MLflow releases replace --no-conda with --env-manager local:

mlflow models serve --model-uri <model_uri> --port=<available_port> --no-conda
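
Once the server is running, predictions can be requested by POSTing JSON to its `/invocations` endpoint. The sketch below uses only the standard library and assumes the `dataframe_split` input format accepted by MLflow 2.x scoring servers; older releases expect a different request format. The helper names, port, and column names are hypothetical and must be adapted to the served model.

```python
import json
import urllib.request


def build_payload(columns, rows):
    """Serialize feature rows in the `dataframe_split` JSON format."""
    return json.dumps({"dataframe_split": {"columns": columns, "data": rows}})


def score(url, columns, rows, timeout=10):
    """POST the rows to a running model server and return the parsed reply.

    `url` is the full endpoint, e.g. http://127.0.0.1:<available_port>/invocations.
    """
    request = urllib.request.Request(
        url,
        data=build_payload(columns, rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `score("http://127.0.0.1:5001/invocations", ["feature_a", "feature_b"], [[1.0, 2.0]])` would send one row, assuming the server was started on port 5001 and the model expects those two columns.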

Exploring the Project

Data Modules: Located under src/data/, these modules handle everything from gathering and preparing data to versioning it using MLflow.
Model Modules: Located under src/model/, these modules are responsible for training, validating, serving, monitoring, and retraining the model.
Utilities: Common and useful functions for the MLFlow project can be placed in src/utils.py to keep the code DRY.

Future Work

CI/CD Integration: Add continuous integration and continuous deployment pipelines.
Model Deployment: Implement model deployment using tools like Flask or FastAPI.
Advanced Monitoring: Incorporate advanced monitoring and alerting mechanisms.

License

This project is licensed under the MIT License. See the LICENSE file for details.

