This project implements an automated machine learning pipeline capable of handling classification, regression, and clustering tasks. It provides a streamlined process for data ingestion, transformation, model training, and evaluation.
- Supports classification, regression, and clustering problems
- Automated data ingestion and preprocessing
- Exploratory Data Analysis (EDA) report generation
- Automated feature engineering and selection
- Model training with hyperparameter tuning
- Model evaluation and comparison
- Interactive web interface using Streamlit
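
The training, tuning, and comparison features listed above can be illustrated with a minimal sketch. It assumes scikit-learn and a synthetic dataset. The candidate models, parameter grids, and the `compare_models` helper are hypothetical and are not taken from this project's source.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, random_state=0):
    """Tune each candidate with a small grid search, then score it on held-out data."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    # Hypothetical candidate list; the project's real model set may differ.
    candidates = {
        "logistic_regression": (
            LogisticRegression(max_iter=1000),
            {"C": [0.1, 1.0, 10.0]},
        ),
        "decision_tree": (
            DecisionTreeClassifier(random_state=random_state),
            {"max_depth": [3, 5, None]},
        ),
    }
    results = {}
    for name, (estimator, grid) in candidates.items():
        # 3-fold cross-validated grid search on the training split only.
        search = GridSearchCV(estimator, grid, cv=3, scoring="accuracy")
        search.fit(X_train, y_train)
        results[name] = {
            "best_params": search.best_params_,
            "test_accuracy": search.score(X_test, y_test),
        }
    # The "best" model is the one with the highest held-out accuracy.
    best_name = max(results, key=lambda n: results[n]["test_accuracy"])
    return best_name, results


X, y = make_classification(n_samples=300, n_features=8, random_state=0)
best_name, results = compare_models(X, y)
print(best_name, results[best_name])
```

Scoring on a held-out split, rather than reusing the cross-validation score, keeps the comparison honest when the grids differ in size.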
- Clone the repository and enter the project directory: `git clone https://github.com/DPRASAD-dp/Automated-ML-Pipeline.git`, then `cd Automated-ML-Pipeline`
- Create a virtual environment (optional but recommended): `python -m venv venv`, then `source venv/bin/activate` (on Windows, use `venv\Scripts\activate`)
- Install the required packages: `pip install -r requirements.txt`
- Run the Streamlit app: `streamlit run app.py`
- Upload your CSV file, select the problem type, and specify the target column.
- Click "Run Analysis" to start the automated ML pipeline.
- View the results, including the EDA report, model comparisons, and the best-performing model.
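
When choosing the problem type and target column, a quick check of the CSV can help. The sketch below is a hypothetical helper, not part of this project: it uses only the standard library and a simple heuristic (the `max_classes` threshold is an assumption) to suggest a problem type.

```python
import csv
import io


def suggest_problem_type(csv_text, target_column=None, max_classes=10):
    """Return 'clustering', 'regression', or 'classification' for CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV has no data rows")
    if target_column is None:
        # No label column to predict, so only unsupervised learning applies.
        return "clustering"
    if target_column not in rows[0]:
        raise ValueError(f"Target column {target_column!r} not found")
    values = [row[target_column] for row in rows]
    try:
        numeric = [float(v) for v in values]
    except (TypeError, ValueError):
        # Non-numeric or missing labels are treated as categories.
        return "classification"
    # Heuristic assumption: many distinct numeric values suggest a continuous target.
    if len(set(numeric)) > max_classes:
        return "regression"
    return "classification"


sample = "age,income,churned\n34,52000,yes\n51,61000,no\n29,48000,no\n"
print(suggest_problem_type(sample, "churned"))
print(suggest_problem_type(sample, "income", max_classes=2))
print(suggest_problem_type(sample))
```

The heuristic can misjudge numeric class labels or small regression datasets, so treat its answer as a starting point.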
- `src/`: Contains the main source code
  - `components/`: Individual pipeline components (data ingestion, transformation, model training)
  - `exception.py`: Custom exception handling
  - `logger.py`: Logging configuration
  - `utils.py`: Utility functions
- `app.py`: Streamlit web application
- `setup.py`: Project setup and package information
- `requirements.txt`: List of required Python packages
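
As an illustration of what custom exception handling and logging configuration modules often look like, here is a minimal sketch using only the standard library. The names `PipelineError`, `get_logger`, and `safe_divide` are hypothetical, and the project's actual `exception.py` and `logger.py` may be organized differently.

```python
import logging
import sys


class PipelineError(Exception):
    """Hypothetical custom exception that records where the original error occurred."""

    def __init__(self, message):
        super().__init__(message)
        # When constructed inside an except block, this is the error being handled.
        _, _, tb = sys.exc_info()
        if tb is not None:
            # Walk to the innermost frame, where the error was actually raised.
            while tb.tb_next is not None:
                tb = tb.tb_next
            self.location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        else:
            self.location = "unknown"

    def __str__(self):
        return f"{self.args[0]} (at {self.location})"


def get_logger(name="pipeline"):
    """Return a logger with one stream handler and a timestamped format."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def safe_divide(a, b):
    """Toy pipeline step that wraps a low-level error in PipelineError."""
    try:
        return a / b
    except ZeroDivisionError as exc:
        raise PipelineError(f"Division failed: {exc}") from exc


logger = get_logger()
caught = None
try:
    safe_divide(1, 0)
except PipelineError as err:
    logger.error("Step failed: %s", err)
    caught = err  # keep a reference; `err` is cleared when the block exits
```

Raising with `from exc` preserves the original error as `__cause__`, so the full chain still appears in tracebacks.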
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.