This project predicts house prices from key features such as size, location, and age. It uses XGBoost, a gradient-boosted decision tree library, to estimate sale prices. The estimates are intended to help real estate agents, buyers, and sellers make informed decisions.
- Predictive Model: Built using XGBoost, optimized for tabular data.
- Feature Engineering: Created custom features like house age and area ratios to boost model performance.
- Performance Metrics: Achieved competitive MAE and RMSE scores.
- End-to-End Workflow: Includes data preprocessing, exploratory analysis, modeling, and deployment.
- Mean Absolute Error (MAE): 15,734
- Root Mean Squared Error (RMSE): 125
The model provides reliable price predictions, reducing uncertainty in property valuation.
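For reference, both metrics can be computed from held-out predictions as in the minimal sketch below. The price arrays are made-up illustration values, not project data, and the helper names `mae` and `rmse` are hypothetical. On a shared target scale RMSE is never smaller than MAE, so the two figures listed above would only be mutually consistent if they were computed on different scales (for example raw versus transformed prices); this README does not state which scale each figure uses.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error: the average magnitude of the prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up sale prices and predictions, for illustration only
y_true = [200_000, 350_000, 125_000, 410_000]
y_pred = [210_000, 330_000, 130_000, 400_000]

print(f"MAE:  {mae(y_true, y_pred):,.0f}")   # MAE:  11,250
print(f"RMSE: {rmse(y_true, y_pred):,.0f}")  # RMSE: 12,500
```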
House_price_prediction/
├── house/ # Dataset folder
├── notebooks/ # Jupyter notebooks
│ ├── Visual.ipynb # Exploratory Data Analysis
│ └── XGB_regress.ipynb # Model training and evaluation
├── models/ # Saved models
│ ├── xgb_model.pkl # Main XGBoost model
│ └── trained_model.pkl # Alternative trained model
├── scripts/ # Utility scripts
│ ├── save_model.py # Script for saving models
│ └── deploy_model.py # Script for deployment
├── requirements.txt # Dependencies
└── README.md # Project overview
The dataset contains key features influencing house prices:
- Size: Total area in square footage.
- Bedrooms & Bathrooms: Count of each.
- Location: Neighborhood information.
- Year Built: Construction year.
- Sale Price: Target variable for prediction.
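Derived features such as house age and area ratios can be built directly from these columns. The sketch below is one possible version, not the project's actual code: the column names (`size_sqft`, `bedrooms`, `bathrooms`, `year_built`), the `reference_year` parameter, and the specific ratio (area per room) are all assumptions made for illustration.

```python
import pandas as pd


def add_engineered_features(df: pd.DataFrame, reference_year: int) -> pd.DataFrame:
    """Return a copy of df with house_age and area_per_room columns added."""
    out = df.copy()
    # Age of the house relative to a chosen reference year (e.g. year of sale)
    out["house_age"] = reference_year - out["year_built"]
    # Area ratio: square footage per room; rows with zero rooms become NaN
    rooms = out["bedrooms"] + out["bathrooms"]
    out["area_per_room"] = out["size_sqft"] / rooms.where(rooms > 0)
    return out


# Hypothetical rows, for illustration only
df = pd.DataFrame(
    {
        "size_sqft": [1500, 2400],
        "bedrooms": [3, 4],
        "bathrooms": [2, 2],
        "year_built": [1990, 2015],
    }
)
features = add_engineered_features(df, reference_year=2024)
print(features[["house_age", "area_per_room"]])
```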
- Imputation for missing values.
- Encoding for categorical variables.
- Scaling for numerical features.
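The three preprocessing steps above could be combined into a single scikit-learn transformer, roughly as sketched here. The README does not say which strategies the notebooks use, so median and most-frequent imputation, one-hot encoding, and standard scaling are assumptions, as are the column names and the sample rows.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names
NUMERIC_COLS = ["size_sqft", "bedrooms", "bathrooms", "year_built"]
CATEGORICAL_COLS = ["location"]


def build_preprocessor() -> ColumnTransformer:
    """Impute, then scale numeric columns and one-hot encode categorical ones."""
    numeric = Pipeline(
        [
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]
    )
    categorical = Pipeline(
        [
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OneHotEncoder(handle_unknown="ignore")),
        ]
    )
    return ColumnTransformer(
        [
            ("num", numeric, NUMERIC_COLS),
            ("cat", categorical, CATEGORICAL_COLS),
        ]
    )


# Made-up rows with one missing size and one missing location
df = pd.DataFrame(
    {
        "size_sqft": [1500.0, np.nan, 2400.0, 1800.0],
        "bedrooms": [3, 4, 4, 3],
        "bathrooms": [2, 2, 3, 2],
        "year_built": [1990, 2005, 2015, 1978],
        "location": ["Downtown", "Suburb", np.nan, "Downtown"],
    }
)
X = build_preprocessor().fit_transform(df)
print(X.shape)  # 4 rows; 4 scaled numeric columns + 2 one-hot location columns
```

Tree-based models such as XGBoost do not strictly require scaled inputs, so the scaling step matters mainly if the same features feed other model types.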
- XGBoost was chosen for its speed and performance on tabular datasets.
- Trained with the following hyperparameters:
  - Learning Rate: 0.05
  - Max Depth: 4
  - Subsample: 0.8
  - Colsample by Tree: 0.2
  - n_estimators: 200
- Applied cross-validation to check that results are consistent across data splits.
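The configuration above might translate to code along these lines. The hyperparameter values come from the list; everything else is an assumption for illustration: the helper name `cross_validated_mae`, the 5-fold shuffled split, MAE as the scoring metric, and the fixed `random_state`.

```python
from sklearn.model_selection import KFold, cross_val_score

# Values from the list above; random_state is an added assumption for reproducibility
XGB_PARAMS = {
    "learning_rate": 0.05,
    "max_depth": 4,
    "subsample": 0.8,
    "colsample_bytree": 0.2,
    "n_estimators": 200,
    "random_state": 42,
}


def cross_validated_mae(X, y, n_splits=5):
    """Return the per-fold MAE of an XGBRegressor configured with XGB_PARAMS."""
    # Imported here so the configuration can be inspected without xgboost installed
    from xgboost import XGBRegressor

    model = XGBRegressor(**XGB_PARAMS)
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    # scikit-learn maximizes scores, so MAE is reported negated; flip the sign back
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
    return -scores
```

Calling `cross_validated_mae(X, y)` on a prepared feature matrix and price vector returns one MAE per fold; a small spread across folds suggests the model's error is stable across data splits.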
Install required libraries:
pip install pandas numpy scikit-learn xgboost matplotlib seaborn
- Clone the repository:
git clone https://github.com/NasdormML/House_price_try.git
cd House_price_try
- Install dependencies:
pip install -r requirements.txt
- EDA: Run Visual.ipynb to explore the dataset and trends.
- Model Training: Use XGB_regress.ipynb to train and evaluate the predictive model.
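Once a model has been trained, it can be persisted and reloaded for prediction. The contents of `scripts/save_model.py` are not shown here, so the following is only a sketch that assumes the `.pkl` files are plain `pickle` dumps; the helper names `save_model` and `load_model` are hypothetical.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Serialize a fitted model to disk, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously saved model.

    Only unpickle files you trust: pickle can execute arbitrary code on load.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

Under that assumption, `model = load_model("models/xgb_model.pkl")` followed by `model.predict(X_new)` would produce price estimates, provided `X_new` has been preprocessed the same way as the training data.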
- Accurate Pricing: Helps set realistic property prices, increasing transaction efficiency.
- Market Insights: Identifies key factors driving property value.
- Risk Reduction: Assists buyers in avoiding overpayment.
Have questions or feedback? Reach out via email: nasdorm.ml@inbox.ru