Energy-Efficiency-Statistical-Analysis

Project Proposal: Estimating heating and cooling loads based on building characteristics

1. Dataset Description

Dataset Link: https://archive.ics.uci.edu/dataset/242/energy+efficiency

The dataset for this analysis comes from the UCI Machine Learning Repository and describes the heating load and cooling load requirements of buildings (that is, their energy efficiency) as a function of building parameters. It has 8 predictor variables, 2 response variables, and 768 observations with no missing values. The data were generated by running energy analyses on 12 different building shapes simulated in Ecotect. The response variables are heating load (HL) and cooling load (CL), which represent the energy required to maintain thermal comfort within a building.

Variables:

X1 - Relative Compactness: A measure of the building’s shape efficiency
X2 - Surface Area: The total surface area of the building
X3 - Wall Area: The area of the walls, contributing to heat transfer
X4 - Roof Area: The area of the roof, affecting thermal insulation
X5 - Overall Height: Building height, impacting air flow and heat transfer
X6 - Orientation: Cardinal direction of the building’s facade
X7 - Glazing Area: Total window area, influencing natural light and insulation
X8 - Glazing Area Distribution: Spread of window area on each facade

Y1 (Response Variable) - Heating Load: Energy required for Heating.
Y2 (Response Variable) - Cooling Load: Energy required for Cooling.
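
The variable codes above are terse, so a small lookup table can make later analysis code easier to read. The sketch below is illustrative only: the descriptive names, the `rename_row` helper, and the `FEATURES`/`TARGETS` lists are assumptions introduced here, not part of the dataset or the project code.

```python
# Hypothetical descriptive names for the dataset's column codes (X1..X8, Y1, Y2).
COLUMN_NAMES = {
    "X1": "relative_compactness",
    "X2": "surface_area",
    "X3": "wall_area",
    "X4": "roof_area",
    "X5": "overall_height",
    "X6": "orientation",
    "X7": "glazing_area",
    "X8": "glazing_area_distribution",
    "Y1": "heating_load",
    "Y2": "cooling_load",
}

# Predictor and response codes, kept separate so models can be fit per response.
FEATURES = [f"X{i}" for i in range(1, 9)]
TARGETS = ["Y1", "Y2"]


def rename_row(row):
    """Return a copy of one record with descriptive keys; unknown keys pass through."""
    return {COLUMN_NAMES.get(key, key): value for key, value in row.items()}


# Made-up values for illustration, not taken from the dataset.
example = rename_row({"X1": 0.9, "X5": 7.0, "Y1": 20.0})
```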

2. Objectives

The main objective of this project is to develop a predictive model that estimates heating and cooling loads from building characteristics. We also aim to identify and interpret the influence of each building feature on energy efficiency, providing insights that can inform sustainable design practices, and to optimize model performance by experimenting with different regression techniques and feature selection methods.

3. Plan of Analysis

First, we will explore the dataset to determine whether the variables are normally distributed. We will then evaluate regression models by splitting the data into training and test sets and computing prediction errors to assess model performance. The analysis will use the following methods:

Multiple Linear Regression
Multinomial Logistic Regression
Decision Tree Regression
Correlation Matrix
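
Two of the steps in this plan, the train/test split and the correlation matrix, can be sketched in a few lines. This is a minimal standard-library illustration, not the project's implementation: the function names, the 20% test fraction, and the fixed seed are all assumptions made for the example.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every (name_a, name_b) pair to the Pearson correlation of those columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}
```

Here `columns` is assumed to be a dictionary mapping a variable code such as `"X1"` to its list of values. Note that a constant column has zero variance, so `pearson` would divide by zero for it.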

To measure each model's prediction accuracy, we will use:

Mean Absolute Error (MAE)
Root Mean Squared Error (RMSE)
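
Both error metrics follow directly from their definitions: MAE averages the absolute residuals, and RMSE takes the square root of the averaged squared residuals, which penalizes large errors more heavily. The sketch below assumes two equal-length sequences of observed and predicted loads; the function names are chosen for this example.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference between actual and predicted."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)
```

RMSE is never smaller than MAE for the same predictions, so a large gap between the two suggests that a few observations carry unusually large errors.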

Given that there are 2 response variables, separate models will be fit for each so that R² and prediction errors can be compared.
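
As a rough illustration of comparing the two responses, the coefficient of determination can be computed once per target from that target's observed and predicted values. The helper names and the dictionary layout below are assumptions made for this sketch, not the project's code.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def r_squared_per_target(actual_by_target, predicted_by_target):
    """Compute R² separately for each response, e.g. keys "Y1" and "Y2"."""
    return {
        target: r_squared(actual_by_target[target], predicted_by_target[target])
        for target in actual_by_target
    }
```

A value of 1 means the predictions match the observations exactly, while 0 means the model does no better than always predicting the mean of the observed values.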
