Empty file modified README.md
100644 → 100755
Empty file.
Empty file modified RTE - clean data 2012-2014 15-1-16.csv
100644 → 100755
Empty file.
74 changes: 74 additions & 0 deletions RTEAnalytics-Alex-01.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
title: "Exercise Set 2: A $300 Billion Strategy"
author: "Alex, Goajun, Sergey, Bastien"
output: html_document
---

<br>

```{r echo=FALSE, eval=TRUE, comment=NA, warning=FALSE,error=FALSE, message=FALSE, prompt=FALSE}
#load packages from helpers.R
source("helpers.R")
```

###Background
We have analyzed electricity data published by RTE (Réseau de Transport d'Électricité), the French electricity transmission system operator.

####Analysis
1. Comparison between supply and demand
2. Evolution of energy mix
3. Correlation between time of the day and solar energy production
4. Correlation between time of the day and wind energy
5. Correlation between supply and demand vs. import/export
6. Correlation between consumption and weather

####Source

Open data: https://www.data.gouv.fr/fr/datasets/electricite-consommation-production-co2-et-echanges/

```{r eval=TRUE, echo=FALSE, comment=NA, warning=FALSE, message=FALSE,results='asis',fig.align='center', fig=TRUE}
DataSet<-read.csv("RTE - clean data 2012-2014 15-1-16.csv", header=TRUE)
# create vectors with year, month, and day
DataSet$Year = Year(as.Date(DataSet$Date,format='%m/%d/%Y'))
DataSet$Month = Month(as.Date(DataSet$Date,format='%m/%d/%Y'))
DataSet$Day = Day(as.Date(DataSet$Date,format='%m/%d/%Y'))
#Hour = Hour(as.Date(DataSet$Date,format='%m/%d/%Y'))
```
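
The chunk above parses `DataSet$Date` once per derived column. As a minimal sketch, assuming the dates are month/day/year strings as the `format` argument implies, the column can be parsed once and the calendar fields derived with base R's `format()`. The data frame `toy` is a hypothetical stand-in for the real file.

```r
# Toy stand-in for DataSet; the real CSV is not read here.
toy <- data.frame(Date = c("1/15/2012", "12/31/2013", "7/4/2014"),
                  stringsAsFactors = FALSE)

# Parse the date column once, then derive the calendar fields from it.
parsed <- as.Date(toy$Date, format = "%m/%d/%Y")
toy$Year  <- as.integer(format(parsed, "%Y"))
toy$Month <- as.integer(format(parsed, "%m"))
toy$Day   <- as.integer(format(parsed, "%d"))

print(toy)
```

Parsing once keeps the three columns consistent if the date format ever changes, since only one `format` string needs editing.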
The file has the following structure. It has `r nrow(DataSet)` rows and `r ncol(DataSet)` columns.

###Part I: Comparison between supply and demand
`r n<-16`
I am a `r n`J



###Part II: Evolution of energy mix


```{r, echo=FALSE, message=FALSE, prompt=FALSE, results='asis'}
energyMixYear <- group_by(DataSet, Year) %>% summarise(Fuel = sum(Fuel/4/1000), Coal = sum(Coal/4/1000), Gas = sum(Gas/4/1000),Nuclear = sum(Nuclear/4/1000), Wind = sum(Wind/4/1000), Solar = sum(Solar/4/1000), Hydro = sum(Hydro/4/1000), Pumping = sum(Pumping/4/1000), Bioenergy = sum(Bioenergy/4/1000))

#conversion to String required to use vector as x-axis in Google Charts
energyMixYear$Year=as.character(energyMixYear$Year)

#excluded pumping for now
print(gvisSteppedAreaChart(energyMixYear, xvar = "Year", yvar = c("Fuel", "Coal", "Gas","Nuclear", "Wind", "Solar", "Hydro", "Bioenergy"), options=list(isStacked=TRUE,width = 1000, height = 500, vAxis="{format:'#,###GWh'}")), 'chart')
```
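
The `/4/1000` inside `summarise` turns power readings into energy. The sketch below spells out that arithmetic under an assumption this diff does not confirm: each row is an average power in MW over a 15-minute interval, so one reading contributes MW × 0.25 h in MWh, and dividing by 1000 gives GWh. The column values and the name `intervals_per_hour` are hypothetical.

```r
# Hypothetical readings: average power in MW per interval (toy values).
toy <- data.frame(Year    = c(2012, 2012, 2013, 2013),
                  Nuclear = c(40000, 44000, 42000, 38000))

intervals_per_hour <- 4  # assumption: 15-minute rows; use 2 for 30-minute rows

# MW * (1 / intervals_per_hour) h = MWh; dividing by 1000 gives GWh.
toy$Nuclear_GWh <- toy$Nuclear / intervals_per_hour / 1000

energy_by_year <- aggregate(Nuclear_GWh ~ Year, data = toy, FUN = sum)
print(energy_by_year)
```

If the file turns out to use half-hourly rows, only `intervals_per_hour` changes.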

####??

###Part III: Correlation between time of the day and solar energy production

####??

###Part IV: Correlation between time of the day and wind energy (test Sergey)

####??

###Part V: Correlation between supply and demand vs. import/export (test Gaojun)

####??

###Part VI: Correlation between consumption and weather


31 changes: 26 additions & 5 deletions RTEAnalytics.Rmd → RTEAnalytics-Bastien-01.Rmd
100644 → 100755
@@ -1,9 +1,16 @@
---
Author: "Alex, Goajun, Sergey, Bastien"
title: "Exercise Set 2: A $300 Billion Strategy"
author: "Alex, Goajun, Sergey, Bastien"
output: html_document
---

<br>

```{r echo=FALSE, eval=TRUE, comment=NA, warning=FALSE,error=FALSE, message=FALSE, prompt=FALSE}
#load packages from helpers.R
source("helpers.R")
```

###Background
We have analyzed electricity data published by RTE (Réseau de Transport d'Électricité), the French electricity transmission system operator.

@@ -21,7 +28,15 @@ Open data: https://www.data.gouv.fr/fr/datasets/electricite-consommation-product

```{r eval=TRUE, echo=FALSE, comment=NA, warning=FALSE, message=FALSE,results='asis',fig.align='center', fig=TRUE}
DataSet<-read.csv("RTE - clean data 2012-2014 15-1-16.csv", header=TRUE)
# merge conflict resolved by keeping both sides
Consumption<-DataSet$Consumption
# create vectors with year, month, and day
Year = Year(as.Date(DataSet$Date,format='%m/%d/%Y'))
Month = Month(as.Date(DataSet$Date,format='%m/%d/%Y'))
Day = Day(as.Date(DataSet$Date,format='%m/%d/%Y'))
#Hour = Hour(as.Date(DataSet$Date,format='%m/%d/%Y'))
```
The file has the following structure. It has `r nrow(DataSet)` rows and `r ncol(DataSet)` columns.<br>

@@ -35,6 +50,16 @@ I am a `r n`J

###Part II: Evolution of energy mix


```{r, echo=FALSE, message=FALSE, prompt=FALSE, results='asis'}
list_of_sources=colnames(DataSet[9:16])
#will try to use list_of_sources as argument to generate graphs

energyMixYear <- group_by(DataSet, Year) %>% summarise(Wind = sum(Wind), Coal = sum(Coal))

#plot(gvisSteppedAreaChart(energyMixYear, yvar = c("Wind", "Coal"), options=list(isStacked=TRUE)))
```
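
The comment in the chunk above mentions driving the summary from `list_of_sources`. One possible way to do that, sketched here with base R rather than the `dplyr` pipeline used in the chunk, is to select the columns by name and pass them to `aggregate()`. The data frame `toy` and the vector `sources` are hypothetical stand-ins.

```r
# Toy stand-in for DataSet with a Year column and two source columns.
toy <- data.frame(Year = c(2012, 2012, 2013),
                  Wind = c(10, 20, 30),
                  Coal = c(1, 2, 3))

# Hypothetical list of source columns; in the chunk above it comes from colnames().
sources <- c("Wind", "Coal")

# Sum every listed column within each year, selecting columns by name.
mix_by_year <- aggregate(toy[sources], by = list(Year = toy$Year), FUN = sum)
print(mix_by_year)
```

Adding or dropping an energy source then only requires editing the name vector.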

####??

###Part III: Correlation between time of the day and solar energy production
@@ -51,8 +76,4 @@ I am a `r n`J

###Part VI: Correlation between consumption and weather

Test for Alex
=======
#push test by Alex

Test for Bastien 4:35pm
133 changes: 133 additions & 0 deletions RTEAnalytics-Sergey-01.Rmd
@@ -0,0 +1,133 @@
---
title: "RTEAnalytics-Sergey-01.Rmd"
author: "Sergey Efimenko"
date: "29 Jan 2016"
output: html_document
---

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:








##################

Seasonality test for wind power generation.


```{r, echo=FALSE}
library("stringr")
library("googleVis")
DataSet<-read.csv("RTE - clean data 2012-2014 15-1-16.csv", sep =",", header=TRUE)
month_data = sapply(1:length(DataSet$Date), function(i) ifelse(str_length(DataSet$Date[i]) > 6, as.numeric(str_split(DataSet$Date[i], "/")[[1]][1]), NA))
DataSet$month_data = month_data

season_data = sapply(month_data, function(i){
  res = NA # default when the month is missing, so res is always defined
  if (i %in% c(11,12,1,2)) res = 1 #"Winter"
  if (i %in% c(3,4,5)) res = 2 #"Spring"
  if (i %in% c(6,7,8)) res = 3 #"Summer"
  if (i %in% c(9,10)) res = 4 #"Fall"
  res
})
DataSet$season_data = season_data # CAN CREATE DUMMIES IF NEEDED! JUST ADD NEW COLUMNS
```
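
The month-to-season mapping can also be written as a vectorized lookup. This is a minimal sketch using the same grouping as the chunk above (November to February as winter, September and October as fall). The vector name `season_by_month` is hypothetical, and the toy months include a missing value to show how it propagates.

```r
# Lookup table indexed by month number, using the grouping from the chunk above:
# 1 = Winter (Nov-Feb), 2 = Spring (Mar-May), 3 = Summer (Jun-Aug), 4 = Fall (Sep-Oct).
season_by_month <- c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 1, 1)

month_data <- c(1, 4, 7, 10, 12, NA)   # toy months, including a missing value
season_data <- season_by_month[month_data]
print(season_data)
```

Indexing with `NA` returns `NA`, so rows with an unparsable date stay missing instead of raising an error.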


# check this
table(DataSet$season_data)

# Make sure the regression data only has numeric variables (and dummies). the lm input is a data.frame
```{r, echo=FALSE}
regression_data = data.frame(
Cons = as.numeric(DataSet$Consumption),
Fuel = suppressWarnings(as.numeric(DataSet$Fuel)), # I followed Google's advice: silences the NA-coercion warning
Coal = as.numeric(DataSet$Coal),
Gas = as.numeric(DataSet$Gas),
Nuclear = as.numeric(DataSet$Nuclear),
Wind = as.numeric(DataSet$Wind),
Solar = as.numeric(DataSet$Solar),
Hydro = as.numeric(DataSet$Hydro),
Pumping = as.numeric(DataSet$Pumping),
Bio = as.numeric(DataSet$Bioenergy),
Phys = as.numeric(DataSet$Physical.delivery),
CO2 = as.numeric(DataSet$CO2.emission),
Trade.UK = as.numeric(DataSet$Trade.with.UK),
Trade.ES = as.numeric(DataSet$Trade.with.Spain),
Trade.IT = as.numeric(DataSet$Trade.with.Italy),
Trade.SW = as.numeric(DataSet$Trade.with.Switzerland),
Trade.DE_BG = as.numeric(DataSet$Trade.with.Germany...Belgium),
Winter.d = as.numeric(ifelse(DataSet$season_data == 1, 1, 0)),
Spring.d = as.numeric(ifelse(DataSet$season_data == 2, 1, 0)),
Summer.d = as.numeric(ifelse(DataSet$season_data == 3, 1, 0)),
Fall.d = as.numeric(ifelse(DataSet$season_data == 4, 1, 0)),
Morning.d = as.numeric(ifelse(DataSet$Time %in% c("6:00", "12:00"), 1, 0)),
Noon.d = as.numeric(ifelse(DataSet$Time %in% c("12:00", "18:00"), 1, 0)),
Evening.d = as.numeric(ifelse(DataSet$Time %in% c("18:00", "0:00"), 1, 0)),
Night.d = as.numeric(ifelse(DataSet$Time %in% c("00:30", "6:00"), 1, 0)),
#We dont need the following parameters for correlation tables
Monthdata = as.numeric(DataSet$month_data),
Time = as.numeric(DataSet$Time),
season_data = DataSet$season_data

)
```
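
The `%in%` tests in the chunk above match only the two clock times listed for each dummy, so a reading at, say, "9:30" sets none of them. If the intent is time-of-day bands (an assumption), one option is to extract the hour from the time string and compare ranges. The cut-off hours below are hypothetical and the input is toy data.

```r
# Toy clock times in the "H:MM" / "HH:MM" style seen in the chunk above.
time_str <- c("0:00", "00:30", "6:00", "9:30", "12:00", "18:00", "23:30")

# Hour of day as a number: keep the text before the colon.
hour <- as.numeric(sub(":.*$", "", time_str))

# Non-overlapping bands (hypothetical cut-offs): each reading gets exactly one dummy.
dummies <- data.frame(
  Morning.d = as.numeric(hour >= 6  & hour < 12),
  Noon.d    = as.numeric(hour >= 12 & hour < 18),
  Evening.d = as.numeric(hour >= 18),
  Night.d   = as.numeric(hour < 6)
)
print(cbind(time_str, dummies))
```

Because the bands do not overlap, one of the four dummies can be dropped as the baseline in a regression without ambiguity.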


m1<-gvisTable(regression_data,options=list(showRowNumber=TRUE,width=1920, height=min(400,27*(nrow(regression_data)+1)),allowHTML=TRUE,page='disable'))
print(m1,'chart')

```

#Graph correlation matrix
library(corrplot)
corrplot(cor(regression_data[1:25]), method = "color", type="upper", order="original", tl.col="black", tl.srt=70)


#Table correlation matrix
View(cor(regression_data[1:25]))


#Regression1: consumption vs season
regression_formula_Cons.Seas = as.formula("Cons ~ Winter.d + Spring.d + Summer.d")
Regression.Cons.Seas = lm(regression_formula_Cons.Seas, regression_data)
Regression.Cons.Seas$coefficients


#Regression2: CO2 vs Gen.Mix
regression_formula_CO2 = as.formula("CO2 ~ Fuel + Coal + Gas + Nuclear -1")
Regression.CO2 = lm(regression_formula_CO2, regression_data)
Regression.CO2$coefficients
summary(Regression.CO2)


#Regression3: Wind vs Season
According to the correlation table, there is a high correlation between wind farm output and season.
Here are the results of regression analysis with dummy variables:
```{r, echo=FALSE}
regression_formula_Wind = as.formula("Wind ~ Winter.d + Spring.d + Summer.d")
Regression.Wind = lm(regression_formula_Wind, regression_data)
Summary.3 <- summary(Regression.Wind)
Fall_coef.3 <-round(coef(summary(Regression.Wind))["(Intercept)","Estimate"])
Summer_coef.3 <-round(coef(summary(Regression.Wind))["Summer.d","Estimate"])
Spring_coef.3 <-round(coef(summary(Regression.Wind))["Spring.d","Estimate"])
Winter_coef.3 <-round(coef(summary(Regression.Wind))["Winter.d","Estimate"])
```
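$Wind = `r Fall_coef.3` + `r Summer_coef.3`*Summer +`r Spring_coef.3`*Spring +`r Winter_coef.3`*Winter$

Fall is the omitted baseline season, so its value enters as the intercept rather than as a coefficient on a Fall dummy.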
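
In a regression on seasonal dummies with one season left out, the intercept is the mean of the omitted season (here Fall) and each coefficient is that season's mean minus the Fall mean. The sketch below checks this on toy numbers, not on the RTE data; the values and the `season` column are hypothetical.

```r
# Toy wind output by season (hypothetical numbers), two readings per season.
toy <- data.frame(
  Wind   = c(30, 34, 20, 24, 10, 14, 18, 22),
  season = rep(c("Winter", "Spring", "Summer", "Fall"), each = 2)
)
toy$Winter.d <- as.numeric(toy$season == "Winter")
toy$Spring.d <- as.numeric(toy$season == "Spring")
toy$Summer.d <- as.numeric(toy$season == "Summer")

# Fall has no dummy, so it is the baseline absorbed by the intercept.
fit <- lm(Wind ~ Winter.d + Spring.d + Summer.d, data = toy)
season_means <- tapply(toy$Wind, toy$season, mean)

# Intercept equals the Fall mean; each slope is a season's mean minus the Fall mean.
print(coef(fit))
print(season_means - season_means["Fall"])
```

Reading the coefficients this way also explains why a fourth `Fall.d` term cannot be added alongside the intercept: the four dummies would sum to one in every row.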




#Regression4: Solar vs Season
regression_formula_Solar = as.formula("Solar ~ Winter.d + Spring.d + Summer.d")
Regression.Solar = lm(regression_formula_Solar, regression_data)
Regression.Solar$coefficients
(Regression.Solar)


132 changes: 67 additions & 65 deletions RTEAnalytics.html → RTEAnalytics-Sergey-01.html

Large diffs are not rendered by default.

13 changes: 0 additions & 13 deletions RTEAnalytics.Rproj

This file was deleted.

18 changes: 18 additions & 0 deletions helpers.R
@@ -0,0 +1,18 @@
# list of packages required to run RTEAnalytics.Rmd

get_libraries <- function(filenames_list) {
lapply(filenames_list,function(thelibrary){
if (do.call(require,list(thelibrary)) == FALSE)
do.call(install.packages,list(thelibrary))
do.call(library,list(thelibrary))
})
}

libraries_used=c("corrplot","stringr","gtools","foreign","reshape2","digest","timeDate","devtools","knitr","graphics",
"grDevices","xtable","sqldf","stargazer","TTR","quantmod","shiny",
"Hmisc","vegan","fpc","GPArotation","FactoMineR","cluster",
"psych","googleVis", "png","ggplot2", "gridExtra","RcppArmadillo","xts","DescTools", "dplyr")

get_libraries(libraries_used)

options(stringsAsFactors=FALSE)