diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..5b6a065
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,4 @@
+.Rproj.user
+.Rhistory
+.RData
+.Ruserdata
diff --git a/Class 7 Files.RData b/Class 7 Files.RData
new file mode 100644
index 0000000..8f1958a
Binary files /dev/null and b/Class 7 Files.RData differ
diff --git a/Class 7 Instructions.Rmd b/Class 7 Instructions.Rmd
index 5ae641a..41cadfc 100644
--- a/Class 7 Instructions.Rmd
+++ b/Class 7 Instructions.Rmd
@@ -56,9 +56,11 @@ The spread function requires the following input:
 instructor_data <- spread(data_long, variables, measure)
 ```
-##Now we have a workable instructor data set! The next step is to create a workable student data set. Upload the data "student_activity.csv". View your file once you have uploaded it and then draw on a piece of paper the structure that you want before you attempt to code it. Write the code you use in the chunk below. (Hint: you can do it in one step)
+##Now we have a workable instructor data set! The next step is to create a workable student data set. Upload the data "student_activity.csv". View your file once you have uploaded it and then draw on a piece of paper the structure that you want before you attempt to code it. Write the code you use in the chunk below. (Hint: you can do it in one step) ...
 ```{r}
+data_wide_2 <- read.table("student_activity_wide.csv", sep = ",", header = TRUE)
+student_data_1 <- spread(data_wide_2, variable, measure)
 ```
@@ -75,7 +77,7 @@ student_data_2 <- dplyr::filter(student_data, date == 20160204)
 Now subset the student_activity data frame to create a data frame that only includes students who have sat at table 4.
 Write your code in the following chunk:
 ```{r}
-
+student_data_3 <- dplyr::filter(student_data_1, table == 4)
 ```
 ##Make a new variable
@@ -89,7 +91,7 @@ instructor_data <- dplyr::mutate(instructor_data, total_sleep = s_deep + s_light
 Now, referring to the cheat sheet, create a data frame called "instructor_sleep" that contains ONLY the total_sleep variable. Write your code in the following code chunk:
 ```{r}
-
+instructor_sleep <- dplyr::select(instructor_data, total_sleep)
 ```
 Now, we can combine several commands together to create a new variable that contains a grouping. The following code creates a weekly grouping variable called "week" in the instructor data set:
@@ -100,7 +102,7 @@ instructor_data <- dplyr::mutate(instructor_data, week = dplyr::ntile(date, 3))
 Create the same variable for the student data frame. Write your code in the code chunk below:
 ```{r}
-
+student_data_1 <- dplyr::mutate(student_data_1, week = dplyr::ntile(date, 3))
 ```
 ##Summarizing
@@ -117,7 +119,8 @@ student_data %>% dplyr::group_by(date) %>% dplyr::summarise(mean(motivation))
 Create two new data sets using this method. One that summarizes average motivation for students for each week (student_week) and another that summarizes "m_active_time" for the instructor per week (instructor_week). Write your code in the following chunk:
 ```{r}
-
+student_data_4 <- student_data_1 %>% dplyr::group_by(week) %>% dplyr::summarise(mean(motivation))
+instructor_week <- instructor_data %>% dplyr::group_by(week) %>% dplyr::summarise(mean(m_active_time))
 ```
 ##Merging
@@ -131,7 +134,11 @@ merge <- dplyr::full_join(instructor_week, student_week, "week")
 Visualize the relationship between these two variables (mean motivation and mean instructor activity) with the "plot" command and then run a Pearson correlation test (hint: cor.test()).
 Write the code for these commands below:
 ```{r}
-
+x_motivation <- c(1.666667, 1.851852, 1.851852)
+y_active_time <- c(6913.25, 6240.286, 5956.143)
+plot(x_motivation, y_active_time)
+plot(y_active_time, x_motivation)
+cor.test(x_motivation, y_active_time)
 ```
 Finally save your markdown document and your plot to this folder and commit, push and pull your repo to submit.
diff --git a/Data Frames.RData b/Data Frames.RData
new file mode 100644
index 0000000..1a1f371
Binary files /dev/null and b/Data Frames.RData differ
diff --git a/Rplot.pdf b/Rplot.pdf
new file mode 100644
index 0000000..d6c3711
Binary files /dev/null and b/Rplot.pdf differ