This exercise lets us practice our new skills by cleaning a dataset with R.
If the data zip file does not exist, run_analysis() downloads it and unzips it on the computer.
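The download step can be sketched as below. This is only a minimal illustration, not the actual body of run_analysis(): the helper name fetch_if_missing and its arguments (url, zip_path, exdir, downloader) are hypothetical, and the URL and paths are placeholders to be supplied by the caller.

```r
# Hypothetical helper: fetch and unzip the archive only when it is not
# already on disk. `url`, `zip_path` and `exdir` are placeholders, not the
# values used by run_analysis().
fetch_if_missing <- function(url, zip_path, exdir = ".",
                             downloader = utils::download.file) {
  downloaded <- FALSE
  if (!file.exists(zip_path)) {
    # mode = "wb" keeps the binary zip intact on Windows
    downloader(url, destfile = zip_path, mode = "wb")
    utils::unzip(zip_path, exdir = exdir)
    downloaded <- TRUE
  }
  # TRUE when a download happened, FALSE when the file was already there
  invisible(downloaded)
}
```

Passing the downloader as an argument is a design choice made for this sketch only, so the check can be exercised without network access.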
For each sample of participants (test and train), run_analysis() reads the X variables, the Y variables, the variable names and the subject IDs from the text files. A new column 'sample' is created to differentiate the two samples.
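A possible shape for this reading step is sketched below. The file names, the helper read_sample and the toy values are assumptions made for illustration; the real files and the real code in run_analysis() may be organised differently.

```r
# Write tiny stand-in files to a temporary folder (made-up values and
# hypothetical file names, only so the sketch runs on its own).
data_dir <- tempfile("sample_")
dir.create(data_dir)
writeLines(c("1 feat_a", "2 feat_b"), file.path(data_dir, "features.txt"))
writeLines(c("0.1 0.2", "0.3 0.4"),   file.path(data_dir, "X.txt"))
writeLines(c("5", "1"),               file.path(data_dir, "y.txt"))
writeLines(c("2", "4"),               file.path(data_dir, "subject.txt"))

# Hypothetical helper: read one sample and label it with a 'sample' column.
read_sample <- function(data_dir, sample_name) {
  features <- read.table(file.path(data_dir, "features.txt"),
                         stringsAsFactors = FALSE)
  # The second column of the features file holds the variable names
  x <- read.table(file.path(data_dir, "X.txt"), col.names = features[[2]])
  y <- read.table(file.path(data_dir, "y.txt"), col.names = "activity")
  subject <- read.table(file.path(data_dir, "subject.txt"),
                        col.names = "subject")
  # The single sample name is recycled over every row
  cbind(sample = sample_name, subject, y, x)
}

test <- read_sample(data_dir, "test")
```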
Once the test and train datasets are clean, we merge them into one.
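Assuming both cleaned data frames share the same columns, the merge can be a simple row bind. The data frames below are toy stand-ins with made-up values, and the use of rbind() is an assumption about how the merge is done.

```r
# Toy stand-ins for the cleaned test and train data frames (made-up values).
test <- data.frame(sample = "test", subject = c(2, 4),
                   activity = c(5, 1), feat_a = c(0.25, 0.31))
train <- data.frame(sample = "train", subject = c(1, 3),
                    activity = c(2, 5), feat_a = c(0.28, 0.27))

# Stack the two samples; the 'sample' column keeps track of the origin.
merged <- rbind(test, train)
```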
Within the merged dataset, we select the columns whose names contain the words 'means' or 'std'. We also keep the 'sample', 'subject' and 'activity' columns, in order to build our first, main tidy dataset. During this step, the variable names are cleaned. The classes of the subject and activity variables are updated too.
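The selection and cleaning can be sketched as follows. The column names, the pattern "mean|std" (which matches both 'mean' and 'means'), the renaming rules and the conversion to factors are all assumptions chosen for illustration; the rules actually applied by run_analysis() may differ.

```r
# Toy merged data with made-up values and hypothetical variable names.
merged <- data.frame(
  sample = c("test", "train"), subject = c(2, 1), activity = c(5, 2),
  `tBodyAcc-mean()-X` = c(0.25, 0.28),
  `tBodyAcc-std()-X`  = c(-0.9, -0.8),
  `tBodyAcc-max()-X`  = c(0.5, 0.6),
  check.names = FALSE
)

# Keep the identifying columns plus every measurement matching the pattern.
id_cols <- c("sample", "subject", "activity")
keep <- names(merged) %in% id_cols | grepl("mean|std", names(merged))
dataset_1 <- merged[, keep]

# Assumed cleaning rules: drop parentheses, turn dashes into underscores.
names(dataset_1) <- gsub("[()]", "", names(dataset_1))
names(dataset_1) <- gsub("-", "_", names(dataset_1))

# Assumed class change: treat subject and activity as categorical.
dataset_1$subject  <- as.factor(dataset_1$subject)
dataset_1$activity <- as.factor(dataset_1$activity)
```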
The tidy dataset is stored in dataset_1.
We begin by creating an empty list called 'new_data'. Then, for each subject ID, we subset the data from dataset_1. From that subset, we compute the mean of each column for each activity. Each result is stored in the 'new_data' list.
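The loop described above can be sketched as below. The toy dataset_1, its column names and the use of aggregate() are assumptions; only the overall shape (empty list, one subset per subject, means per activity) comes from the description.

```r
# Toy stand-in for dataset_1 (made-up values, hypothetical column names).
dataset_1 <- data.frame(
  subject  = factor(c(1, 1, 1, 2, 2)),
  activity = factor(c("walking", "walking", "sitting", "walking", "sitting")),
  acc_mean = c(0.2, 0.4, 0.1, 0.6, 0.3),
  acc_std  = c(1.0, 3.0, 2.0, 4.0, 5.0)
)

new_data <- list()
for (id in levels(dataset_1$subject)) {
  # Rows belonging to the current subject only
  one_subject <- dataset_1[dataset_1$subject == id, ]
  # Mean of every measurement column, computed per activity
  new_data[[id]] <- aggregate(cbind(acc_mean, acc_std) ~ subject + activity,
                              data = one_subject, FUN = mean)
}

# Assumption: the per-subject results are bound into one data frame here;
# the list could also be kept as is.
combined <- do.call(rbind, new_data)
```

With real data, the measurement columns would not be listed by hand; a formula such as `. ~ subject + activity` on a data frame holding only those columns is one alternative.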
Our list is stored in dataset_2.