-
Notifications
You must be signed in to change notification settings - Fork 3
Expand file tree
/
Copy pathSolution_Rmarkdown.Rmd
More file actions
119 lines (75 loc) · 3.12 KB
/
Solution_Rmarkdown.Rmd
File metadata and controls
119 lines (75 loc) · 3.12 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
---
title: "Solution Rmarkdown"
author: "Sten Willemsen"
date: "2024-06-06"
output:
html_document:
code_folding: hide
---
## Setup
```{r setup}
dat <- read.csv("Data/R_data2.csv")
```
## Introduction
## Data
We perform the following data transformation steps:
* We define the variable `pregnancy_length` by adding together.
* We define the variable `BMI_cat` by dividing the variable BMI into categories: <18.5 ("Underweight"), 18.5 - 24.9 ("Healthy weight"), 25 - 29.9 ("Overweight"), and >30 (Obesity).
* We log transform `homocysteine` and `vitaminB12`.
* We transform the categorical variables to factors.
* We remove the original variables `pregnancy_length_weeks`, `pregnancy_length_days`, `BMI` and `homocysteine` and `vitaminB12` from the data set.
```{r transformations, echo=FALSE, message=FALSE}
dat$pregnancy_length <- dat$pregnancy_length_weeks *7 +
dat$pregnancy_length_days
dat$BMI_cat <- cut(dat$BMI, breaks=c(-Inf, 18.5, 24.9, 29.9, Inf),
labels=c("Underweight", "Healthy weight", "Overweight", "Obesity"))
dat$log_homocysteine <- log(dat$homocysteine)
dat$log_vitaminB12 <- log(dat$vitaminB12)
for(i in 1:length(names(dat))){
if(class(dat[[i]]) == "character"){
dat[[i]] <- as.factor(dat[[i]])
}
}
dat$Status <- factor(dat$Status,
levels = c("normal brain development", "intellectual disability"))
dat <- dat[ , !(names(dat) %in% c("pregnancy_length_weeks", "pregnancy_length_days", "BMI", "homocysteine", "vitaminB12"))]
```
## Analysis and Results
We show descriptives of all variables in the data set.
### Descriptives
```{r descriptives}
for(i in 1:length(names(dat))){
if(class(dat[[i]]) == "numeric"){
print(paste("Mean of", names(dat)[i], ":", mean(dat[[i]])))
print(paste("Standard deviation of", names(dat)[i], ":", sd(dat[[i]])))
} else if(class(dat[[i]]) == "factor"){
print(paste("Frequency of", names(dat)[i], ":"))
print(table(dat[[i]]))
}
}
```
For the continuous ones we also make a histogram.
```{r}
for(i in 1:length(names(dat))){
if(class(dat[[i]]) == "numeric"){
hist(dat[[i]], main = paste("Histogram of", names(dat)[i]))
}
}
```
### Unadjusted Analysis
We compare the mean of the logarithm of the Vitamin B12 for the two levels of `Status` (normal brain development or intellectual disability).
```{r}
t.test(log_vitaminB12 ~ Status,data = dat)
```
## Adjusted analysis
We now perform logistic regression analysis to investigate the association between `Status` and log `Vitamin B12` while adjusting for `medication`, `smoking` and `alcohol`.
```{r}
glm1_adjusted <- glm(Status ~ log_vitaminB12 + medication + smoking + alcohol, data = dat, family = binomial)
summary(glm1_adjusted)
coef(glm1_adjusted)
confint(glm1_adjusted)
```
## Conclusion and Discussion
**Main points:**
* In the unadjusted analysis, we could not show that the mean of the logarithm of the Vitamin B12 is significantly different for the two levels of `Status`.
* In the adjusted analysis, we found that the log `Vitamin B12` is not significantly associated with `Status` while adjusting for `medication`, `smoking` and `alcohol`.