-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathREADME.Rmd
More file actions
134 lines (96 loc) · 3.19 KB
/
README.Rmd
File metadata and controls
134 lines (96 loc) · 3.19 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# ZIPG
<!-- badges: start -->
<!-- badges: end -->
We provide R code for Zero-inflated Poisson-Gamma Model (ZIPG) with an application to longitudinal microbiome count data.
## Installation
You can install the development version of ZIPG like so:
``` r
devtools::install_github("roulan2000/ZIPG")
```
## Example
### Load Data
Load dietary data. Complete Dietary data can be found in "Daily sampling reveals personalized diet-microbiome associations in humans." (Johnson et al. 2019)
```{r}
library(ZIPG)
library(ggplot2)
data("Dietary")
dat = Dietary
taxa_num = 100
dat$taxa_name[taxa_num] # taxa name
W = dat$OTU[,taxa_num] # taxa count
M = dat$M # sequencing depth
ggplot(NULL)+
geom_boxplot(aes(
x = as.factor(dat$COV$ALC01),y=log((W+0.5)/M)))+
labs(title = dat$taxa_name[taxa_num],
x = 'ALC',y='Relative Abundance')+
theme_bw()
```
## ZIPG Wald
Use function `ZIPG_main()` to run our ZIPG model.
**Input :**
`W` : Observed taxa count data.
`M` : Sequencing depth, ZIPG use log(M) as offset by default.
`X`, `X_star` : Covariates of interesting of differential abundance and differential varibility, input as formula.
**Output list:**
`ZIPG_res$init` : pscl results, used as initialization.
`ZIPG_res$res` : ZIPG output evaluated at last EM iteration.
`ZIPG_res$res$par` : ZIPG estimation for $\Omega = (\beta,\beta^*,\gamma)$.
`ZIPG_res$wald_test` : ZIPG Wald test
`ZIPG_res$logli` : ZIPG log-likelihood
```{r}
ZIPG_res <- ZIPG_main(data = dat$COV,
X = ~ALC01+nutrPC1+nutrPC2, X_star = ~ ALC01,
W = W, M = M)
res = ZIPG_summary(ZIPG_res)
```
## ZIPG bWald
Set the bootstrap replicates `B` in `bWald_list` to conduct ZIPG-bWald, results and covariance matrix can be find in `ZIPG_res$bWald`.
```{r}
set.seed(123)
# Set bootstrap replicates B
bWald_list = list(B = 100)
# Wait for a wile
ZIPG_res1 = ZIPG_main(
data = dat$COV,
X = ~ALC01+nutrPC1+nutrPC2, X_star = ~ ALC01,
W = W, M = M,
bWald_list = bWald_list)
res = ZIPG_summary(ZIPG_res1,type = 'bWald')
res = ZIPG_CI(ZIPG_res1,type='bWald',alpha = 0.05)
```
To test more complicated hypothesis, you may use the covariance matirx driven from bootstrap.
```{r}
round(ZIPG_res1$bWald$vcov,3)
```
## ZIPG pbWald
Set bootstrap replicates `B` and the null hypothesis by formula `X0` and `X_star0` in `pbWald_list` to conduct ZIPG-pbWald, results can be find in ZIPG_res\$pbWald
```{r}
# test beta1star, the 6th parameter
#
pbWald_list = list(
X0 = ~ALC01 + nutrPC1+nutrPC2,
X_star0 = ~ 1,
B = 100
)
ZIPG_res2 = ZIPG_main(
data = dat$COV,
X = ~ALC01+nutrPC1+nutrPC2, X_star = ~ ALC01,
W = W, M = M,
pbWald_list= pbWald_list)
res = ZIPG_summary(ZIPG_res2,type ='pbWald')
```
# References
* Jiang, Roulan, Xiang Zhan, and Tianying Wang. ["A Flexible Zero-Inflated Poisson-Gamma Model with Application to Microbiome Sequence Count Data."](https://doi.org/10.1080/01621459.2022.2151447) Journal of the American Statistical Association (2023): 1-13.