Hi @vals!
Thanks for the really nice post about modeling single cell compositional data (201127-cell-count-glm)!
I have a small question regarding the choice of model for this type of count data.
I used your standard binomial GLM with a logit link to compare cell type compositions between two conditions, and the results match what I see when simply plotting the relative cell type abundances in these two conditions:

I then tried to repeat the analysis with a negative binomial GLM, using the glm.nb() function from the MASS package with a log link:

The exact function call is
    library(MASS)
    formula = count ~ cluster * time
    model2 <- glm.nb(formula = formula, link = "log", data = df)
where the df data frame has the same structure as the one posted in your example.
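To make the comparison concrete, here is a minimal sketch of the two fits side by side. It runs on simulated toy data, not my real counts, and the `replicate`, `sample`, `total` and `other` columns are assumptions I introduce for illustration. The binomial call is my reading of the approach in the post, not necessarily the exact code used there.

```r
library(MASS)  # provides glm.nb()

set.seed(1)

# Hypothetical toy data: 3 clusters x 2 time points x 4 replicate samples.
df <- expand.grid(
  cluster = factor(c("c1", "c2", "c3")),
  time = factor(c("t0", "t1")),
  replicate = 1:4
)
df$count <- rnbinom(nrow(df), mu = 200, size = 5)

# Total cells per sample, so each row has "cells in cluster" vs. "all other cells".
df$sample <- interaction(df$time, df$replicate)
df$total <- ave(df$count, df$sample, FUN = sum)
df$other <- df$total - df$count

# Binomial GLM with logit link: coefficients are log odds ratios.
model1 <- glm(cbind(count, other) ~ cluster * time,
              family = binomial(link = "logit"), data = df)

# Negative binomial GLM with log link: coefficients are log rate ratios,
# and an extra dispersion parameter (theta) is estimated from the data.
model2 <- glm.nb(count ~ cluster * time, link = "log", data = df)

# Put the standard errors of the matching coefficients next to each other.
se <- cbind(
  binomial = summary(model1)$coefficients[, "Std. Error"],
  negbin = summary(model2)$coefficients[, "Std. Error"]
)
print(se)
```

Note that the two models do not estimate quite the same quantity: the logit-link coefficients are on the log odds scale, while the log-link coefficients are on the log count (rate) scale, and only the negative binomial model has a dispersion parameter.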
The mean odds ratio estimates are the same in both models, but the negative binomial model shows a much larger spread around them. Do you have an intuitive explanation for why the binomial GLM would be the better choice, or vice versa?
Thanks a lot!
//Angelo