Commit 7d313ad

Update README.md
1 parent 10c4f93 commit 7d313ad

1 file changed

Lines changed: 2 additions & 2 deletions

README.md

@@ -1,12 +1,12 @@
1
# statgen-advanced
2

3-
These notes are for trainees with quantitative backgrounds but without formal training in statistical genetics, who have encountered these methods in the literature but have not yet worked with them hands-on. For statisticians wanting to catch up on genetics applications, these notes provide the conceptual foundations and key assumptions that geneticists make when modeling data.
3+
These notes are for trainees with quantitative backgrounds but without formal training in statistical genetics, who have encountered these methods in the literature but have not yet worked with them hands-on. For statisticians wanting to catch up on genetics applications, these notes also provide essential conceptual foundations and key assumptions that geneticists make when modeling data.
4

5
These notes are not organized by method, by paper, or by software tool. Instead, we organize by scientific question. For each question, we focus on what problem we are trying to solve, what assumptions we are making, and what generative model most naturally describes how the data arise. Once these foundations are clear, existing methods become natural solutions, and their limitations become obvious.
6

7
Think of it like building a Lego model to represent something in the real world. The statistical building blocks (likelihoods, priors, latent variables, hierarchical structures) are the pieces available. Our goal is to focus on designing the blueprint that captures the essential features of the biological reality, while keeping the available blocks in mind. The details of assembling specific kits will inevitably be discussed, but they are not the focus. When one understands what the design requires and what connections matter, one will know how to select and combine blocks to satisfy those requirements. With this foundation, one can read new methods papers and recognize the same underlying ideas, and feel comfortable adapting or extending existing approaches for new problems.
8

9-
As an example, consider allele-specific expression (ASE) QTL analysis. Total expression reflects the sum of transcripts from both haplotypes; ASE measures their difference within heterozygotes. The same genetic effect parameter underlies both, appearing as dosage effect $(0, 1, 2)$ in total expression and haplotype difference $(-1, 0, +1)$ in ASE. Because sum and difference are conditionally independent, ASE adds information about genetic effects beyond total expression from the same samples, effectively increasing sample size. The within-individual comparison also cancels individual-level confounders (which affect both haplotypes equally), and haplotype difference in ASE provides different correlation (LD) patterns than conventional genotype dosage, thus improving fine-mapping resolution. These advantages motivate incorporating ASE into QTL analysis. [RASQUAL](https://www.nature.com/articles/ng.3467) implemented a rigorous generative model with Negative Binomial total counts and Beta-Binomial allele-specific counts sharing genetic effect parameters; [mixQTL](https://www.nature.com/articles/s41467-021-21592-8) later achieved scalability through Gaussian approximations and [WASP](https://github.com/bmvdgeijn/WASP) preprocessing, trading some modeling rigor for computational efficiency suitable for large-scale analysis. One can extend this framework further by adding local ancestry modeling and fine-mapping, following the same approach to motivation and generative modeling.
9+
As an example, consider allele-specific expression (ASE) QTL analysis. Total expression reflects the sum of transcripts from both haplotypes; ASE measures their difference within heterozygotes. The same genetic effect parameter underlies both, appearing as dosage effect $(0, 1, 2)$ in total expression and haplotype difference $(-1, 0, +1)$ in ASE. Because sum and difference are conditionally independent, ASE adds information about genetic effects beyond total expression from the same samples, effectively increasing sample size. The within-individual comparison also cancels individual-level confounders (which affect both haplotypes equally), and haplotype difference in ASE provides different correlation (LD) patterns than conventional genotype dosage, thus improving fine-mapping resolution. These advantages motivate incorporating ASE into QTL analysis. [RASQUAL](https://www.nature.com/articles/ng.3467) implemented a rigorous generative model with Negative Binomial total counts and Beta-Binomial allele-specific counts sharing genetic effect parameters; [mixQTL](https://www.nature.com/articles/s41467-021-21592-8) later achieved scalability through Gaussian approximations and [WASP](https://github.com/bmvdgeijn/WASP) preprocessing, trading some modeling rigor for computational efficiency suitable for large-scale analysis. One can extend this framework further by adding local ancestry modeling and fine-mapping under different likelihoods, following the same approach to motivation and generative modeling.
10

11
## Overview of Topics
12
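
The ASE paragraph changed in this commit (line 9 of the README) argues that total expression and the haplotype difference share one genetic effect, that an individual-level confounder cancels in the within-individual difference, and that the sum and difference carry separate information. The following is a minimal sketch of that argument under a toy Gaussian model with equal noise variance on both haplotypes. It is not the RASQUAL or mixQTL model, and every name and parameter value in it (`simulate_haplotype_expression`, `ols_slope`, `beta`, `maf`, the noise scales) is an assumption made for illustration.

```python
import numpy as np


def simulate_haplotype_expression(n=2000, beta=0.5, maf=0.3, seed=0):
    """Toy model: each haplotype's expression depends on its own allele."""
    rng = np.random.default_rng(seed)
    a1 = rng.binomial(1, maf, size=n)  # allele carried on haplotype 1
    a2 = rng.binomial(1, maf, size=n)  # allele carried on haplotype 2
    # Individual-level confounder: shifts both haplotypes by the same amount.
    confounder = rng.normal(0.0, 1.0, size=n)
    e1 = rng.normal(0.0, 0.5, size=n)  # haplotype-specific noise
    e2 = rng.normal(0.0, 0.5, size=n)
    y1 = 1.0 + beta * a1 + 0.5 * confounder + e1
    y2 = 1.0 + beta * a2 + 0.5 * confounder + e2
    return a1, a2, y1, y2


def ols_slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


a1, a2, y1, y2 = simulate_haplotype_expression()

dosage = a1 + a2     # coded 0, 1, 2: predictor for total expression
hap_diff = a1 - a2   # coded -1, 0, +1: nonzero only in heterozygotes

total = y1 + y2      # sum of the two haplotypes (still contains the confounder)
ase = y1 - y2        # within-individual difference (confounder cancels)

beta_total = ols_slope(dosage, total)
beta_ase = ols_slope(hap_diff, ase)

# Residuals of the two regressions; under equal haplotype noise variance
# the sum and difference of the noise terms are uncorrelated.
resid_total = (total - total.mean()) - beta_total * (dosage - dosage.mean())
resid_ase = (ase - ase.mean()) - beta_ase * (hap_diff - hap_diff.mean())
resid_corr = float(np.corrcoef(resid_total, resid_ase)[0, 1])

print("effect estimated from total expression:", beta_total)
print("effect estimated from ASE difference:  ", beta_ase)
print("correlation of the two residuals:      ", resid_corr)
```

Both regressions target the same `beta`. In this toy setup the ASE estimate is less noisy because the shared confounder drops out of `y1 - y2`, and the near-zero residual correlation is what allows the two estimates to be combined as separate pieces of evidence. Count-based models such as the Negative Binomial and Beta-Binomial likelihoods named in the README follow the same sum-and-difference logic but are not reproduced here.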
