Commit 36688ea

Use simpler prior by default
1 parent 612111f commit 36688ea

3 files changed

Lines changed: 4 additions & 8 deletions

code/cis_analysis/cis_workhorse.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1014,7 +1014,7 @@
  "parameter: pip_cutoff = 0.05\n",
  "parameter: coverage = [0.95, 0.7, 0.5]\n",
  "# prior can be either of [\"mixture_normal\", \"mixture_normal_per_scale\"]\n",
- "parameter: prior = \"mixture_normal_per_scale\"\n",
+ "parameter: prior = \"mixture_normal\"\n",
  "parameter: max_SNP_EM = 100\n",
  "# Max scale is such that 2^max_scale being the number of phenotypes in the transformed space. Default to 2^10 = 1024. Don't change it unless you know what you are doing. Max_scale should be at least larger than 5.\n",
  "parameter: max_scale = 10\n",

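The constraints stated in the parameter comments of this hunk can be sketched as a small check. This is an illustrative helper only: `check_parameters` and `ALLOWED_PRIORS` are hypothetical names that do not appear in the notebook, and reading "at least larger than 5" as strictly greater than 5 is an assumption.

```python
# Hypothetical helper, not part of cis_workhorse.ipynb: it only mirrors the
# constraints described in the notebook's parameter comments.
ALLOWED_PRIORS = ("mixture_normal", "mixture_normal_per_scale")


def check_parameters(prior="mixture_normal", max_scale=10):
    """Validate prior and max_scale; return the number of phenotypes
    in the transformed space, 2**max_scale."""
    if prior not in ALLOWED_PRIORS:
        raise ValueError(f"prior must be one of {ALLOWED_PRIORS}, got {prior!r}")
    # Assumption: "at least larger than 5" is read as strictly greater than 5.
    if max_scale <= 5:
        raise ValueError("max_scale should be larger than 5")
    return 2 ** max_scale


if __name__ == "__main__":
    # With the new defaults from this commit: 2^10 = 1024 phenotypes.
    print(check_parameters())
```

With the defaults after this commit (`prior = "mixture_normal"`, `max_scale = 10`), the transformed space has 2^10 = 1024 phenotypes, matching the comment in the notebook.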
pipeline/command_spliter.ipynb

Lines changed: 0 additions & 1 deletion
This file was deleted.

website/nature_protocol/output_markdown.md

Lines changed: 3 additions & 6 deletions
@@ -99,8 +99,6 @@ Quality control and normalization are performed on output from the leafcutter an
  We use a gene coordinate annotation pipeline based on [`pyqtl`, as demonstrated here](https://github.com/broadinstitute/gtex-pipeline/blob/master/qtl/src/eqtl_prepare_expression.py). This adds genomic coordinate annotations to gene-level molecular phenotype files generated in `gct` format and converts them to `bed` format for downstreams analysis.


- A collection of methods for the imputation of missing omics data values are included in our pipelinle. Imputation is optional of eQTL analysis, but necessary for other QTLs. We use `flashier`, a Empirical Bayes Matrix Factorization model, to impute missing values. Other imputation methods include missForest, XGBoost, k-nearest neighbors, soft impute, mean imputation, and last observed data.
-
  We include a collection of workflows to format molecular phenotype data. These include workflows to separate phenotypes by chromosome, by user-provided regions, a workflow to subset bam files and a workflow to extract samples from phenotype files.

  ##### B. Covariate Data Preprocessing
@@ -306,12 +304,11 @@ Timing <1 min
  ```


- ##### ii. Phenotype Imputation
- Timing X min

  ```
- sos run xqtl-pipeline/pipeline/phenotype_imputation.ipynb flash \
- --container oras://ghcr.io/cumc/omics_imputation_apptainer:latest
+ sos run phenotype_imputation.ipynb EBMF \
+ --container .containers/factor_analysis.sif \
+
  ```

