GitHub - ChristophH/Inferelator: The official Inferelator repository maintained by current or former Bonneau lab members

Call the inferelator script from the base directory (the one containing this 
README) with a job config file as argument. 

Example call: Rscript inferelator.R jobs/dream4_cfg.R



--------------------------------------------------------------------------------
Default parameters and a brief explanation of each one
--------------------------------------------------------------------------------

PARS$input.dir <- 'input/dream4'  # path to the input files

PARS$exp.mat.file <- 'expression.tsv'  # required; see definition below
PARS$tf.names.file <- 'tf_names.tsv'  # required; see definition below
PARS$meta.data.file <- 'meta_data.tsv'  # assume all steady state if NULL
PARS$priors.file <- 'gold_standard.tsv'  # no priors if NULL
PARS$gold.standard.file <- 'gold_standard.tsv'  # no evaluation if NULL
PARS$leave.out.file <- NULL  # file with list of conditions that will be ignored
PARS$randomize.expression <- FALSE  # whether to scramble input expression

PARS$job.seed <- 42  # random seed; can be NULL
PARS$save.to.dir <- file.path(PARS$input.dir, date.time.str)  # output directory
PARS$num.boots <- 20  # number of bootstraps; no bootstrapping with a value of 1
PARS$max.preds <- 10  # max number of predictors based on CLR to pass to model
                      # selection method
PARS$mi.bins <- 10  # number of bins to use for mutual information calculation
PARS$cores <- 8  # number of cpu cores

PARS$delT.max <- 110  # max number of time units allowed between time series 
                      # conditions
PARS$delT.min <- 0  # min number of time units allowed between time series 
                    # conditions
PARS$tau <- 45  # constant related to half life of mRNA (see Core model)

PARS$perc.tp <- 0  # percent of true priors that will be used; can be vector
PARS$perm.tp <- 1  # number of permutations of true priors
PARS$perc.fp <- 0  # percent of false priors (100 = as many false priors as 
                   # there are true priors); can be vector
PARS$perm.fp <- 1  # number of permutations of false priors
PARS$pr.sel.mode <- 'random'  # prior selection mode: 'random' or 'tf'
                              # if 'random', the true priors are randomly chosen
                              # from all prior edges; if 'tf',
                              # PARS$perc.tp is interpreted as the percent of
                              # TFs to use for true priors and all interactions
                              # for the chosen TFs will be used

PARS$eval.on.subset <- FALSE  # whether to evaluate only on the part of the 
                              # network that has connections in the gold
                              # standard; if TRUE false priors will only be 
                              # drawn from that part of the network

PARS$method <- 'BBSR'  # which method to use; either 'MEN' or 'BBSR'
PARS$prior.weight <- 1  # the weight for the priors; has to be larger than 1
                        # for priors to have an effect

PARS$use.tfa <- FALSE  # whether to estimate transcription factor activities and
                       # use those in the regression models
                       # if TRUE, interactions in priors file should be signed,
                       # i.e. -1 for repression and +1 for activation
PARS$prior.ss <- FALSE # whether to also sub-sample from the prior matrix during
                       # each bootstrap; if TRUE, priors are sampled randomly with 
                       # replacement; if FALSE, all priors are used as is

PARS$output.summary <- TRUE  # write a summary tsv and RData file of network

PARS$output.report <- TRUE  # create html network report

PARS$output.tf.plots <- TRUE  # create png files with plots of TFs and targets
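
A job config file is an R script. The following is a minimal sketch, assuming
that inferelator.R has already set the defaults above in a list named PARS
before it reads the config, so that the config only overrides what differs.
The stand-in PARS list, the directory input/my_data and the file name
jobs/my_cfg.R are hypothetical and not part of this repository.

```r
# Stand-in for the defaults that inferelator.R is assumed to have set already
# (only a few of them are repeated here so the sketch runs on its own).
PARS <- list(input.dir = 'input/dream4', num.boots = 20, method = 'BBSR',
             prior.weight = 1, cores = 8)

# Contents of a hypothetical job config, e.g. jobs/my_cfg.R:
PARS$input.dir <- 'input/my_data'   # directory holding the input files
PARS$priors.file <- 'priors.tsv'    # use a separate priors file
PARS$num.boots <- 50                # more bootstraps than the default
PARS$prior.weight <- 1.5            # larger than 1 so priors have an effect
PARS$cores <- 4                     # fewer cpu cores than the default

# Show the values that would be used for this job.
str(PARS[c('input.dir', 'num.boots', 'prior.weight', 'cores')])
```

Such a file would then be passed to the script in the same way as the example
call above: Rscript inferelator.R jobs/my_cfg.R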

--------------------------------------------------------------------------------
Required Input Files
--------------------------------------------------------------------------------

expression.tsv
--------------
expression values; must include row (genes) and column (conditions) names

tf_names.tsv
------------
one TF name on each line; must be subset of the row names of the expression data
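
The two required files can be written from R. This is a sketch with made-up
gene and condition names and random values; the header layout chosen here
(an empty first header cell above the gene names) is an assumption, so compare
against the files in input/dream4 before relying on it.

```r
set.seed(1)

# Toy expression matrix: 5 genes (rows) by 4 conditions (columns).
genes <- paste0('G', 1:5)
conds <- paste0('cond', 1:4)
expr <- matrix(round(rnorm(20), 2), nrow = 5, dimnames = list(genes, conds))

# TFs must be a subset of the gene names.
tf.names <- c('G1', 'G2')

# Write into a temporary directory (hypothetical input directory).
in.dir <- file.path(tempdir(), 'my_data')
dir.create(in.dir, showWarnings = FALSE)
write.table(expr, file.path(in.dir, 'expression.tsv'), sep = '\t',
            quote = FALSE, col.names = NA)
writeLines(tf.names, file.path(in.dir, 'tf_names.tsv'))

# Read both files back and check the subset requirement.
expr.check <- as.matrix(read.table(file.path(in.dir, 'expression.tsv'),
                                   sep = '\t', header = TRUE, row.names = 1,
                                   check.names = FALSE))
tf.check <- readLines(file.path(in.dir, 'tf_names.tsv'))
stopifnot(all(tf.check %in% rownames(expr.check)))
```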



--------------------------------------------------------------------------------
Optional Input Files
--------------------------------------------------------------------------------

meta_data.tsv
-------------
the meta data describing the conditions; must include column names;
has five columns:
isTs: TRUE if the condition is part of a time-series, FALSE otherwise
is1stLast: "e" if not part of a time-series; "f" if first; "m" middle; "l" last
prevCol: name of the preceding condition in time-series; NA if "e" or "f"
del.t: time in minutes since prevCol; NA if "e" or "f"
condName: name of the condition
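
As an illustration of the five columns, the sketch below builds the meta data
for one steady-state condition and one three-point time series. The condition
names and time intervals are made up, and writing the file without quotes is
an assumption; compare against input/dream4/meta_data.tsv for the exact layout.

```r
# One steady-state condition ('e') and a time series with first ('f'),
# middle ('m') and last ('l') conditions, measured at 0, 10 and 30 minutes.
meta <- data.frame(
  isTs      = c(FALSE, TRUE, TRUE, TRUE),
  is1stLast = c('e', 'f', 'm', 'l'),
  prevCol   = c(NA, NA, 'ts_t0', 'ts_t10'),
  del.t     = c(NA, NA, 10, 20),
  condName  = c('ss1', 'ts_t0', 'ts_t10', 'ts_t30'),
  stringsAsFactors = FALSE
)

meta.file <- file.path(tempdir(), 'meta_data.tsv')
write.table(meta, meta.file, sep = '\t', quote = FALSE, row.names = FALSE)

# Consistency checks that follow from the column definitions:
# 'e' and 'f' rows have no predecessor, all other rows point at a known one.
no.prev <- meta$is1stLast %in% c('e', 'f')
stopifnot(all(is.na(meta$prevCol[no.prev])),
          all(is.na(meta$del.t[no.prev])),
          all(meta$prevCol[!no.prev] %in% meta$condName))
```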

priors.tsv
----------
matrix of 0 and 1 indicating whether there is prior knowledge of an
interaction between a TF and a gene; one row for each gene, one column for
each TF; must include row (genes) and column (TF) names

gold_standard.tsv
-----------------
needed for validation; matrix of 0 and 1 indicating whether there is an 
interaction between one TF and a gene; one row for each gene, one column for 
each TF; must include row (genes) and column (TF) names
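
The priors and gold standard files share the same layout. A minimal sketch
with made-up gene and TF names follows; the header layout written here is an
assumption and should be checked against input/dream4/gold_standard.tsv.

```r
genes <- paste0('G', 1:5)
tfs <- c('G1', 'G2')

# Genes in rows, TFs in columns; 1 marks a known TF -> gene interaction.
gold <- matrix(0, nrow = length(genes), ncol = length(tfs),
               dimnames = list(genes, tfs))
gold['G3', 'G1'] <- 1
gold['G4', 'G1'] <- 1
gold['G5', 'G2'] <- 1

gold.file <- file.path(tempdir(), 'gold_standard.tsv')
write.table(gold, gold.file, sep = '\t', quote = FALSE, col.names = NA)

print(gold)
```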



--------------------------------------------------------------------------------
Output Files
--------------------------------------------------------------------------------

One or more betas_frac_tp_X_perm_X--frac_fp_X_perm_X_X.RData files, one file
per combination of true priors, false priors and prior weight. Each RData file
contains two lists of length PARS$num.boots: one holds a matrix of betas per
bootstrap, the other a matrix of confidence scores (rescaled betas).

One or more combinedconf_frac_tp_X_perm_X--frac_fp_X_perm_X_X.RData files with
one matrix each. The matrix is the rank-combined version of the confidence
scores of all bootstraps.
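
The exact combination rule is not spelled out here. As an illustration only,
the sketch below assumes one simple scheme: rank the confidence scores within
each bootstrap and average the ranks across bootstraps. The random matrices
stand in for real confidence scores, and this is not necessarily what
inferelator.R does.

```r
set.seed(42)
genes <- paste0('G', 1:5)
tfs <- c('G1', 'G2')

# Stand-in for the per-bootstrap confidence score matrices (3 bootstraps).
conf.list <- lapply(1:3, function(i) {
  matrix(runif(10), nrow = 5, dimnames = list(genes, tfs))
})

# Rank all entries within each bootstrap (higher score -> higher rank).
rank.list <- lapply(conf.list, function(m) {
  matrix(rank(m), nrow = nrow(m), dimnames = dimnames(m))
})

# Average the ranks across bootstraps to get one combined matrix.
combined <- Reduce(`+`, rank.list) / length(rank.list)
print(round(combined, 2))
```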

A params_and_input.RData file with data objects holding the user-set parameters,
the input, and input-derived objects.
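
The names of the objects stored in the RData files are not listed here, so a
safe way to inspect one is to load it into a separate environment and list
what it contains. In this sketch a stand-in file with a single PARS object is
created first so that the code runs on its own; for a real run, point out.file
at a file in PARS$save.to.dir instead.

```r
# Create a stand-in output file (hypothetical content).
out.file <- file.path(tempdir(), 'params_and_input.RData')
PARS <- list(num.boots = 20, method = 'BBSR')
save(PARS, file = out.file)

# Load into a fresh environment so nothing in the workspace is overwritten.
res <- new.env()
loaded <- load(out.file, envir = res)

print(loaded)      # names of the objects stored in the file
str(res$PARS)      # inspect one of them
```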
