Clade Statistics
Mathias Witte Paz edited this page Sep 30, 2022 · 2 revisions
Evidente allows the user to select a clade of the phylogenetic tree and run an enrichment analysis that identifies enriched GO-terms linked to genes with SNPs. Further statistical measures will be added in the future.
This functionality requires the user to upload a genome annotation file and a GO-term list (see Input files for more information).
There are two ways of running an enrichment analysis.
- By selecting a clade: Any clade in the phylogenetic tree can be selected at its root to run an enrichment analysis on it ("compute statistics on clade"). For the moment, only a Gene Ontology enrichment is available. When this computation is started, Evidente extracts all GO-terms that might be affected by a point mutation within this clade, i.e. all GO-terms associated with genes that carry a SNP. Since the GO-terms are part of a hierarchical ontology, the tool GOATOOLS is used to retrieve all ancestors of these terms. For every GO-term, a confusion matrix like the one shown next is built and a right-sided Fisher's exact test is computed. After all GO-terms have been tested, a Bonferroni correction is applied to account for multiple testing: each p-value is multiplied by the number of tested GO-terms (equivalently, the significance threshold is divided by that number).

- By running an automatic analysis: To find clades with enriched GO-terms, the user opens the menu "Statistical Analysis". Only clades with at least one supporting SNP are tested. Each clade is then tested as explained in the first option above.
The results are visualized in a pop-up modal. Here, the user can select an enriched GO-term and visualize the SNPs found within genes associated with that term. Furthermore, a list of enriched GO-terms with their p-values can be downloaded as a CSV file.
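The per-term test and the correction described above can be illustrated with a minimal sketch. It is not Evidente's actual code: the function names, the term labels `term_A` and `term_B`, the counts, and the layout of the 2x2 table (rows: SNPs inside vs. outside the clade; columns: linked vs. not linked to the GO-term) are all assumptions made for illustration. The right-sided p-value is computed directly from the hypergeometric distribution using only the standard library.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """Right-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration only):
        a = SNPs in the clade linked to the GO-term
        b = SNPs in the clade not linked to the GO-term
        c = SNPs outside the clade linked to the GO-term
        d = SNPs outside the clade not linked to the GO-term
    Returns P(X >= a) with the table margins held fixed.
    """
    row1 = a + b              # all SNPs in the clade
    col1 = a + c              # all SNPs linked to the GO-term
    n = a + b + c + d         # all SNPs
    upper = min(row1, col1)   # largest value the top-left cell can take
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {term: min(1.0, p * m) for term, p in p_values.items()}


# Made-up counts for two hypothetical GO-terms.
tables = {
    "term_A": (3, 1, 1, 3),
    "term_B": (4, 0, 0, 4),
}
raw = {term: fisher_right_tail(*cells) for term, cells in tables.items()}
adjusted = bonferroni(raw)

# Terms that remain significant at an assumed threshold of 0.05.
enriched = [term for term, p in adjusted.items() if p < 0.05]
```

With these made-up counts, only `term_B` stays below the 0.05 threshold after correction. Capping the adjusted value at 1 keeps it interpretable as a probability.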