Continuing from #1654
After reviewing the information Michael Stadler sent to Val, it sounds like our problem is that simply pattern-matching the consensus sequences against the genome is too restrictive to capture all the binding sites (which makes me wonder what their consensus sequences represent in the first place, but I'm not an expert).
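To make the limitation concrete, here is a minimal sketch of what strict consensus matching amounts to, assuming the consensus is written with IUPAC codes and both strands are searched. The function names and the toy motif are hypothetical, not taken from our pipeline or from their data.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def consensus_hits(consensus, sequence):
    """Return sorted (start, end, strand) tuples, 0-based half-open,
    for every window that matches the consensus exactly."""
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # The lookahead lets overlapping matches be reported too.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
        for m in pattern.finditer(sequence):
            hits.append((m.start(), m.start() + len(motif), strand))
    return sorted(hits)


# Toy example (hypothetical motif): one hit per strand.
print(consensus_hits("CCAGC", "AACCAGCAAGCTGGTT"))
```

A window either matches or it does not, so a site that differs from the consensus at a single weakly conserved position is silently dropped. A palindromic consensus would also be reported once per strand with this sketch.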
They got their hit counts, which differ from ours, using an R package that I must admit I don't fully understand (the sequence scanning functionality is documented here). From what I can tell, though, it doesn't seem to report the positions at which it detects the hits, only the number of hits for each motif.
The methods section of Merle's paper only says: "The universalmotif package version 1.16.0 was used to calculate motif similarities (compare_motifs function with parameters method = "PCC", tryRC=TRUE, min.overlap=4, min.mean.ic=0.25, normalise.scores=TRUE) and to scan sequences for motif hits (scan_sequences function with parameters threshold=1e-4, threshold.type="pvalue", RC=TRUE)."
One alternative they suggested is for us to use the position frequency matrices rather than the consensus sequences.
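As I understand it, matrix-based scanning scores every window instead of requiring an exact match. Below is a rough sketch of the idea, assuming a uniform background, a pseudocount of 1 and a raw log-odds cutoff. Note that the paper used a p-value threshold instead, which this sketch does not reproduce, and the matrix here is a made-up toy, not one of their motifs.

```python
import math

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def pfm_to_log_odds(pfm, pseudocount=1.0, background=0.25):
    """pfm is a list of {'A': n, 'C': n, 'G': n, 'T': n} count dicts,
    one per motif position. Returns log2-odds scores per position."""
    pwm = []
    for column in pfm:
        total = sum(column[b] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(pwm, sequence, min_score):
    """Return sorted (start, end, strand, score) tuples in forward-strand,
    0-based half-open coordinates for windows scoring >= min_score."""
    width, n = len(pwm), len(sequence)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(n - width + 1):
            window = seq[i:i + width]
            if any(b not in "ACGT" for b in window):
                continue  # skip windows containing N or other symbols
            score = sum(pwm[j][b] for j, b in enumerate(window))
            if score >= min_score:
                # Map reverse-strand offsets back to forward coordinates.
                start = i if strand == "+" else n - i - width
                hits.append((start, start + width, strand, score))
    return sorted(hits)


# Toy matrix (hypothetical): strong A, strong C, then mostly G but sometimes T.
toy_pfm = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 2},
]
toy_pwm = pfm_to_log_odds(toy_pfm)
strict = scan(toy_pwm, "TTACGTTACTTT", min_score=4.0)
relaxed = scan(toy_pwm, "TTACGTTACTTT", min_score=3.0)
print(len(strict), len(relaxed))
```

With the higher cutoff only the perfect ACG sites pass, which is roughly what consensus matching gives. Lowering the cutoff also picks up the ACT site, so the hit count depends directly on the threshold chosen.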
I found an online tool that does this, https://epd.expasy.org/pwmtools/pwmscan.php, and it allows downloading the results as a BED file. I tried it with the Ace2 motif and it found 313 hits, nearly three times our previous 109. So maybe we could use this tool to generate BED files for all the motifs and load them into JBrowse...
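If we ever produce hits locally rather than through the web tool, writing them out as BED is straightforward. This is only a sketch: it assumes hits are (start, end, strand, score) tuples in 0-based half-open coordinates, and the chromosome and motif names are placeholders. The BED score column is expected to be an integer from 0 to 1000, so the raw score is rounded and clamped here, which is an arbitrary choice.

```python
def hits_to_bed(hits, chrom, name):
    """Format hits as six-column BED text:
    chrom, start, end, name, score, strand (tab separated)."""
    lines = []
    for start, end, strand, score in sorted(hits):
        # BED scores are integers in [0, 1000]; rounding and clamping
        # the raw score is an assumption, not a standard mapping.
        bed_score = max(0, min(1000, round(score)))
        lines.append(f"{chrom}\t{start}\t{end}\t{name}\t{bed_score}\t{strand}\n")
    return "".join(lines)


# Hypothetical hits on a chromosome called "I".
example = hits_to_bed([(7, 10, "+", 3.08), (2, 5, "+", 4.667)], "I", "toy_motif")
print(example, end="")
```

Keeping the motif name in the fourth column would let all motifs go into one track, or into one file per motif if separate tracks are easier to browse.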