|`*_stats.tab`| Tab-delimited run summary, including statistics such as NSC (normalized strand coefficient), RSC (relative strand coefficient), and VSN (virtual S/N ratio).|
139
+
|`*.pdf`| A multipage figure summarizing the run: cross-correlation curves (naïve and MaSC) versus shift, with the inferred fragment length highlighted.|
140
+
|`*_cc.tab`| Naïve strand cross-correlation coefficients by shift (rows). Columns are `shift` (in bp), `whole` (all chromosomes), followed by per-chromosome values. |
141
+
|`*_mscc.tab`| Mappability-sensitive cross-correlation (MSCC) coefficients by shift with the same layout as `*_cc.tab`. Produced only when a mappability BigWig is supplied. |
142
+
|`*_nreads.tab`| Positive/negative strand read counts, reported as `pos-neg` pairs for `whole` and for each chromosome. The `raw` row reports the number of reads. If mappability is supplied, the numbers of reads in doubly mappable positions at each shift are also reported. |
117
143
118
-
##### --disable-progress
119
-
Disable progress bars.
120
-
Note that progress bar will be disabled automatically if stderr is not connected to terminal.
144
+
Additionally, PyMaSC generates a JSON file as a cache for mappability analyses. See the `--mappability-stats` option for details.
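
The `*_cc.tab` layout described in the table above (a `shift` column, a `whole` column, then one column per chromosome) can be read with the standard library alone. The following is a minimal sketch: it assumes the file starts with a header row naming those columns, and the sample values and the helper names `read_cc_table` and `best_shift` are invented for illustration, not taken from PyMaSC.

```python
import csv
import io

# Hypothetical excerpt of a `*_cc.tab` file (values invented for illustration).
# Assumption: the first row is a header naming `shift`, `whole`, and chromosomes.
EXAMPLE_CC_TAB = (
    "shift\twhole\tchr1\tchr2\n"
    "0\t0.010\t0.011\t0.009\n"
    "1\t0.012\t0.013\t0.011\n"
    "2\t0.015\t0.014\t0.016\n"
    "3\t0.013\t0.012\t0.014\n"
)


def read_cc_table(handle):
    """Return (shifts, {column name: [coefficient per shift]})."""
    reader = csv.DictReader(handle, delimiter="\t")
    columns = {name: [] for name in reader.fieldnames if name != "shift"}
    shifts = []
    for row in reader:
        shifts.append(int(row["shift"]))
        for name in columns:
            columns[name].append(float(row[name]))
    return shifts, columns


def best_shift(shifts, values):
    """Shift with the largest coefficient (the first one wins on ties)."""
    index = max(range(len(values)), key=values.__getitem__)
    return shifts[index]


shifts, columns = read_cc_table(io.StringIO(EXAMPLE_CC_TAB))
print("shift with the highest whole-genome coefficient:",
      best_shift(shifts, columns["whole"]))
```

For a real run, replace the `io.StringIO` object with an open file handle; the same function should apply to `*_mscc.tab`, which the table describes as having the same layout.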
121
145
122
-
##### --color {TRUE,FALSE}
123
-
Switch coloring log output. (Default: auto; enable if stderr is connected to terminal)
|`-v, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}`| Set logging message level. |`INFO`|
151
+
|`--disable-progress`| Disable progress bars. Progress bars are disabled automatically if stderr is not connected to a terminal. | auto |
152
+
|`--color {TRUE,FALSE}`| Enable or disable colored log output. | auto (enabled if stderr is connected to a terminal) |
153
+
|`--version`| Show program's version number and exit ||
128
154
129
155
#### Processing settings
130
156
131
-
##### -p / --process [int]
132
-
Set number of worker process. (Default: 1)
133
-
For indexed BAM file, PyMaSC parallel process each reference (chromosome).
134
-
135
-
#### --successive
136
-
Calc with successive algorithm instead of bitarray implementation (Default: false)
137
-
Bitarray implementation is recommended in most situation. See `Computation details`
138
-
for more information.
139
-
140
-
##### --skip-ncc
141
-
Both `-m/--mappability` and `--skip-ncc` specified, PyMaSC skips calculate naïve cross-correlation
142
-
and calculates only mappability-sensitive cross-correlation. (Default: False)
143
-
144
-
##### --skip-plots
145
-
Skip output figures. (Default: False)
146
-
157
+
| Option & Argument | Description | Default |
158
+
|----------------------|-------------|---------|
159
+
|`-p, --process [int]`| Set the number of worker processes. For an indexed BAM file, PyMaSC processes each reference (chromosome) in parallel. | 1 |
160
+
|`--successive`| Calculate with the successive algorithm instead of the bitarray implementation. The bitarray implementation is recommended in most situations. See `Computation details` for more information. | off (use the bitarray implementation) |
161
+
|`--skip-ncc`| If both `-m/--mappability` and `--skip-ncc` are specified, PyMaSC skips the naïve cross-correlation and calculates only the mappability-sensitive cross-correlation. | off |
162
+
|`--skip-plots`| Skip output figures. | off |
147
163
148
164
#### Input alignment file settings
149
165
150
-
##### -r / --read-length [int]
151
-
Specify read length explicitly. (Default: get representative by scanning)
152
-
PyMaSC needs representative value of read length to plot figures and to calc
153
-
mappability-sensitive cross-correlation. By default, PyMaSC scans input file
154
-
read length to get representative read length. If read length is specified, PyMaSC
155
-
skips this step.
156
-
Note that this option must be specified to treat unseekable input (like stdin).
|`-r, --read-length [int]`| Specify the read length explicitly. PyMaSC needs a representative read length to plot figures and to calculate the mappability-sensitive cross-correlation. By default, PyMaSC scans the input file to obtain it; if a read length is specified, this scan is skipped. This option is required for unseekable input (such as stdin). | auto (estimated by scanning) |
169
+
|`--readlen-estimator {MEAN,MEDIAN,MODE,MIN,MAX}`| Specify how the representative read length is obtained. | median |
170
+
|`-l, --library-length`| Specify the expected fragment length. PyMaSC reports additional NSC and RSC values calculated from this value. | None |
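
To make the `--readlen-estimator` choices concrete, the sketch below maps each choice name to a plain-Python reduction over a list of observed read lengths. The mapping, the rounding of mean and median to whole bases, and the helper name `representative_read_length` are assumptions made for illustration; PyMaSC's own implementation may differ.

```python
import statistics

# Illustrative mapping from each estimator name to a reduction.
# Assumption: mean and median are rounded to a whole number of bases.
ESTIMATORS = {
    "MEAN": lambda lengths: round(statistics.mean(lengths)),
    "MEDIAN": lambda lengths: round(statistics.median(lengths)),
    "MODE": statistics.mode,
    "MIN": min,
    "MAX": max,
}


def representative_read_length(lengths, estimator="MEDIAN"):
    """Reduce observed read lengths to one representative value."""
    if not lengths:
        raise ValueError("no read lengths observed")
    return ESTIMATORS[estimator](lengths)


# Invented read lengths, e.g. a mostly 36 bp library with a few outliers.
read_lengths = [36, 36, 36, 35, 50, 36, 34]
for name in ESTIMATORS:
    print(name, representative_read_length(read_lengths, name))
```

With outliers such as the single 50 bp read above, the median and mode stay at the dominant length while the mean drifts upward, which is one plausible reason for a median default.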
165
171
166
172
#### Input mappability file settings
167
173
168
-
##### -m / --mappability [BigWig file]
169
-
Specify mappability (alignability, uniqueness) track to calculate mappability-sensitive
170
-
cross-correlation.
171
-
Input file must be BigWig format and each track's score should indicate mappability
172
-
in [0, 1] (1 means uniquely mappable position).
173
-
If BigWig file is not supplied, PyMaSC will calculate only naïve cross-correlation.
174
-
175
-
##### --mappability-stats [json file]
176
-
Read and save path to the json file which contains mappability region statistics.
177
-
(Default: same place, same base name as the mappability BigWig file)
178
-
If there is no statistics file for specified BigWig file, PyMaSC calculate total
179
-
length of doubly mappable region for each shift size automatically and save them
180
-
to reuse for next calculation and faster computing.
181
-
`pymasc-precalc` performs this calculation for specified BigWig file (this is not
|`-m, --mappability [BigWig file]`| Specify a mappability (alignability, uniqueness) track used to calculate the mappability-sensitive cross-correlation. The input file must be in BigWig format, and each score should indicate mappability in [0, 1] (1 means a uniquely mappable position). If no BigWig file is supplied, PyMaSC calculates only the naïve cross-correlation. ||
177
+
|`--mappability-stats [json file]`| Path to the JSON file that stores mappability region statistics; PyMaSC reads from and saves to this path. If no statistics file exists for the specified BigWig file, PyMaSC automatically calculates the total length of doubly mappable regions for each shift size and saves the result so that later runs can reuse it and finish faster. `pymasc-precalc` performs this calculation for a specified BigWig file ahead of time (running it is optional). | auto (same directory and base name as the mappability BigWig file) |
184
178
185
179
#### Input file filtering arguments
186
180
187
-
##### -q / --mapq [int]
188
-
Input reads which mapping quality less than specified score will be discarded. (Default: 1)
189
-
MAPQ >= 1 is recommended because MAPQ=0 contains multiple hit reads.
190
-
191
-
##### -i / --include-chrom [pattern ...]
192
-
Specify chromosomes to calculate. Unix shell-style wildcards (`.`, `*`, `[]` and `[!]`)
193
-
are acceptable. This option can be declared multiple times to re-include chromosomes
194
-
specified in a just before -e/--exclude-chrom option. Note that this option is case-sensitive.
195
-
196
-
##### -e / --exclude-chrom [pattern ...]
197
-
As same as the -i/--include-chrom option, specify chromosomes to exclude from calculation.
198
-
This option can be declared multiple times to re-exclude chromosomes specified in
|`-q, --mapq [int]`| Discard input reads whose mapping quality is lower than the specified score. MAPQ >= 1 is recommended because reads with MAPQ=0 include multi-mapped reads. | 1 |
184
+
|`-i, --include-chrom [pattern ...]`| Specify chromosomes to include in the calculation. Unix shell-style wildcards (`.`, `*`, `[]` and `[!]`) are accepted. This option can be given multiple times to re-include chromosomes excluded by an immediately preceding `-e/--exclude-chrom` option. Note that matching is case-sensitive. ||
185
+
|`-e, --exclude-chrom [pattern ...]`| Like `-i/--include-chrom`, but specifies chromosomes to exclude from the calculation. This option can be given multiple times to re-exclude chromosomes included by an immediately preceding `-i/--include-chrom` option. ||
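
The interplay of repeated `-i` and `-e` options is easier to follow with a small model. The sketch below applies include and exclude filters in command-line order using Python's `fnmatch.fnmatchcase`, which is case-sensitive. The semantics encoded here (start from nothing if the first filter is an include, otherwise from every chromosome) and the helper name `select_chromosomes` are assumptions for illustration; PyMaSC's actual matching rules may differ.

```python
from fnmatch import fnmatchcase


def select_chromosomes(chromosomes, filters):
    """Apply (is_include, patterns) filters in the order they were given.

    Assumed semantics (a sketch, not PyMaSC's code): if the first filter is
    an include, start from an empty selection; otherwise start from every
    chromosome. Each filter then adds or removes the names it matches.
    """
    if not filters:
        return list(chromosomes)
    selected = set() if filters[0][0] else set(chromosomes)
    for is_include, patterns in filters:
        matched = {
            name for name in chromosomes
            if any(fnmatchcase(name, pattern) for pattern in patterns)
        }
        if is_include:
            selected |= matched
        else:
            selected -= matched
    # Preserve the original chromosome order in the result.
    return [name for name in chromosomes if name in selected]


chromosomes = ["chr1", "chr2", "chrX", "chrM", "chr1_random", "CHR3"]
# Models: -i 'chr*' -e chrM '*_random' -i chr1_random
filters = [
    (True, ["chr*"]),
    (False, ["chrM", "*_random"]),
    (True, ["chr1_random"]),
]
print(select_chromosomes(chromosomes, filters))
```

In this model `CHR3` is never selected because the `chr*` pattern is matched case-sensitively, and `chr1_random` returns to the selection through the final include.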
201
186
202
187
#### Analysis Parameters
203
188
204
-
##### -d / --max-shift [int]
205
-
PyMaSC calculate cross-correlation with shift size from 0 to this value. (Default: 1000)
206
-
207
-
##### --chi2-pval [float]
208
-
P-value threshold to check strand specificity. (Default: 0.05)
209
-
PyMaSC performs chi-square test between number of reads mapped to positive- and negative-strand.
210
-
211
-
##### -w / --smooth-window [int]
212
-
Before mean fragment length estimation, PyMaSC applies moving average filter to
213
-
mappability-sensitive cross-correlation. This option specify filter's window size.
214
-
(Default: 15)
215
-
216
-
##### --mask-size [int]
217
-
If difference between a read length and the estimated library length is equal or
218
-
less than the length specified by this option, PyMaSC masks correlation coefficients
219
-
in the read length +/- specified length and try to estimate mean library length again.
220
-
(Default: 5, Specify < 1 to disable)
221
-
222
-
##### --bg-avr-width [int]
223
-
To obtain the minimum coefficients of cross-correlation, PyMaSC gets the median
224
-
of the end of specified bases from calculated cross-correlation coefficients.
|`-d, --max-shift [int]`| PyMaSC calculates cross-correlation for shift sizes from 0 to this value. | 1000 |
192
+
|`--chi2-pval [float]`| P-value threshold for the strand specificity check. PyMaSC performs a chi-square test on the numbers of reads mapped to the positive and negative strands. | 0.05 |
193
+
|`-w, --smooth-window [int]`| Before estimating the mean fragment length, PyMaSC applies a moving average filter to the mappability-sensitive cross-correlation. This option specifies the filter's window size. | 15 |
194
+
|`--mask-size [int]`| If the difference between the read length and the estimated library length is less than or equal to this value, PyMaSC masks the correlation coefficients within the read length +/- this value and tries to estimate the mean library length again. Specify a value < 1 to disable. | 5 |
195
+
|`--bg-avr-width [int]`| To obtain the minimum cross-correlation coefficient, PyMaSC takes the median of the last specified number of bases of the calculated cross-correlation coefficients. | 50 |
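
The roles of `--smooth-window` and `--bg-avr-width` can be sketched in a few lines of Python. The example below smooths a cross-correlation profile with a centered moving average, takes the background as the median of the trailing coefficients, and then computes NSC and RSC using their commonly cited definitions (peak over background, and background-subtracted peak over background-subtracted read-length value). The window edge handling, the toy profile, the assumed read-length shift, and all function names are assumptions for illustration, not PyMaSC's implementation.

```python
import statistics


def moving_average(values, window):
    """Centered moving average; the window shrinks near both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def background(values, width):
    """Median of the last `width` coefficients (the largest shifts)."""
    return statistics.median(values[-width:])


def nsc(cc_at_fragment, cc_min):
    """Normalized strand coefficient: peak divided by background."""
    return cc_at_fragment / cc_min


def rsc(cc_at_fragment, cc_at_readlen, cc_min):
    """Relative strand coefficient: background-subtracted ratio."""
    return (cc_at_fragment - cc_min) / (cc_at_readlen - cc_min)


# Toy profile indexed by shift (values invented for illustration).
cc = [0.010, 0.012, 0.020, 0.030, 0.020, 0.012, 0.010, 0.010, 0.010, 0.010]
READ_LENGTH_SHIFT = 1  # hypothetical shift equal to the read length

smoothed = moving_average(cc, window=3)
fragment_shift = max(range(len(smoothed)), key=smoothed.__getitem__)
cc_min = background(cc, width=3)

print("estimated fragment shift:", fragment_shift)
print("NSC:", nsc(cc[fragment_shift], cc_min))
print("RSC:", rsc(cc[fragment_shift], cc[READ_LENGTH_SHIFT], cc_min))
```

Smoothing before locating the peak keeps a single noisy coefficient from being mistaken for the fragment length, and using the median of the tail keeps the background estimate robust to outliers.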
226
196
227
197
#### Output options
228
198
229
-
##### -o / --outdir [path]
230
-
Specify output directory. (Default: current directory)
231
-
232
-
##### -n / --name [NAME...]
233
-
By default, output files are written to `outdir/input_file_base_name`. This option
|`-n, --name [NAME...]`| By default, output files are written to `outdir/input_file_base_name`. This option overrides the output file base name. | (input_file_base_name) |
235
203
204
+
---
236
205
237
206
### `pymasc-precalc` command
238
207
@@ -248,7 +217,7 @@ overwrite output file base name.
248
217
[-r MAX_READLEN]
249
218
250
219
#### Usage example
251
-
Calculate total length of doubly mappable region.
220
+
Calculate total length of doubly mappable regions.
252
221
`wgEncodeCrgMapabilityAlign36mer_mappability.json` will be written.