Crash in `modelNFR` when correcting subtracted fragment-size distribution (no +ve values after subtraction)

We have encountered a crash in the `FragmentMixDistribution.modelNFR` method (in `Occupancy.py`) for two samples in a recent experiment, when running `nucleoatac` version 0.3.4. (The program runs successfully on all other samples in the same experiment.)

For both problem samples the command line is of the form:

    nucleoatac run --bed <broadpeaks> --bam <bamfile> --fasta <fastafile> --cores 8 --out <samplename>

where `broadpeaks` is output from MACS.

The crash produces the following (redacted) traceback:

    Traceback (most recent call last):
      File "XXXX/nucleoatac034/bin/nucleoatac", line 33, in <module>
        nucleoatac_main(args)
      File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/cli.py", line 56, in nucleoatac_main
        run_occ(occ_args)
      File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/run_occ.py", line 96, in run_occ
        fragment_dist.modelNFR()
      File "XXXX/nucleoatac034/lib/python2.7/site-packages/nucleoatac/Occupancy.py", line 65, in modelNFR
        nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
    ValueError: min() arg is an empty sequence

which I've identified as being due to the fact that in each case `nuc[nuc>0]` is empty, as all values in the `nuc` array are negative. As a result `min(nuc[nuc>0]` produces the `ValueError` exception above.

My understanding is that `nuc` is the "subtracted difference between the extrapolated nucleosome-free model and the observed fragment distribution" mentioned in the "Occupancy determination" section of the NucleoATAC paper. The specific line of code where the failure occurs appears to be a subsequent "correction" to the subtracted distribution which replaces any zero or negative values with a small positive value.

The patch below is a workaround to avoid the error, by trapping for the `ValueError` exception and setting the zero and negative values to `min(nfr)*0.1` (the other expression that appears in the `min` statement above):

    --- a/nucleoatac/Occupancy.py
    +++ b/nucleoatac/Occupancy.py
    @@ -62,7 +62,10 @@ class FragmentMixDistribution:
             nuc = np.concatenate((np.zeros(boundaries[1]-self.lower),
                                 self.fragmentsizes.get(boundaries[1],self.upper) -
                                 self.nfr_fit.get(boundaries[1],self.upper)))
    -        nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
    +        try:
    +            nuc[nuc<=0]=min(min(nfr)*0.1,min(nuc[nuc>0])*0.001)
    +        except ValueError:
    +            nuc[nuc<=0]=min(nfr)*0.1
             self.nuc_fit = FragmentSizes(self.lower, self.upper, vals = nuc)
         def plotFits(self,filename=None):
             """plot the Fits"""

With this patch the program is then able to continue to completion for both samples.

We would have two questions:

1. Is this workaround a sensible fix for the problem? (i.e. can we still trust output from a version of `nucleoatac` which uses this patch?)
2. Are there any specific conditions or properties of the input data which might trigger the problem with the subtracted differences all being less than zero? (e.g. could it be caused by the inclusion of a specific "problem" region, or by a particular distribution of fragment length sizes etc.)

We're aware that the program is no longer actively maintained. However we would welcome any comments on the above - thanks for your help!

(Also adding @IanCodes to this issue)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Crash in `modelNFR` when correcting subtracted fragment-size distribution (no +ve values after subtraction) #98

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Crash in modelNFR when correcting subtracted fragment-size distribution (no +ve values after subtraction) #98

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Crash in `modelNFR` when correcting subtracted fragment-size distribution (no +ve values after subtraction) #98