Skip to content

Inefficiency In Overlaps Call #127

@jgaupp-accuragen

Description

@jgaupp-accuragen

Overlaps performs much poorer than Overlap when running a tree with many small intervals.

The performance problem was found when calling Overlaps against a tree of ~450,000 intervals typically around 400 units wide. The problem was not obvious when calling Overlaps against a tree of ~5,000 intervals typically around 100,000 units wide.

When calling overlaps against the tree of 450K intervals, a large set of queries completed in ~15 minutes, while overlap took ~12 seconds.

cProfile identified the time spent at -

731923 1414.870 0.002 1414.905 0.002 /<venv path>/python3.10/site-packages/intervaltree/intervaltree.py:616(<genexpr>)

where column 2 is total time (1414.870) and the the operation referenced at intervaltree.py:616 is -

return any(
    self.overlaps_point(bound)
    for bound in self.boundary_table
    if begin < bound < end
) 

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions