Overlaps performs much poorer than Overlap when running a tree with many small intervals.
The performance problem was found when calling Overlaps against a tree of ~450,000 intervals typically around 400 units wide. The problem was not obvious when calling Overlaps against a tree of ~5,000 intervals typically around 100,000 units wide.
When calling overlaps against the tree of 450K intervals, a large set of queries completed in ~15 minutes, while overlap took ~12 seconds.
cProfile identified the time spent at -
731923 1414.870 0.002 1414.905 0.002 /<venv path>/python3.10/site-packages/intervaltree/intervaltree.py:616(<genexpr>)
where column 2 is total time (1414.870) and the the operation referenced at intervaltree.py:616 is -
return any(
self.overlaps_point(bound)
for bound in self.boundary_table
if begin < bound < end
)
Overlaps performs much poorer than Overlap when running a tree with many small intervals.
The performance problem was found when calling Overlaps against a tree of ~450,000 intervals typically around 400 units wide. The problem was not obvious when calling Overlaps against a tree of ~5,000 intervals typically around 100,000 units wide.
When calling overlaps against the tree of 450K intervals, a large set of queries completed in ~15 minutes, while overlap took ~12 seconds.
cProfile identified the time spent at -
731923 1414.870 0.002 1414.905 0.002 /<venv path>/python3.10/site-packages/intervaltree/intervaltree.py:616(<genexpr>)where column 2 is total time (1414.870) and the the operation referenced at intervaltree.py:616 is -