Slow HDBSCAN per-user stop-detection test takes minutes #290

@paco-barreras

One stop-detection test takes minutes to run, which looks like a performance problem worth investigating separately from the current HDBSCAN correctness work.

In the recent run, test_st_hdbscan_per_user_basic was the slow one: its call phase took about 166 seconds. By contrast, the failing ground-truth HDBSCAN test was fast, around 1-2 seconds, so the long suite time is not caused by the assertion failure itself.

Suggested debugging path:

  • Profile test_st_hdbscan_per_user_basic directly rather than weakening the assertion.
  • Split timing around per-user grouping, _find_neighbors, core-distance computation, MST construction, hierarchy construction, and final stop summarization.
  • Check whether repeated per-user setup is rebuilding expensive spatial/time structures unnecessarily.
  • Compare single-user HDBSCAN runtime on the same fixture to determine whether the bottleneck is per-user orchestration or the HDBSCAN internals.
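
A minimal sketch of how the split timing and direct profiling above could be set up, using only the standard library. `StageTimer`, `profile_call`, and `run_per_user` are hypothetical names introduced for illustration, and the stage bodies are stand-ins: in the real code the `with timer.stage(...)` blocks would wrap the project's own calls (`_find_neighbors`, core-distance computation, MST construction, and so on), whose signatures are not shown in this issue.

```python
import cProfile
import io
import pstats
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time and call counts per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the stage even if the wrapped code raises.
            self.totals[name] += time.perf_counter() - start
            self.calls[name] += 1

    def report(self):
        """Return (stage, calls, total_seconds) rows, slowest stage first."""
        rows = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, self.calls[name], total) for name, total in rows]


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, top cumulative entries as text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Hypothetical stand-in for the per-user loop. The stage bodies are toy work,
# not the real stop-detection code.
def run_per_user(users, timer):
    results = {}
    for user, points in users.items():
        with timer.stage("find_neighbors"):
            pairs = [(i, j) for i in range(len(points)) for j in range(i)]
        with timer.stage("summarize_stops"):
            results[user] = len(pairs)
    return results


if __name__ == "__main__":
    timer = StageTimer()
    users = {"u1": list(range(200)), "u2": list(range(300))}
    _, profile_text = profile_call(run_per_user, users, timer)
    for name, calls, total in timer.report():
        print(f"{name:<18} calls={calls:<3} total={total:.4f}s")
    print(profile_text)
```

The per-stage call counts are as useful as the totals here: a stage that runs once per user but dominates the total points at HDBSCAN internals, while a setup stage with a call count far above the number of users would suggest that structures are being rebuilt unnecessarily. Running the same timer on a single-user call with the same fixture gives the baseline for the last bullet. Note that cProfile adds overhead, so the stage totals are best read without the profiler attached.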

Expected outcome: keep the useful coverage, but make the test and/or implementation fast enough that the stop-detection test file does not spend several minutes on one passing case.
