perf: replace bubble sort with efficient sorting algorithm #111
Summary
This PR replaces the O(n²) bubble sort in the `_sort` method with Python's built-in `sorted()` using `functools.cmp_to_key`, achieving O(n log n) time complexity. The method is used to order preprocessing labels based on pairwise ordering constraints.

Also includes an efficiency report (`EFFICIENCY_REPORT.md`) documenting 5 potential efficiency improvements found in the codebase for future reference.

Review & Testing Checklist for Human
- Verify that if `a + "#" + b` is in `label_order`, `a` should come before `b`. Test with actual preprocessing label sets to confirm.
- Run `pytest` to ensure no regressions in pipeline generation, as this method affects how preprocessing components are ordered.
- Check the case where neither `a#b` nor `b#a` exists in `label_order` (comparator returns 0). The original bubble sort would preserve relative order; Python's sort is stable, but behavior with `cmp_to_key` should be validated.

Recommended test plan: Run the existing test suite and manually test pipeline generation with a dataset that uses multiple preprocessing steps to verify the ordering is correct.
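For reviewers, the comparator behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `sort_labels` is a hypothetical stand-in for the `_sort` method, the label names are made up, and `label_order` is assumed to be a collection of `"a#b"` strings meaning that `a` must come before `b`.

```python
from functools import cmp_to_key


def sort_labels(labels, label_order):
    """Hypothetical stand-in for `_sort`: order labels by pairwise constraints."""

    def compare(a, b):
        if a + "#" + b in label_order:
            return -1  # "a#b" is present, so a must come before b
        if b + "#" + a in label_order:
            return 1  # "b#a" is present, so b must come before a
        return 0  # no constraint between a and b

    # sorted() is stable, so labels that compare equal keep their input order.
    return sorted(labels, key=cmp_to_key(compare))


# Assumed example constraints; every pair of labels is covered here.
label_order = {"imputer#scaler", "scaler#selector", "imputer#selector"}

print(sort_labels(["selector", "scaler", "imputer"], label_order))
# → ['imputer', 'scaler', 'selector']

# With no constraints at all, the comparator always returns 0 and the
# input order is preserved.
print(sort_labels(["x", "y"], set()))
# → ['x', 'y']
```

Note that when `label_order` covers only some pairs, "no constraint" is not a transitive relation, so the result of a general comparison sort may differ from what adjacent-swap bubble sort produced. That is the case the checklist asks to be validated against real label sets.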
Notes