[SPARK-56235][CORE] Add reverse index in TaskSetManager to avoid O(N) scans in executorLost#55030
Open
DenineLu wants to merge 1 commit into apache:master
### What changes were proposed in this pull request?
This PR adds a reverse index `executorIdToTaskIds: HashMap[String, OpenHashSet[Long]]` in `TaskSetManager` to look up tasks by executor ID efficiently, replacing the O(N) full scans over `taskInfos` in `executorLost()` with O(K) direct lookups (K = number of tasks on the executor).

Changes:
- Add an `executorIdToTaskIds` field in `TaskSetManager`, populated at task launch in `prepareLaunchingTask()`
- Rework `executorLost()` to iterate only over the tasks on the lost executor via the reverse index

### Why are the changes needed?
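The reverse-index idea can be sketched as a small standalone class. This is a hypothetical simplification, not Spark's actual code: the real `TaskSetManager` uses `OpenHashSet[Long]` and stores full `TaskInfo` objects in `taskInfos`, while this sketch uses plain mutable collections and only tracks executor IDs.

```scala
import scala.collection.mutable

// Minimal sketch of the reverse-index pattern (hypothetical names).
class ReverseIndexSketch {
  // taskId -> executorId, standing in for taskInfos.
  val taskIdToExecutorId = new mutable.HashMap[Long, String]
  // executorId -> task ids running on that executor (the reverse index).
  val executorIdToTaskIds = new mutable.HashMap[String, mutable.HashSet[Long]]

  // On task launch (analogous to prepareLaunchingTask()): record both directions.
  def onTaskLaunched(taskId: Long, executorId: String): Unit = {
    taskIdToExecutorId(taskId) = executorId
    executorIdToTaskIds.getOrElseUpdate(executorId, new mutable.HashSet[Long]) += taskId
  }

  // On task completion: remove the entry so the index does not leak memory.
  def onTaskFinished(taskId: Long): Unit = {
    taskIdToExecutorId.remove(taskId).foreach { execId =>
      executorIdToTaskIds.get(execId).foreach { ids =>
        ids -= taskId
        if (ids.isEmpty) executorIdToTaskIds.remove(execId)
      }
    }
  }

  // O(K) lookup of tasks on a lost executor, instead of scanning every task.
  def tasksOnExecutor(executorId: String): Set[Long] =
    executorIdToTaskIds.get(executorId).map(_.toSet).getOrElse(Set.empty)
}
```

The key invariant is that every launch and every completion updates both maps, so the index stays consistent with `taskInfos` and lookups never return stale task IDs.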
In a production Spark job (Spark 3.5.1, dynamic allocation enabled, shuffle tracking disabled) with a single stage containing 5 million tasks, we observed that near the end of the stage the Spark UI showed the last few tasks stuck in the "RUNNING" state for 1-2 hours.


However, executor thread dumps confirmed that no task threads were actually running: the tasks had already completed on the executor side, but the Driver had not processed their completion messages.
CPU profiling of the Driver JVM (a 5-minute snapshot) revealed that `TaskSetManager.executorLost()` was consuming 99.5% of all CPU samples, due to O(N) full scans over the `taskInfos` HashMap (N = 5,000,000 entries).

The `executorLost()` method scans the entire `taskInfos` map to find the tasks that were running on the lost executor. The blocking is amplified when the following conditions are present:
- Dynamic allocation is enabled, so each idle executor waits for `spark.dynamicAllocation.executorIdleTimeout` (default 60s) to complete and is then removed, triggering `executorLost()`. Most executors have finished their work and sit idle, while the few slow tasks are still running.

After this PR, the same workload (5M tasks, 10K executors, dynamic allocation enabled) no longer exhibits the stall: execution time dropped from 117 minutes to 45 minutes, and the previous `executorLost` hotspot at the end of the stage is gone.

Memory overhead of the new index (measured with `jmap -histo:live`) is small relative to the existing `taskInfos` overhead (829 MB).

### Does this PR introduce any user-facing change?
No.
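For context on the complexity difference described in the motivation above, the old and new lookup paths can be contrasted with a simplified, self-contained sketch. The names and structures here are hypothetical (the real `taskInfos` maps task IDs to `TaskInfo` objects, not executor-ID strings):

```scala
import scala.collection.mutable

// Hypothetical simplification of the two lookup strategies in executorLost().
object ExecutorLostSketch {
  // Before: O(N) — touch every task entry to find the few on the lost executor.
  def tasksOnLostExecutorScan(
      taskInfos: mutable.HashMap[Long, String],
      lostExecutorId: String): Set[Long] =
    taskInfos.collect { case (tid, exec) if exec == lostExecutorId => tid }.toSet

  // After: O(K) — the reverse index yields exactly the K affected tasks.
  def tasksOnLostExecutorIndexed(
      executorIdToTaskIds: mutable.HashMap[String, mutable.HashSet[Long]],
      lostExecutorId: String): Set[Long] =
    executorIdToTaskIds.get(lostExecutorId).map(_.toSet).getOrElse(Set.empty)
}
```

With N = 5,000,000 tasks and 10K executors being removed one by one as they idle out, the scan-based path does millions of map-entry visits per removal, which matches the 99.5% CPU-sample hotspot observed in profiling; the indexed path only touches the handful of tasks actually on the lost executor.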
### How was this patch tested?
Added 2 tests in `TaskSetManagerSuite`.

### Was this patch authored or co-authored using generative AI tooling?
No.