Skip to content

[EPIC] Experiment with some different shuffle implementations #3778

@andygrove

Description

@andygrove

What is the problem the feature request solves?

Following on from the refactors to move shuffle code into a separate crate in #3749 and #3772, and the new standalone shuffle benchmark binary in #3752, I propose that we start evaluating some different shuffle approaches to better understand the trade offs in terms of throughput and memory usage of different approaches.

This is a very open-ended epic. There are some ideas that I am planning on experimenting with, such as moving away from the current interleave batches approach, which keeps batches buffered for a long time, and moving to a more immediate scatter approach that removes the need to buffer the input batches for such a long time. I expect other contributors may also have ideas that they would like to explore.

There is a already a trait for the shuffle partition-and-write operation, so it should be trivial to support different implementations behind a config for testing and evaluation, and allow the standalone benchmark to compare performance characteristics across implementations.

Describe the potential solution

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:shuffleShuffle (JVM and native)enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions