
Optional Disk for P2P shuffling #7572

@fjetter

Description


The P2P shuffling extension currently has a hard requirement on disk: effectively, the entire dataset is written to and read from disk at least once.

This works roughly as follows:

  1. Serialize a batch of shards on the transferring side
  2. Deserialize the batch of shards on the receiving side
  3. Group those shards by output partition
  4. Serialize those output partitions again
  5. Write the output partitions to disk
  6. Once required, read the output partition from disk again and deserialize it

This naturally allows for larger-than-memory shuffles but introduces unnecessary de-/serialization and disk I/O steps if the data fits (mostly) into memory.
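
For illustration, here is a minimal sketch of that pipeline. It uses `pickle` and hypothetical helper names (`transfer`, `receive`, `read_partition`) purely as stand-ins; the real implementation lives in `distributed.shuffle` and uses its own serialization. The point is the two serialization passes and the disk round-trip:

```python
import pickle
from collections import defaultdict
from pathlib import Path

def transfer(shards: list[tuple[int, object]]) -> bytes:
    # (1) Serialize a batch of shards on the transferring side.
    return pickle.dumps(shards)

def receive(payload: bytes, directory: Path) -> None:
    # (2) Deserialize the batch of shards on the receiving side.
    shards = pickle.loads(payload)
    # (3) Group those shards by output partition.
    groups = defaultdict(list)
    for partition_id, shard in shards:
        groups[partition_id].append(shard)
    # (4) + (5) Serialize each group again and append it to that
    # partition's file on disk.
    for partition_id, group in groups.items():
        with open(directory / f"{partition_id}.bin", "ab") as f:
            f.write(pickle.dumps(group))

def read_partition(directory: Path, partition_id: int) -> list[object]:
    # (6) Once required, read the partition back from disk and deserialize.
    shards: list[object] = []
    with open(directory / f"{partition_id}.bin", "rb") as f:
        while True:
            try:
                shards.extend(pickle.load(f))
            except EOFError:
                break
    return shards
```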

The current implementation of the disk buffer batches writes to disk so that we do not suffer excessive disk I/O overhead, but the buffer has to be flushed before we can read back any data.
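
A simplified sketch of that batching behavior, assuming a hypothetical `BatchedDiskBuffer` rather than the actual `DiskShardsBuffer` API: writes accumulate in memory, hit disk in large batches, and nothing can be read back while unflushed shards remain.

```python
import pickle
from collections import defaultdict
from pathlib import Path

class BatchedDiskBuffer:
    """Accumulate shards in memory and write them to disk in large batches."""

    def __init__(self, directory: Path, batch_bytes: int = 8 * 2**20):
        self.directory = directory
        self.batch_bytes = batch_bytes
        self.pending = defaultdict(list)  # output partition -> unflushed shards
        self.pending_size = 0

    def write(self, partition_id: int, shard: bytes) -> None:
        self.pending[partition_id].append(shard)
        self.pending_size += len(shard)
        if self.pending_size >= self.batch_bytes:
            self.flush()

    def flush(self) -> None:
        # One sequential append per partition instead of many tiny writes.
        for partition_id, shards in self.pending.items():
            with open(self.directory / str(partition_id), "ab") as f:
                f.write(pickle.dumps(shards))
        self.pending.clear()
        self.pending_size = 0

    def read(self, partition_id: int) -> list[bytes]:
        # The constraint described above: everything must be flushed to
        # disk before any data can be read back.
        if self.pending_size:
            raise RuntimeError("flush() the buffer before reading")
        shards: list[bytes] = []
        with open(self.directory / str(partition_id), "rb") as f:
            while True:
                try:
                    shards.extend(pickle.load(f))
                except EOFError:
                    break
        return shards
```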

To allow more efficient shuffling for datasets that fit mostly into memory, we would like to avoid writing all of the data to disk.

Additional notes

As an intermediate step, it is very likely easier to keep the "redundant" serialization step as is and instead allow DiskShardsBuffer.read to consolidate data that is on disk with data that is still in memory (see the sketch below).
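
One possible shape for such a consolidating read, written as a standalone sketch rather than against the actual `DiskShardsBuffer` interface (all names here are hypothetical): the read path concatenates whatever has already landed on disk with the shards still sitting in the in-memory write buffer, so a full flush is no longer a precondition for reading.

```python
import pickle
from pathlib import Path

def read_consolidated(
    directory: Path, pending: dict[int, list[object]], partition_id: int
) -> list[object]:
    """Return all shards of one output partition, flushed or not."""
    shards: list[object] = []
    path = directory / str(partition_id)
    if path.exists():
        # Shards that were already flushed: one pickle per flush batch.
        with open(path, "rb") as f:
            while True:
                try:
                    shards.extend(pickle.load(f))
                except EOFError:
                    break
    # Shards still sitting in memory: no flush required before reading.
    shards.extend(pending.pop(partition_id, []))
    return shards
```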
Avoiding the serialization step entirely for the in-memory case would require us to delay serialization until we are actually writing to disk. This is not impossible, but it needs to be implemented with some care to avoid blocking the event loop.
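
For that second variant, a minimal sketch of what "delay serialization until the disk write and keep it off the event loop" could look like; `asyncio.to_thread` stands in here for distributed's own offloading machinery, and the names are again hypothetical:

```python
import asyncio
import pickle
from pathlib import Path

async def flush_unserialized(
    directory: Path, pending: dict[int, list[object]]
) -> None:
    """Serialize and write pending shards without blocking the event loop."""
    # Snapshot and clear the buffer on the event-loop thread so new writes
    # can keep accumulating while the flush is in flight.
    batch = dict(pending)
    pending.clear()

    def _serialize_and_write() -> None:
        # Serialization happens only here, at the moment we actually touch
        # disk, inside a worker thread so the event loop stays responsive.
        for partition_id, shards in batch.items():
            with open(directory / str(partition_id), "ab") as f:
                f.write(pickle.dumps(shards))

    await asyncio.to_thread(_serialize_and_write)
```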

cc @wence- @hendrikmakait
