Storage Partitioned Transfer Base #3340
Conversation
Pull Request Overview
This PR implements foundational components for partitioned upload and copy operations in the Azure Storage Blob SDK, introducing stream partitioning and concurrent operation execution capabilities.
- Adds `PartitionedStream`, which converts a `SeekableStream` into partitioned `Bytes` chunks for block operations
- Implements `run_all_with_concurrency_limit()` for executing async operations with configurable concurrency
- Includes comprehensive test coverage for both components
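To make the chunking behavior concrete, here is a minimal, self-contained sketch of how a buffer splits into fixed-size partitions with a short final chunk. The `partition` helper is illustrative only, not the PR's API; the real `PartitionedStream` yields `Bytes` lazily from a `SeekableStream` rather than copying slices.

```rust
// Illustrative helper (not part of the PR): split a buffer into
// partitions of at most `partition_len` bytes; the last partition is
// shorter when the source length is not an exact multiple.
fn partition(data: &[u8], partition_len: usize) -> Vec<Vec<u8>> {
    data.chunks(partition_len).map(|c| c.to_vec()).collect()
}

fn main() {
    let data: Vec<u8> = (0..10).collect();
    let parts = partition(&data, 4);
    assert_eq!(parts.len(), 3);
    assert_eq!(parts[0], vec![0, 1, 2, 3]);
    assert_eq!(parts[2], vec![8, 9]); // final short partition
    println!("{} partitions", parts.len());
}
```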
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| sdk/storage/azure_storage_blob/src/streams/partitioned_stream.rs | New implementation of PartitionedStream with Stream and FusedStream traits, plus test suite |
| sdk/storage/azure_storage_blob/src/streams/mod.rs | Module declaration for streams |
| sdk/storage/azure_storage_blob/src/partitioned_transfer/mod.rs | New implementation of run_all_with_concurrency_limit() with concurrency control logic and tests |
| sdk/storage/azure_storage_blob/src/lib.rs | Module declarations for new partitioned_transfer and streams modules |
| sdk/storage/azure_storage_blob/Cargo.toml | Added bytes and futures dependencies (and rand for tests) |
| Cargo.lock | Lock file updates reflecting new dependencies |
Accept useful generated comments. Co-authored-by: Copilot <[email protected]>
Generated docs tried to write a doctest for a non-public function.
```rust
fn take(&mut self) -> Vec<u8> {
    let mut ret = mem::replace(
        &mut self.buf,
        vec![0u8; std::cmp::min(self.partition_len, self.inner.len() - self.total_read)],
    );
    ret.truncate(self.buf_offset);
    self.buf_offset = 0;
    ret
}
```
You're already taking a dependency on Bytes. Use it. It already has all the functionality for this. At the very least, don't repeat the calculation of buf.len() that you'd already have computed during construction. Bytes has already been well-tested. I see potential faults here.
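To illustrate the "don't repeat the calculation" part of this feedback, here is a hedged, std-only sketch of `take` where the next partition size is computed once and stored. The `Partitioner` struct and `next_len` field are assumptions for this example, not the PR's actual types; with the `bytes` crate the reviewer suggests, `BytesMut::split()` would replace the whole body with one well-tested call.

```rust
use std::mem;

// Illustrative stand-in for the PR's buffer state (names are assumptions).
struct Partitioner {
    buf: Vec<u8>,
    buf_offset: usize,
    next_len: usize, // size of the next partition, computed once up front
}

impl Partitioner {
    // Swap in a pre-sized buffer without redoing the length arithmetic.
    fn take(&mut self) -> Vec<u8> {
        let mut ret = mem::replace(&mut self.buf, vec![0u8; self.next_len]);
        ret.truncate(self.buf_offset);
        self.buf_offset = 0;
        ret
    }
}

fn main() {
    let mut p = Partitioner {
        buf: vec![1, 2, 3, 0, 0], // 3 bytes filled, room for 5
        buf_offset: 3,
        next_len: 4,
    };
    let taken = p.take();
    assert_eq!(taken, vec![1, 2, 3]); // only the filled prefix is returned
    assert_eq!(p.buf.len(), 4);       // fresh buffer sized for the next partition
    assert_eq!(p.buf_offset, 0);
    println!("took {} bytes", taken.len());
}
```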
```rust
            Poll::Ready(Some(Ok(Bytes::from(ret))))
        };
    } else {
        match ready!(pin!(&mut this.inner).poll_read(cx, &mut this.buf[this.buf_offset..]))
```
More consistent to use #[pin_project] like we do in our pager and pollers. Also, why are you matching on ready!(...)? I don't understand how that's useful. That's meant to produce output when implementing futures.
@heaths why is implementing a poll method on a stream so different from "implementing futures"? I'm definitely new to writing poll methods, but my understanding is that `ready!(exp)` expands to the following:

```rust
match exp {
    Poll::Ready(ret) => ret,
    Poll::Pending => return Poll::Pending,
}
```

That seems like what I want to do here, right? If the read of the inner stream is pending, then I want to propagate that state. Is it some sort of style thing to write out the full match instead of the macro in some scenarios? I don't understand what I'd do instead in this scenario except not implement `poll_next()` in the first place.
I'll look into pin_project. New concept to me.
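For what it's worth, the two forms under discussion can be compared side by side with a std-only snippet (the `double_*` function names are illustrative): `std::task::ready!` is documented to expand to exactly the manual match, propagating `Pending` and unwrapping `Ready`.

```rust
use std::task::Poll;

// The manual match, as written out in full.
fn double_manual(p: Poll<u32>) -> Poll<u32> {
    let value = match p {
        Poll::Ready(v) => v,
        Poll::Pending => return Poll::Pending,
    };
    Poll::Ready(value * 2)
}

// The same step using std::task::ready! (stable since Rust 1.64).
fn double_with_ready(p: Poll<u32>) -> Poll<u32> {
    let value = std::task::ready!(p);
    Poll::Ready(value * 2)
}

fn main() {
    assert_eq!(double_manual(Poll::Ready(21)), Poll::Ready(42));
    assert_eq!(double_manual(Poll::Pending), Poll::Pending);
    assert_eq!(double_with_ready(Poll::Ready(21)), Poll::Ready(42));
    assert_eq!(double_with_ready(Poll::Pending), Poll::Pending);
    println!("expansions agree");
}
```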
poll_read would return a Poll::Ready(_) or Poll::Pending, yes. But by wrapping it in a ready!(_) you're making it Poll::Ready. ready!(_) is to return a value as an already-ready state. Pager is probably a good one to look at, and after changes I'm making to manually do it for various reasons: #3372
I see the parts of your linked PR where you are matching the result of the poll and returning. Not to dig in, but I am looking at the docs and (apart from a variable name) the documented expansion of the macro looks character-for-character like what you're writing in your PR. I'd like to understand either how I am wrong in my comparison or why I am meant to manually write out the full expansion even if they are identical.
Implements two foundational components for partitioned upload and copy:

- `PartitionedStream`: consumes a `Box<dyn SeekableStream>` and converts it to a `Stream<Item = Result<Bytes, Error>>`, where each `Ok(Bytes)` returned is a contiguously buffered partition to be used for a put block or equivalent request.
- `run_all_with_concurrency_limit()`: takes a sequence of async jobs (`impl FnOnce() -> impl Future<Output = Result<(), Error>>`). These will be sequences of put block operations or equivalent requests.
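As a hedged, std-only sketch of the bounded-concurrency idea, the following uses threads pulling from a shared queue in place of the PR's async futures; everything here besides the function's name and the `limit` concept is an assumption for illustration, not the PR's implementation.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Thread-based analog (for illustration only) of run_all_with_concurrency_limit():
// at most `limit` jobs run at once, and every job's Result is collected in order.
fn run_all_with_concurrency_limit(
    jobs: Vec<Box<dyn FnOnce() -> Result<(), String> + Send>>,
    limit: usize,
) -> Vec<Result<(), String>> {
    let queue = Arc::new(Mutex::new(jobs.into_iter().enumerate().collect::<Vec<_>>()));
    let results = Arc::new(Mutex::new(Vec::new()));
    let mut workers = Vec::new();
    for _ in 0..limit {
        let queue = Arc::clone(&queue);
        let results = Arc::clone(&results);
        workers.push(thread::spawn(move || loop {
            // Hold the lock only long enough to grab the next job.
            let next = queue.lock().unwrap().pop();
            match next {
                Some((idx, job)) => {
                    let r = job();
                    results.lock().unwrap().push((idx, r));
                }
                None => break,
            }
        }));
    }
    for w in workers {
        w.join().unwrap();
    }
    let mut out = Arc::try_unwrap(results).unwrap().into_inner().unwrap();
    out.sort_by_key(|(i, _)| *i); // restore submission order
    out.into_iter().map(|(_, r)| r).collect()
}

fn main() {
    let jobs: Vec<Box<dyn FnOnce() -> Result<(), String> + Send>> = (0..8)
        .map(|i| {
            let job: Box<dyn FnOnce() -> Result<(), String> + Send> =
                Box::new(move || {
                    if i == 3 { Err(format!("job {i} failed")) } else { Ok(()) }
                });
            job
        })
        .collect();
    let results = run_all_with_concurrency_limit(jobs, 3);
    assert_eq!(results.len(), 8);
    assert!(results[3].is_err());
    assert_eq!(results.iter().filter(|r| r.is_ok()).count(), 7);
    println!("all jobs completed");
}
```

In the async version, the same effect could come from something like `futures`' `buffer_unordered`, which drives at most N futures concurrently; the thread pool above is just the simplest runnable stand-in.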