Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions Sources/AsyncAlgorithms/AsyncBufferedByteIterator.swift
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,24 @@ internal struct _AsyncBytesBuffer: @unchecked Sendable {
}
try Task.checkCancellation()
do {
// If two tasks have access to this iterator then the references on
// the storage will be non uniquely owned. This means that any reload
// must happen into it's own fresh buffer. The consumption of those
// bytes between two tasks are inherently defined as potential
// duplication by the nature of sending that buffer across the two
// tasks - this means that the brief period in which they may be
// sharing non reloaded bytes is to be expected; basically in that
// edge case of making the iterator and sending that across to two
// places to iterate is asking for something bizzare and the answer
// should not be crash, but it definitely cannot be consistent.
//
// The unique ref check is here to prevent the potentials of a crashing
// secnario.
if !isKnownUniquelyReferenced(&storage) {
// The count is not mutated across invocations so the access is safe.
let capacity = storage.buffer.count
storage = Storage(capacity: capacity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So - if it doesn't crash, but it is an inconsistent state, perhaps we should always crash instead, by asserting? What do we expect the behavior to actually be if two tasks are using the same iterator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every single assertion or fatal error is a risk to potentially crash someone's app in the wild. If we always crash then the type is not sendable.

The expected output here is that it is not deterministic on which consumer may get which slice of the bytes but the total bytes of all of the consumers should be the total bytes from the read. For example if you read a file the iterators are going to read all of that file split across N iterations; this should ensure that there is no duplication of reads.

Also we should be REALLY clear about the scenario of how this occurs:

func example(_ bytes: SomeByteSequence) {
  let iterator: AsyncBufferedByteIterator = bytes.makeAsyncIterator()
  Task {
     var iter = iterator 
     while let byte = try await iter.next() { }
  }
  Task {
     var iter = iterator 
     while let byte = try await iter.next() { }
  }
}

That very edge case usage should not prevent the type from being Sendable for other uses like being used with zip or other restricted Sendable algorithms. This fix alleviates that restriction and makes it at least fit the behavior of other AsyncSequence types that are Sendable; consume it in more than one spot and it acts structurally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In raw technical terms this type is a move-only Sendable type. We don't have move only semantics yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the additional info.

}
let readSize: Int = try await readFunction(storage.buffer)
if readSize == 0 {
finished = true
Expand Down