proposal: spec: clone and splice, new channel primitives #26282
Comments
Are there any more details on how the clone works? Say that we have the following code:
    var parent, cl chan int
    go func() {
        for {
            select {
            case i := <-parent:
                fmt.Println(i)
            }
        }
    }()
    go func() {
        for {
            select {
            case i := <-cl:
                fmt.Println("Clone", i)
                if i == 3 {
                    return
                }
            }
        }
    }()
    for i := 0; i < 5; i++ {
        if i == 2 {
            cl = clone parent
        }
        parent <- i
    }
Would that print something like:
I.e. would writing Also, I have no idea what |
If I read it right, splice c1 c2 makes c1 point to c2 so a write to c1 can be read from c2. |
@kardianos |
edit: discussion moved to #26343 I'd like to add
except that The code it replaces follows a fairly simple pattern but, even if the compiler recognized it, it wouldn't be able to perform the same optimization as that relies on the certainty that This is less useful than Also, if the drained channel was created by |
Would |
I would expect that c1 and c2 would need identical channel types. |
@jimmyfrasche
    if drained := c <- someData; drained {
        return
    }
|
Also, if the channel being drained is a clone, would the stream automatically stop sending to it? |
Never said this explicitly: It would only be safe to use drain when there was a single reader (possibly of a cloned chan) the same way it's only safe to close a chan when there's a single writer. @urandom I suppose you would need to detect a drained channel. There's something disquieting about making the send statement into an expression, but detection would need to be paired with a send. If you couldn't detect it, something like
would just spin indefinitely. Though that is the case now if you create a drain goroutine manually. If this example detected a drain, it could just return. That makes me wonder about something like
(I'm still waiting for someone smarter than me to explain why this is a terrible idea, though 😆 ) |
Channels currently pass data in only one direction. Being able to detect a drained channel would make that bidirectional. It seems simpler and more consistent for a drain to be precisely equivalent to |
This has become a discussion about an operation not even mentioned by the proposer. If you want to talk about drain, which is a fine thing to talk about, please open a new issue for that. |
@robpike sorry for the hijack. To return to the original proposal, the definition of clone seems to imply infinitely-buffered channels. I would have expected cloning a channel to behave something more like: For an unbuffered chan, For a buffered chan, |
What happens if I do this:
Do receives on c2 always succeed instantly getting a zero value unless there's a blocked send on c2? Does c2 close as well? Does it take c1 out of the equation leaving c2 as it was before the splice, effectively unsplicing the two? Does it just panic? |
Splice: Clone: Given a channel with an N-item buffer, the leading item cannot be discarded until all the receivers have picked it up, right? Would it make sense to add an operation to expand buffering? This is not strictly a clone operation since not all the functionality is cloned. It is more of a fanout. There can be a corresponding fanin (which sort of overlaps with the splice operation). Conceptually:
For fanout, cR can be a readonly channel. For fanin, cW can be a write only channel. An equally interesting operation would be the inverse of
The effect is as follows:
The reader and writer of c need not be aware of the existence of the filter tap. Any buffered data in c at the time of unsplice can be read from cIn by the filter (and not by the consumer of c), since a decision to insert a filter can be triggered based on received items from c. PS:
|
That does seem reasonable to me and a bit more user-friendly. I have been wondering about that restriction, too. Maybe there's a good reason for it that I'm not seeing.
That's true. Perhaps
This could be achieved by multiple calls to
Could you file a separate proposal for that?
The equivalent functionality is possible in user code, but it requires additional channels and goroutines. This proposal would avoid that overhead and make it simpler to build more complex topologies. |
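To make the overhead mentioned above concrete, here is a minimal sketch of a clone-like fanout written in today's Go. The names `fanOut` and `pump` are hypothetical, and the unbounded per-reader queue is an assumption about what "unrelated rates" would require; it is not taken from the proposal. Note that each reader keeps its own queue here, whereas a built-in clone could share one buffer.

```go
package main

import "fmt"

// pump forwards values from in to out through an unbounded in-memory queue,
// so a slow reader of out never blocks the writer of in.
func pump(in <-chan int, out chan<- int) {
	defer close(out)
	var queue []int
	for in != nil || len(queue) > 0 {
		// Enable the send case only when something is queued; a send on a
		// nil channel never proceeds.
		var send chan<- int
		var head int
		if len(queue) > 0 {
			send = out
			head = queue[0]
		}
		select {
		case v, ok := <-in:
			if !ok {
				in = nil // input finished: drain the queue, then exit
				continue
			}
			queue = append(queue, v)
		case send <- head:
			queue = queue[1:]
		}
	}
}

// fanOut returns n read-only channels, each delivering every value from src
// in order. It costs one distributor goroutine, n pump goroutines, and 2n
// extra channels.
func fanOut(src <-chan int, n int) []<-chan int {
	feeds := make([]chan int, n)
	outs := make([]<-chan int, n)
	for i := range feeds {
		feeds[i] = make(chan int)
		out := make(chan int)
		outs[i] = out
		go pump(feeds[i], out)
	}
	go func() {
		for v := range src {
			for _, f := range feeds {
				f <- v
			}
		}
		for _, f := range feeds {
			close(f)
		}
	}()
	return outs
}

func main() {
	src := make(chan int)
	go func() {
		defer close(src)
		for i := 0; i < 3; i++ {
			src <- i
		}
	}()
	outs := fanOut(src, 2)
	// The second reader does not start until the first has finished.
	for v := range outs[0] {
		fmt.Println("first", v)
	}
	for v := range outs[1] {
		fmt.Println("second", v)
	}
}
```

Even for two readers this sketch needs three goroutines and four extra channels, which is the kind of cost the proposal aims to remove.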
I am just exploring features that may dovetail with this proposal at the moment.
What I was getting at is to be able to implement something similar in user code without needing additional channels or goroutines. The proposed changes seem non-trivial and are perhaps better suited as a library component of some sort. Plus I would like to be able to avoid context switching overhead for simple but useful operations. For example:
[yeah, I know this has problems; the code is just to illustrate the idea] IMHO a pipe is a better abstraction than an iterator in other languages and it would be nice if it is available without needing to use goroutines. Since this proposal started from a suggestion by Doug McIlroy, I couldn't help wondering if we can make Go channels as easy to use as pipes are in a shell! On the other hand I can't think of a lower level building block to do this as yet. |
Regarding
There are some ways in which this can be written to satisfy the non-blocking requirements but they tend to get hairy and in any case they (in general) require buffering at least one element (even if both channels are unbuffered) or creating dedicated "channel pump" goroutines. I am not sure if this is a common enough scenario, but I wanted to submit it for consideration anyway because, as I mentioned above, I found myself in this situation a couple of times. I am aware that refactoring the code would solve the problem: to my credit, in both cases, when I encountered this the channels were provided by libraries, so there wasn't much I could do. |
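As an illustration of the "buffering at least one element" point, here is a sketch of a dedicated pump goroutine between two unbuffered channels. The function `forward` and its `done` parameter are hypothetical names chosen for this example. Between the receive and the send, the pump itself holds a value, so cancelling at that moment leaves one element that has left the source but never reached the destination.

```go
package main

import "fmt"

// forward copies values from src to dst until src or done is closed.
// If done fires while a value is in flight, that value is returned with
// holding == true so the caller can see what would otherwise be lost.
func forward(src <-chan int, dst chan<- int, done <-chan struct{}) (held int, holding bool) {
	for {
		select {
		case v, ok := <-src:
			if !ok {
				return 0, false
			}
			// v is now buffered inside this goroutine, even though both
			// channels are unbuffered.
			select {
			case dst <- v:
			case <-done:
				return v, true
			}
		case <-done:
			return 0, false
		}
	}
}

func main() {
	src := make(chan int)
	dst := make(chan int) // nobody reads from dst in this demo
	done := make(chan struct{})
	result := make(chan string)
	go func() {
		v, holding := forward(src, dst, done)
		result <- fmt.Sprintf("held=%d holding=%t", v, holding)
	}()
	src <- 1    // the pump now holds the value 1
	close(done) // cancel before any reader shows up on dst
	fmt.Println(<-result) // prints "held=1 holding=true"
}
```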
I've run into the exact problem @CAFxX described, and indeed solved it by completely refactoring everything. But I think it'd be really useful to have completely unbuffered pipelines. May I suggest the following extension to splice:
Which would be of type
This essentially implements a completely unbuffered |
Polymorphic functions (#15292) seem like a better fit for that than broad built-in operators. The operators that @robpike proposes interact with channel internals in an interesting way: they potentially allow the runtime to alias buffers between channels. A |
The splice operation is a dangerous notion for a type that's intended to be shared between goroutines (how can c1 become write-only without invalidating declarations?), and generally unnecessary, for example https://play.golang.org/p/YaqdWEwcxQa eliminates Copy by giving rat and channels thereof a common interface. On a related note, the operations on series as implemented don't need a split that spawns goroutines, at most one term of a source series needs to be "buffered" and the reads happen in a well-defined order. The common interface was inspired by a footnote in "Squinting at Power Series" and the inference about terms and order is from same (I feel indebted to the author, his work made objecting to this proposal one of the most challenging (the paper's well above my pay grade) and rewarding experiences I've had as a humble student of computer science). The clone operation as proposed makes a key property of channels, synchronization, ambiguous. If a shared buffer is what's desired, a channel is a great way to synchronize access to it, as the original implementation of Split does (where the buffer is basically a linked list of goroutine stacks guarded by channels used as mutexes). Arbitrarily large channel capacity and demand-based allocation of channel buffers could also be a useful alternative to clone, #20352 proposes explicit creation of channels of infinite capacity. My thanks again to the proposer and for powser[12].go. |
Presumably after executing What is the buffer size of the result of calling What happens if you write to the channel that is returned by calling |
Similar, but maybe not the exact same problem (re: splice) that's cropped up in a few of my projects: I want to
    select {
    case <-done:
    case signal := <-control:
    case i, v, ok := <-sliceOfIdenticallyTypedChans:
        // i, v, ok are the same as returned by `reflect.Select`
        // https://golang.org/pkg/reflect/#Select
    }
|
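For reference, the behavior asked for above can be approximated today with `reflect.Select`, at the cost of reflection. This is a sketch only: `selectAny` is a hypothetical helper name, and it handles just the slice of channels; `done` and `control` could be appended as two more `reflect.SelectCase` entries.

```go
package main

import (
	"fmt"
	"reflect"
)

// selectAny blocks until one of chans has a value ready or is closed. It
// reports which channel fired, the value received, and whether the channel
// was still open, mirroring the three results of reflect.Select.
func selectAny(chans []chan int) (i int, v int, ok bool) {
	cases := make([]reflect.SelectCase, len(chans))
	for j, ch := range chans {
		cases[j] = reflect.SelectCase{
			Dir:  reflect.SelectRecv,
			Chan: reflect.ValueOf(ch),
		}
	}
	chosen, recv, recvOK := reflect.Select(cases)
	if !recvOK {
		// The chosen channel was closed; there is no real value to report.
		return chosen, 0, false
	}
	return chosen, int(recv.Int()), true
}

func main() {
	chans := []chan int{
		make(chan int, 1),
		make(chan int, 1),
		make(chan int, 1),
	}
	chans[1] <- 42 // only channel 1 is ready
	i, v, ok := selectAny(chans)
	fmt.Println(i, v, ok) // prints "1 42 true"
}
```

The case slice is rebuilt on every call here for simplicity; a long-lived loop would build it once and reuse it.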
@jdef Thanks, but that is really a different problem that should be discussed on a different issue. This one is about |
How does one determine that c1 contains no buffered data? Maybe the tying of write ends should happen only after the buffered data has made it out, or be restricted to unbuffered channels (which would be simpler but more limited in applicable use cases). Also, for the use case of a transformer/filter that only temporarily does pass-through copying, one would need to undo the splice somehow. |
Actually, one could add
which would block for N elements passing through and then return with |
I find the term |
Hi @wsc1, Long before magnetic tape, splice was used for weaving the ends of two ropes together to become one: Splice the main brace! It seems apt, and is in use for a similar operation in Linux's splice(2). https://en.wikipedia.org/wiki/Splice |
If splice c1 c2 just functions as a return from a goroutine, I don't see how this could be a problem; similarly if splice c1 c2 N blocks. |
As initially stated, yes, but I think @jimmyfrasche's take:
fixes the ambiguity (and potentially infinite buffer size) nicely. |
I had forgotten the nautical sense. Thanks, it does indeed seem like a good name in light of that. |
Given that these ideas for splice and clone are in part motivated by stream processing chains, perhaps it makes sense to consider them in conjunction with making channels have an extra parameter to specify a block size in addition to channel capacity? Would others consider this issue an appropriate place to discuss this? |
Here are a couple of suggestions made by Doug McIlroy, original author of test/chan/powser[12].go and instigator of pipes in Unix. They are intriguing.
In Doug's words:
====
splice c1 c2
    where channel c1 contains no buffered data,
    identifies the write end of channel c1 with that
    of channel c2. Channel c1 becomes write-only.

clone c
    makes a new read-only channel positioned at the
    same place in the data stream as readable
    channel c. Thereafter both channels read the
    same data sequence from the stream at unrelated
    rates.

Splice allows a filter to elide itself from a pipeline
when it has no further substantive work to do, rather than
going into a copy loop.

Clone enables (buffered) fanout of a data stream.
Buffered data items may be garbage-collected when they
have been delivered to all readers.

These two capabilities are of general utility in stream
processing. golang.org/test/chan/powser1.go is one
application that could profitably use them--to eliminate
the many instances of Copy and also the per-element
go-routines spawned for every Split. Some Unix variants
have offered pipe splicing, and fanout is a staple of
dataflow algorithms. The current workarounds consume an
awful lot of bootless machine cycles.
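For readers unfamiliar with the copy loop mentioned above, here is a minimal sketch of the pattern in today's Go. The filter `skipThenCopy` is a hypothetical example and is not taken from powser1.go. Once it has dropped the first n values it has no substantive work left, yet it must stay alive to forward every remaining value; under the proposal it could presumably splice its input to its output and return instead.

```go
package main

import "fmt"

// skipThenCopy discards the first n values from in, then forwards the rest
// to out unchanged. The second loop is the copy loop: it does no real work
// but still costs a goroutine and an extra channel hop per value.
func skipThenCopy(in <-chan int, out chan<- int, n int) {
	defer close(out)
	for i := 0; i < n; i++ {
		if _, ok := <-in; !ok {
			return
		}
	}
	for v := range in {
		out <- v
	}
}

func main() {
	in := make(chan int)
	out := make(chan int)
	go func() {
		defer close(in)
		for i := 0; i < 5; i++ {
			in <- i
		}
	}()
	go skipThenCopy(in, out, 2)
	for v := range out {
		fmt.Println(v) // prints 2, 3, 4 on separate lines
	}
}
```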