More ergonomic API for constructing sockets

I started out wanting to be able to [bind a socket to a given address](https://idea.popcount.org/2014-04-03-bind-before-connect/) (for the purposes of using a particular network interface) before connecting, and quickly discovered the standard library does not provide methods for doing so. I then found `net2::TcpBuilder`, which does provide this feature, but found the API somewhat less ergonomic than I would have liked.

@sfackler pointed out that [RFC 1461](https://github.com/rust-lang/rfcs/pull/1461) (after the failed [RFC 1158](https://github.com/rust-lang/rfcs/pull/1158)) pulled in a couple of changes from `net2` [into the standard library](https://github.com/rust-lang/rust/issues/31766), and @seanmonstar noted that `net2` is a "[Desired out-of-band evaluation](https://internals.rust-lang.org/t/rust-libz-blitz/5184#desired)" in the libz blitz, so I figured I'd send some feedback in the hopes that eventually something like `TcpBuilder` might land in the standard library too.

My suggestions follow below. I'd be happy to file a PR, but @sfackler suggested that given the backwards-incompatibility of these changes, filing a ticket first for discussion would be a good idea. Some of these build on the [Builders enable construction of complex values (C-BUILDER)](https://github.com/brson/rust-api-guidelines#builders-enable-construction-of-complex-values-c-builder) section from @brson's Rust API guidelines, whereas others are just because I like them better. Thoughts and feedback very welcome!

 - The methods on `TcpBuilder` that modify the socket should all take `&mut self`, and return that `&mut` in their `Result`. This is because, while it is *technically* true that a socket can be mutated safely from multiple places, it leaves the consumer of the API confused about whether there's a difference between the original builder and the returned one. For instance, it is *not* clear just from the API (nor from the resulting code) that `tcp` and `tcp2` are *the same socket*, bound to *the same port* in the following code:

   ```rust
   let tcp = TcpBuilder::new_v4().unwrap();
   let tcp2 = tcp.bind("0.0.0.0:4000").unwrap(); // a `&TcpBuilder` pointing at `tcp`
   ```

   Using `&mut self` is also more semantically accurate, since it indicates that we are in fact modifying the underlying socket.

 - The methods on `TcpBuilder` that listen, connect, or otherwise "initiate" the socket, should take `self`, not `&self`. The current API implies that you can re-use a builder to construct a second socket after calling `listen` or `connect` on it once, but this is *not* true. The code will panic if you try to do this, presumably because a single socket cannot be re-used. In theory, the builder could remember the configuration, and lazily create and configure the socket when needed, which would enable this kind of API, but I'm not sure that's really better. I think it's fairly rare that you'll want to re-use a socket configuration.

 - The change above brings us back to the first point about `&mut self` vs `&self`. @brson's post says that:

   > Under the rubric of making easy things easy and hard things possible, all builder methods for a consuming builder should take and returned an owned `self`.

   Which suggests that we should in fact make all the builder methods take `self`. This unfortunately combines poorly with the fact that our methods can fail, and thus return `Result<Self, ...>`. If a method fails, you'd like to still be able to recover the builder, which is tricky if `self` was consumed. Maybe @brson can provide some insight here?

 - This is a more controversial one, but it might be that we should provide a single `SocketBuilder` rather than a separate `TcpBuilder` and `UdpBuilder`. Under the hood, that's what the Berkeley sockets API provides, and it's not clear the distinction is needed here either. Instead, you could imagine `SocketBuilder` being parameterized by UDP/TCP, which would still let us expose the TCP-related methods only for TCP sockets. I personally think this would make the API more readable, and it would avoid the complete duplication of methods between the implementations. Along the same lines, I'd prefer to see `V4`/`V6` be an enum passed to `new`, but I feel less strongly about that particular point.
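
To make the suggestions above a bit more concrete, here is a rough sketch of what such an API could look like. Every name in it (`SocketBuilder`, `Tcp`, `Udp`, `Family`) is hypothetical and exists in neither `net2` nor the standard library. Since `std` cannot create an unbound socket, the sketch only records the configuration and creates the socket in the finalizer; a real implementation would presumably hold the raw socket and apply options eagerly, so that errors surface at the offending call.

```rust
use std::io;
use std::marker::PhantomData;
use std::net::{SocketAddr, TcpListener, UdpSocket};

/// Hypothetical protocol markers (not part of net2 or std).
pub struct Tcp;
pub struct Udp;

/// Hypothetical address family selector passed to `new`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Family {
    V4,
    V6,
}

/// Hypothetical builder, parameterized by protocol.
pub struct SocketBuilder<P> {
    family: Family,
    local: Option<SocketAddr>,
    _proto: PhantomData<P>,
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidInput, msg.to_string())
}

impl<P> SocketBuilder<P> {
    pub fn new(family: Family) -> Self {
        SocketBuilder { family, local: None, _proto: PhantomData }
    }

    /// Shared by TCP and UDP. `&mut self` signals that the builder is
    /// modified; the returned reference is the same builder, for chaining.
    pub fn bind(&mut self, addr: SocketAddr) -> io::Result<&mut Self> {
        let matches = match self.family {
            Family::V4 => addr.is_ipv4(),
            Family::V6 => addr.is_ipv6(),
        };
        if !matches {
            return Err(invalid("address family mismatch"));
        }
        self.local = Some(addr);
        Ok(self)
    }
}

impl SocketBuilder<Tcp> {
    /// TCP-only finalizer. Taking `self` means the builder cannot be reused.
    pub fn listen(self) -> io::Result<TcpListener> {
        let addr = self.local.ok_or_else(|| invalid("socket is not bound"))?;
        TcpListener::bind(addr)
    }
}

impl SocketBuilder<Udp> {
    /// UDP-only finalizer, also consuming the builder.
    pub fn build(self) -> io::Result<UdpSocket> {
        let addr = self.local.ok_or_else(|| invalid("socket is not bound"))?;
        UdpSocket::bind(addr)
    }
}

/// Example use: configure through `&mut`, then consume.
pub fn demo() -> io::Result<SocketAddr> {
    let mut builder = SocketBuilder::<Tcp>::new(Family::V4);
    builder.bind("127.0.0.1:0".parse().unwrap())?;
    let listener = builder.listen()?;
    // A second `builder.listen()` here would not compile: `builder` was moved.
    listener.local_addr()
}
```

Note that with `&mut self` setters and consuming finalizers, the whole thing can no longer be written as a single chained expression, since `listen` cannot be called on the `&mut` returned by `bind`. That is the tension described in the third point above.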

@seanmonstar pointed out that this is unlikely to break existing code, since current users must already be calling the finalizing methods only once (any other use panics). Making the methods take `&mut self` will also likely require minor or no changes, since callers are unlikely to be using the aliased `&` pointer anyway. Moving to `self` will require more changes to user code, as the original builder will no longer be available, but if the regular one-line builder pattern is used with `unwrap` (which is unfortunately common), everything will keep working as usual. Dealing with the error cases will be trickier though, and we need a story there.
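
One possible story for the error cases is the one @seanmonstar floats in the `#rust` discussion: have each consuming method hand the builder back alongside the error. The sketch below illustrates that shape only. `Builder`, `BuildResult`, and the two setters are made-up names, and the builder just stores values rather than touching a real socket.

```rust
use std::io;
use std::net::SocketAddr;

/// Hypothetical consuming builder; it only records configuration.
#[derive(Debug)]
pub struct Builder {
    local: Option<SocketAddr>,
    ttl: Option<u32>,
}

/// On failure, the builder is returned next to the error so it is not lost.
pub type BuildResult = Result<Builder, (io::Error, Builder)>;

impl Builder {
    pub fn new() -> Self {
        Builder { local: None, ttl: None }
    }

    pub fn bind(mut self, addr: &str) -> BuildResult {
        match addr.parse::<SocketAddr>() {
            Ok(parsed) => {
                self.local = Some(parsed);
                Ok(self)
            }
            Err(e) => Err((io::Error::new(io::ErrorKind::InvalidInput, e), self)),
        }
    }

    pub fn ttl(mut self, ttl: u32) -> BuildResult {
        if ttl == 0 || ttl > 255 {
            let err = io::Error::new(io::ErrorKind::InvalidInput, "ttl out of range");
            return Err((err, self));
        }
        self.ttl = Some(ttl);
        Ok(self)
    }

    pub fn local_addr(&self) -> Option<SocketAddr> {
        self.local
    }

    pub fn configured_ttl(&self) -> Option<u32> {
        self.ttl
    }
}

/// Chaining still works through `and_then`; callers that do not care about
/// recovery can drop the builder with `map_err`.
pub fn demo() -> Result<Builder, io::Error> {
    Builder::new()
        .bind("127.0.0.1:0")
        .and_then(|b| b.ttl(64))
        .map_err(|(err, _builder)| err)
}
```

The cost is visible in `demo`: the error type is a tuple, so `?` does not convert it to an `io::Error` on its own, and callers need an explicit `map_err` (or a dedicated error type wrapping both) wherever they want to propagate the failure.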

<details>
<summary>Here's the original discussion with @seanmonstar, @habnabit, and @sfackler from <code>#rust</code>:</summary>
<pre>
jonhoo: why on earth do all the methods on TcpBuilder in net2 take &self
jonhoo: and return Result&lt;&TcpBuilder&gt;
jonhoo: that's a really weird builder pattern
seanmonstar: jonhoo: cause you can mutate the socket without a mutable reference
jonhoo: true, but it's a strange pattern
jonhoo: what does the return value even mean?
jonhoo: for bind() for example, do I need to be using the returned value?
jonhoo: or the original?
jonhoo: or is the original somehow not modified?
sfackler: jonhoo: they're the same pointer
seanmonstar: that was because since all those operations could return an error (they're adjusting options on the socket, not something in rustland)
_habnabit: jonhoo, the return value is for chaining
seanmonstar: it wasn't desirable for you to call build() and get a EINVAL, and be left wondering "well crap, which of those methods had the invalid argument"
jonhoo: sure, but the classic way to do this is to then take `self`
sfackler: jonhoo: no standard library builders work that way iirc
jonhoo: being left with both pointers is weird, because it's not *clear* that they're the same
jonhoo: taking `self` is unambigious
_habnabit: jonhoo, how do you get the socket back out then?
seanmonstar: jonhoo: https://github.com/brson/rust-api-guidelines#builders-enable-construction-of-complex-values-c-builder
seanmonstar: jonhoo: you'd then need it to return Result&lt;TcpStream, (io::Error, TcpStream)&gt;
jonhoo: hmm
jonhoo: I see
seanmonstar: otherwise its lost
jonhoo: right right
jonhoo: I still think I'd prefer it taking an `&mut self` even though it's not technically necessary, just to indicate that you are actually modifying the socket (and thus that you can use either the original or the returned)
jonhoo: just like in brson's pattern above
seanmonstar: its cause TcpStream is an odd duckling, and can be mutated through &self
jonhoo: mmm
seanmonstar: its safe to copy from a stream back onto itself
seanmonstar: so io::copy(&tcp, &tcp) works
jonhoo: oh, sure, I realize that, but for the builder pattern specifically it seems strange to not take &mut self even though *technically* you can
jonhoo: also, are you allowed to connect multiple times from the same builder?
jonhoo: because the current API allows that, right?
seanmonstar: nope, looks like it wil panic if tried again
jonhoo: yeah, I don't think the underlying sockets allows that
seanmonstar: i agree that the semantics of the api feel very not-normal
jonhoo: I guess *technically* the builder could allow that by lazily creating the socket on connect
jonhoo: which is the case for brson's example, and the reason his Command's build takes &self
jonhoo: but somehow it feels cleaner to me, especially for TcpBuilder, to take `self`
jonhoo: since there is just *one* socket that you're constructing
jonhoo: and then I think it follows that it should also be &mut
jonhoo: even though the syscall API allows &self
jonhoo: it'd be a pretty serious breaking change though
seanmonstar: would it though? the most common case is likely building it all at once
jonhoo: ah, you think most code actually conforms to the stricter api
jonhoo: yeah, probably
jonhoo: though it would require people to place `mut` in a couple of places if we made the methods `&mut self`
seanmonstar: i dunno, im sure you'd get push back if you did
seanmonstar: but i'd be silently rooting for you
jonhoo: haha, probably
jonhoo: thanks :p
arete: me too, I remember the API being pretty terrible =)
jonhoo: maybe I'll file a PR to revamp it and fix up the docs
jonhoo: worth filing an issue first to discuss you think?
sfackler: jonhoo: I would
sfackler: that builder is very old and barely used
sfackler: conventions around it are not super clear
sfackler: jonhoo: we moved a bunch of stuff from net2 into libstd a year or so ago, but left those builders out of it because we weren't super sure about the way they should work
jonhoo: sfackler: mmm, fair enough. okay, I'll file a ticket discussing this
</pre>
</details>
