
More ergonomic API for constructing sockets #56

@jonhoo


I started out wanting to be able to bind a socket to a given address (for the purposes of using a particular network interface) before connecting, and quickly discovered the standard library does not provide methods for doing so. I then found net2::TcpBuilder, which does provide this feature, but found the API somewhat less ergonomic than I would have liked.

@sfackler pointed out that RFC 1461 (after the failed RFC 1158) pulled a couple of changes from net2 into the standard library, and @seanmonstar noted that net2 is listed as a "Desired out-of-band evaluation" in the libz blitz, so I figured I'd send some feedback in the hope that something like TcpBuilder might eventually land in the standard library too.

My suggestions follow below. I'd be happy to file a PR, but @sfackler suggested that, given the backwards-incompatibility of these changes, filing a ticket first for discussion would be a good idea. Some of these build on the "Builders enable construction of complex values (C-BUILDER)" section of @brson's Rust API guidelines, whereas others are just because I like them better. Thoughts and feedback very welcome!

  • The methods on TcpBuilder that modify the socket should all take &mut self, and return that &mut in their Result. This is because, while it is technically true that a socket can be mutated safely from multiple places, it leaves the consumer of the API confused about whether there's a difference between the original builder and the returned one. For instance, it is not clear just from the API (nor from the resulting code) that tcp and tcp2 are the same socket, bound to the same port in the following code:

    let tcp = TcpBuilder::new_v4().unwrap();
    let tcp2 = tcp.bind("0.0.0.0:4000").unwrap();

    Using &mut self is also more semantically accurate, since it indicates that we are in fact modifying the underlying socket.

  • The methods on TcpBuilder that listen, connect, or otherwise "initiate" the socket, should take self, not &self. The current API implies that you can re-use a builder to construct a second socket after calling listen or connect on it once, but this is not true. The code will panic if you try to do this, presumably because a single socket cannot be re-used. In theory, the builder could remember the configuration, and lazily create and configure the socket when needed, which would enable this kind of API, but I'm not sure that's really better. I think it's fairly rare that you'll want to re-use a socket configuration.

  • The change above brings us back to the first point about &mut self vs &self. @brson's post says that:

    Under the rubric of making easy things easy and hard things possible, all builder methods for a consuming builder should take and returned an owned self.

    Which suggests that we should in fact make all the builder methods take self. This unfortunately combines poorly with the fact that our methods can fail, and thus return Result<Self, ...>. If a method fails, you'd like to still be able to recover the builder, which is tricky if self was consumed. Maybe @brson can provide some insight here?

  • This is a more controversial one, but it might be that we should provide a single SocketBuilder rather than a separate TcpBuilder and UdpBuilder. Under the hood, that's what the Berkeley Socket API provides, and it's unclear that we need the distinction here either. Instead, you could imagine SocketBuilder being parameterized by UDP/TCP, which would still let us expose the TCP-related methods only for TCP sockets. I personally think this would make the API more readable, and it would avoid the complete duplication of methods between the implementations. Along the same lines, I'd prefer to see V4/V6 be an enum passed to new, but I feel less strongly about that particular point.
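To make the first, second, and fourth suggestions concrete, here is a rough sketch of what the signatures could look like. Everything in it is hypothetical: `SocketBuilder`, the `Tcp`/`Udp` markers, `Family`, and `finish` are names made up for illustration, not net2 or std APIs. Since std cannot create an unbound socket, the sketch just records the address and defers to `TcpListener::bind` / `UdpSocket::bind` at the end, so it only illustrates the shape of the API, not how net2 would implement it.

```rust
use std::io;
use std::marker::PhantomData;
use std::net::{SocketAddr, TcpListener, UdpSocket};

// Hypothetical protocol markers; neither net2 nor std defines these.
pub struct Tcp;
pub struct Udp;

// Address family passed to `new` instead of `new_v4` / `new_v6`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Family {
    V4,
    V6,
}

// Assumption: the builder only records configuration and lets std create
// the socket when it is finalized.
pub struct SocketBuilder<P> {
    family: Family,
    addr: Option<SocketAddr>,
    _proto: PhantomData<P>,
}

impl<P> SocketBuilder<P> {
    pub fn new(family: Family) -> Self {
        SocketBuilder { family, addr: None, _proto: PhantomData }
    }

    // Configuration borrows mutably and returns the same borrow, so there
    // is no second handle to confuse with the original.
    pub fn bind(&mut self, addr: &str) -> io::Result<&mut Self> {
        let addr = addr
            .parse::<SocketAddr>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
        let family_matches = match self.family {
            Family::V4 => addr.is_ipv4(),
            Family::V6 => addr.is_ipv6(),
        };
        if !family_matches {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "address family mismatch",
            ));
        }
        self.addr = Some(addr);
        Ok(self)
    }

    fn bound_addr(&self) -> io::Result<SocketAddr> {
        self.addr.ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidInput, "bind was not called")
        })
    }
}

impl SocketBuilder<Tcp> {
    // Finalizing consumes the builder, so calling `listen` twice is a
    // compile error rather than a runtime panic.
    pub fn listen(self) -> io::Result<TcpListener> {
        TcpListener::bind(self.bound_addr()?)
    }
}

impl SocketBuilder<Udp> {
    // Only the UDP builder exposes this finalizer.
    pub fn finish(self) -> io::Result<UdpSocket> {
        UdpSocket::bind(self.bound_addr()?)
    }
}
```

Note that with these signatures a one-line chain such as `SocketBuilder::<Tcp>::new(Family::V4).bind(addr)?.listen()` does not compile, because `bind` hands back a borrow while `listen` wants ownership. That is the tension with the consuming-builder guideline raised in the third point.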

@seanmonstar pointed out that this is unlikely to break existing code, since current users must already be calling the finalizing methods only once (any other use panics). Making the methods take &mut self will also likely require minor or no changes, since callers are unlikely to be using the aliased & pointer anyway. Moving to self will require more changes to user code, as the original builder will no longer be available, but if the regular one-line builder pattern is used with unwrap (which is unfortunately common), everything will keep working as usual. Dealing with the error cases will be trickier though, and we need a story there.
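One possible story for the error case, sketched under assumptions: a consuming method can hand the builder back inside the error, much like the `(io::Error, ...)` tuple floated in the chat below. `Builder`, `BuildError`, `configured_ttl`, and `ttl_or_default` are hypothetical names, and the range check is only a stand-in for a socket option call that fails.

```rust
use std::io;

#[derive(Debug)]
pub struct Builder {
    ttl: Option<u32>,
}

// Hypothetical error type that carries the builder back to the caller.
#[derive(Debug)]
pub struct BuildError {
    pub error: io::Error,
    pub builder: Builder,
}

impl Builder {
    pub fn new() -> Self {
        Builder { ttl: None }
    }

    // Takes and returns an owned `self`; on failure the builder is not
    // lost, it travels back inside the error value.
    pub fn ttl(mut self, ttl: u32) -> Result<Self, BuildError> {
        // Stand-in for a failing setsockopt call (assumption).
        if ttl == 0 || ttl > 255 {
            let error = io::Error::new(
                io::ErrorKind::InvalidInput,
                "ttl must be in 1..=255",
            );
            return Err(BuildError { error, builder: self });
        }
        self.ttl = Some(ttl);
        Ok(self)
    }

    pub fn configured_ttl(&self) -> Option<u32> {
        self.ttl
    }
}

// Recover the builder from the error and retry with a fallback value.
pub fn ttl_or_default(builder: Builder, ttl: u32) -> Builder {
    match builder.ttl(ttl) {
        Ok(b) => b,
        Err(e) => e.builder.ttl(64).expect("64 is a valid ttl"),
    }
}
```

The cost is that `?` no longer works directly against `io::Result` without a `From<BuildError> for io::Error` conversion, so this trades some ergonomics in the common path for recoverability in the failure path.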

Here's the original discussion with @seanmonstar, @habnabit, and @sfackler from #rust:
jonhoo: why on earth do all the methods on TcpBuilder in net2 take &self
jonhoo: and return Result<&TcpBuilder>
jonhoo: that's a really weird builder pattern
seanmonstar: jonhoo: cause you can mutate the socket without a mutable reference
jonhoo: true, but it's a strange pattern
jonhoo: what does the return value even mean?
jonhoo: for bind() for example, do I need to be using the returned value?
jonhoo: or the original?
jonhoo: or is the original somehow not modified?
sfackler: jonhoo: they're the same pointer
seanmonstar: that was because since all those operations could return an error (they're adjusting options on the socket, not something in rustland)
_habnabit: jonhoo, the return value is for chaining
seanmonstar: it wasn't desirable for you to call build() and get a EINVAL, and be left wondering "well crap, which of those methods had the invalid argument"
jonhoo: sure, but the classic way to do this is to then take `self`
sfackler: jonhoo: no standard library builders work that way iirc
jonhoo: being left with both pointers is weird, because it's not *clear* that they're the same
jonhoo: taking `self` is unambigious
_habnabit: jonhoo, how do you get the socket back out then?
seanmonstar: jonhoo: https://github.com/brson/rust-api-guidelines#builders-enable-construction-of-complex-values-c-builder
seanmonstar: jonhoo: you'd then need it to return Result<TcpStream, (io::Error, TcpStream)>
jonhoo: hmm
jonhoo: I see
seanmonstar: otherwise its lost
jonhoo: right right
jonhoo: I still think I'd prefer it taking an `&mut self` even though it's not technically necessary, just to indicate that you are actually modifying the socket (and thus that you can use either the original or the returned)
jonhoo: just like in brson's pattern above
seanmonstar: its cause TcpStream is an odd duckling, and can be mutated through &self
jonhoo: mmm
seanmonstar: its safe to copy from a stream back onto itself
seanmonstar: so io::copy(&tcp, &tcp) works
jonhoo: oh, sure, I realize that, but for the builder pattern specifically it seems strange to not take &mut self even though *technically* you can
jonhoo: also, are you allowed to connect multiple times from the same builder?
jonhoo: because the current API allows that, right?
seanmonstar: nope, looks like it wil panic if tried again
jonhoo: yeah, I don't think the underlying sockets allows that
seanmonstar: i agree that the semantics of the api feel very not-normal
jonhoo: I guess *technically* the builder could allow that by lazily creating the socket on connect
jonhoo: which is the case for brson's example, and the reason his Command's build takes &self
jonhoo: but somehow it feels cleaner to me, especially for TcpBuilder, to take `self`
jonhoo: since there is just *one* socket that you're constructing
jonhoo: and then I think it follows that it should also be &mut
jonhoo: even though the syscall API allows &self
jonhoo: it'd be a pretty serious breaking change though
seanmonstar: would it though? the most common case is likely building it all at once
jonhoo: ah, you think most code actually conforms to the stricter api
jonhoo: yeah, probably
jonhoo: though it would require people to place `mut` in a couple of places if we made the methods `&mut self`
seanmonstar: i dunno, im sure you'd get push back if you did
seanmonstar: but i'd be silently rooting for you
jonhoo: haha, probably
jonhoo: thanks :p
arete: me too, I remember the API being pretty terrible =)
jonhoo: maybe I'll file a PR to revamp it and fix up the docs
jonhoo: worth filing an issue first to discuss you think?
sfackler: jonhoo: I would
sfackler: that builder is very old and barely used
sfackler: conventions around it are not super clear
sfackler: jonhoo: we moved a bunch of stuff from net2 into libstd a year or so ago, but left those builders out of it because we weren't super sure about the way they should work
jonhoo: sfackler: mmm, fair enough. okay, I'll file a ticket discussing this
