This repository was archived by the owner on Apr 22, 2023. It is now read-only.

child_process descriptor passing doesn't pause underlying handle properly. #7905

@samcday


(affects latest 0.10 and 0.11)


TL;DR version:

  • A Socket calls _handle.readStart() as soon as the connection is established.
  • Under load, the current process can therefore pull some data from the socket before its handle is detached and sent over the IPC channel to another process.
  • The receiving process never sees that initial data.
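To make the race concrete, here is a toy model of it. FakeHandle, poll and simulate are made-up names for illustration and do not correspond to Node's internals; the sketch only assumes that a handle with reading enabled hands any queued bytes to whichever process currently owns it.

```javascript
// Toy model only: FakeHandle stands in for a TCP handle. It is NOT Node's API.
function FakeHandle(bytesFromClient) {
  this.kernelQueue = bytesFromClient.slice(); // bytes nobody has read yet
  this.reading = false;
}

FakeHandle.prototype.readStart = function () { this.reading = true; };
FakeHandle.prototype.readStop = function () { this.reading = false; };

// One event-loop turn: if reading is on, the owning process drains the queue.
FakeHandle.prototype.poll = function (ownerBuffer) {
  if (!this.reading) return;
  while (this.kernelQueue.length > 0) {
    ownerBuffer.push(this.kernelQueue.shift());
  }
};

// Simulates a worker accepting a connection and handing it to the master.
function simulate(stopBeforeSend) {
  var handle = new FakeHandle(['h', 'e', 'l', 'l', 'o']);
  var workerSaw = [];
  var masterSaw = [];

  handle.readStart();                     // happens on connection, per the issue
  if (stopBeforeSend) handle.readStop();  // the proposed fix
  handle.poll(workerSaw);                 // a turn passes before detachment

  handle.readStart();                     // the receiving process starts reading
  handle.poll(masterSaw);
  return { workerSaw: workerSaw.join(''), masterSaw: masterSaw.join('') };
}

console.log(simulate(false)); // { workerSaw: 'hello', masterSaw: '' }
console.log(simulate(true));  // { workerSaw: '', masterSaw: 'hello' }
```

In the first run the worker swallows the initial bytes and the master sees nothing, which matches the symptom described here. In the second run reading is stopped before the hand-off and the master receives everything.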

First, an example:

https://gist.github.com/samcday/d20f91aae0e4dbad27e8

Here's the scenario I'm trying to demonstrate:

  1. Connection is accepted by a cluster worker.
  2. Connection is passed to the master.
  3. Master reads initial data from connection, writes to the connection, and then closes it.

Run server.js in one shell, then run client.js in another. Run it a few times if you don't see the issue at first. You should find that some of the 10 connections often never write back to the client. The readable event never fires in the master because no data arrives after the master receives the connection: the worker had already pulled some of the bytes before the connection descriptor was fully detached and sent to the master.

The example is somewhat contrived. I know you would generally not want to send connections to the master for processing; that is what workers are for! My actual use case involves a worker passing a connection to the master so it can be redirected to another worker.

What I've determined is that Socket._read calls readStart on the underlying handle once the connection is established. Even though this doesn't put the socket into flowing mode, it still pulls bytes from the underlying socket before it gets passed to another process.

I can currently work around this issue by doing something like this:

server.on("connection", function(c) {
    // Stop reading from the underlying handle before the socket is passed on.
    c._handle.readStop();
});

This is a bit of a hack, and it only works because I know that c._handle.readStart() is called from Socket._read in a once handler on Socket connection. Unless I'm missing something, though, this is what child_process.js should be doing in its handleConversion logic.
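As a rough sketch of that idea in user land, the stop could be folded into the send itself. sendSocketPaused is a hypothetical helper, not part of Node, and it relies on the private _handle field, so treat it as an illustration rather than a supported API. It also only helps if no bytes have been pulled yet, for example when the socket is sent from inside the connection handler.

```javascript
// Hypothetical helper, not part of Node. `channel` is anything with a
// send(message, handle) method, such as `process` inside a forked child.
function sendSocketPaused(channel, message, socket) {
  var handle = socket._handle; // private field, may differ between versions
  if (handle && typeof handle.readStop === 'function') {
    handle.readStop(); // stop pulling bytes into the current process
  }
  channel.send(message, socket);
}
```

In a worker this might be called as sendSocketPaused(process, "socket", c) from the connection handler, so that reading stops before the descriptor is handed over.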

I believe this is what is causing #7784.

It's midnight here and I've spent the past 3 days bashing my head against the wall 'till the wee hours of the morning figuring out a problem that was finally traced down to this. So, I apologise if the description reads a little ranty or nonsensical :P
