Skip to content
This repository was archived by the owner on Mar 10, 2020. It is now read-only.

send-files-stream.js converter does not do back-pressure (it does after all!) #702

Closed
pgte opened this issue Feb 26, 2018 · 5 comments
Closed

Comments

@pgte
Copy link
Contributor

pgte commented Feb 26, 2018

When applying a converter to the stream, the data piping is done by hand, preventing back-pressure from happening.

@bvand
Copy link

bvand commented Mar 17, 2018

@pgte I'm new to the project, happy to try to help apply back-pressure here.
I read through https://nodejs.org/en/docs/guides/backpressuring-in-streams/ and https://nodejs.org/api/stream.html#stream_event_drain and some other documentation - let me know if I'm understanding the flow (no pun intended) of usage & back-pressure here:

Is that all accurate?

Separately, I noticed the comment here: https://github.com/ipfs/js-ipfs-api/blob/master/src/utils/multipart.js#L86-L90. That suggests to me that the back-pressure in multipart.js might not actually work as expected, so I'm a little confused. Is that the case?

Thanks in advance for helping me tackle this!

@pgte
Copy link
Contributor Author

pgte commented Mar 19, 2018

@bvand It looks like there is no back-pressure problem here. It was just my pessimistic personality trait acting up. :)

Read on if you want to know my train of thought on this and an illustration of how the various streams are wired together here:

@bvand that's all accurate.

Separately, I noticed the comment here: https://github.com/ipfs/js-ipfs-api/blob/master/src/utils/multipart.js#L86-L90. That suggests to me that the back-pressure in multipart.js might not actually work as expected, so I'm a little confused. Is that the case?

These lines are pausing the source file while the output from multipart is being written.

Separately, I noticed the comment here: https://github.com/ipfs/js-ipfs-api/blob/master/src/utils/multipart.js#L86-L90. That suggests to me that the back-pressure in multipart.js might not actually work as expected, so I'm a little confused. Is that the case?

The comment is correct, and it refers to This other place where we're also doing a push. It's lacking back-pressure. It should only call back once that one line is flushed. Since it's just the last line, it's not a huge problem.


Discussion

Back-pressure is a mechanism to prevent a source (someone that writes data into a stream) from generating data at a rate higher than the data sink can handle. In our case, the ultimate sink is the HTTP API server, but we can think of our sink as HTTP API client (the "request" rectangle in the diagram below).

To allow for back-pressure to happen, we need to ensure this and only this:

We need to ensure that the _write only calls back once the write is flushed.

(This is how node keeps track of highWaterMark and decides whether to return true or false in stream.write(…)).

It turns out that we only need to ensure that the stream.write(…) function calls back in these two places:

multipart-2

  • 1: calls back here. Multipart calls back here.

  • 2: the request object is a native Node.js HTTP client object, which calls back once the socket data has flushed, thus inheriting the TCP flow control.

After all, it turns out that back-pressure is enabled here.

Sorry about all the fuss..

@pgte pgte closed this as completed Mar 19, 2018
@ghost ghost removed the ready label Mar 19, 2018
@daviddias daviddias changed the title src/utils/send-files-stream.js converter does not do back-pressure send-files-stream.js converter does not do back-pressure (it does after all!) Mar 19, 2018
@bvand
Copy link

bvand commented Mar 20, 2018

@pgte I see, thanks for the in-depth explanation.

Are there currently any performance tests that can help confirm whether or not back-pressure is implemented correctly moving forward? Or do you think it's worthwhile to create another issue for these types of tests?

@pgte
Copy link
Contributor Author

pgte commented Mar 20, 2018

@bvand no, there are no tests that are specific to back-pressure. I think back-pressure is a good characteristic, so I think we should test for it so that we don't lose it in future changes..

BTW, I don't think we need to test back-pressure benchmarking performance: it would "suffice" to create a dummy API server endpoint that takes a long time to reply and see if the source gets paused when the high water mark is reached..

@bvand
Copy link

bvand commented Mar 23, 2018

@pgte that makes sense, I can create an issue for that and get working on it - does that sound good?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants