Description
- Version: At least since Node 12
- Platform: Windows - very common, OSX - very common, Linux - rare
- Subsystem: TLS/HTTPS - very common, HTTP - maybe(?), very rare if exists
What steps will reproduce the bug?
`node test/parallel/test-https-truncate.js`, then dump the network traffic.
The problem has existed for quite some time, but it was masked for most clients and appeared as a seemingly random problem.
How often does it reproduce? Is there a required condition?
Running `node test/parallel/test-https-truncate.js` on Windows is a guaranteed hit.
#23169 describes an almost guaranteed hit on OSX.
Reproduction on Linux is quite difficult and will usually happen only after running the test in a `while /bin/true` loop for some time.
What is the expected behavior?
`res.end()` leads to a proper active closing of the TCP connection.
What do you see instead?
`res.end()` causes the server to directly call `uv_close()`, which immediately destroys the kernel socket, which goes into the TIME_WAIT state. When the client receives this FIN packet, it responds with an ACK and its last data packet; this data packet triggers an RST from the server.
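The failure mode can be reproduced outside Node with plain sockets; a minimal Python sketch (the `time.sleep` delays stand in for network latency, and the exact errno seen by the client is platform-dependent):

```python
import socket
import time

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()

# Active close without a prior shutdown(): the kernel sends a FIN and the
# socket is fully torn down, as happens with a direct uv_close().
conn.close()
time.sleep(0.2)                      # let the client see the FIN

# The client still has a last data packet to send; because the server's
# socket has already been destroyed, this data is answered with an RST.
cli.sendall(b"last data packet")
time.sleep(0.2)                      # let the RST come back

try:
    cli.sendall(b"x")                # writing on a reset connection fails
    reset = False
except OSError:
    reset = True

print("connection reset:", reset)
cli.close()
srv.close()
```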
Additional information
There are two ways to properly close a TCP connection:
- The simultaneous closing, which happens 99% of the time: the higher-level protocol, through some form of a BYE message, signals to both ends to simultaneously call the kernel `close()` - `uv_close()` for us - thus exchanging two FIN/ACK sequences. Today most higher-level protocols do provide some form of a bye message.
- The passive/active closing, a more archaic form of closing the connection, where one end unilaterally closes the connection without the other one expecting it. HTTP/1.0 with "Connection: close" is a classical example. TLS 1.1 also has a provision for an optional unilateral closing. In this case, the end originating the closing, the so-called active end, should send a FIN packet triggered by calling the kernel `shutdown()` - `uv_shutdown()` in our case. Upon receiving the FIN packet, the passive end should ACK it, then send any remaining data in a data packet, and then proceed to send its own FIN. The active end should then destroy the connection with `close()`/`uv_close()`.
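The passive/active sequence can be sketched with plain sockets; in this Python illustration a Unix socket pair stands in for an established TCP connection, and the kernel calls map to `uv_shutdown()`/`uv_close()` as described above:

```python
import socket

# A socket pair stands in for an established TCP connection; the
# shutdown()/close() semantics have the same shape.
active, passive = socket.socketpair()

# Active end: send our FIN with shutdown() instead of destroying the
# socket outright (the uv_shutdown() analogue).
active.shutdown(socket.SHUT_WR)

# Passive end: reads EOF (the peer's FIN), flushes its remaining data,
# then sends its own FIN by closing.
assert passive.recv(4096) == b""
passive.sendall(b"remaining data")
passive.close()

# Active end: drains the late data, sees the peer's FIN, and only then
# releases the socket (the uv_close() analogue).
received = active.recv(4096)
eof = active.recv(4096)
active.close()
print(received, eof)
```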
What currently happens is that when doing `res.end()` from JS, this ends in `net.Socket.close()`, then goes through `TCPWrap`, which does not overload `Close()`, and finally lands in `HandleWrap::Close()`. Here `uv_close()` is called - this is a direct, unscheduled code path. The `uv_shutdown()`, on the other hand, lies on an indirect code path scheduled by a Dispatch micro-task in `LibuvStreamWrap::CreateShutdownWrap()`. The result is that the shutdown happens one or two microseconds after the close, when in fact a proper close should wait for the shutdown to finish.
My opinion is that for TCP connections `uv_close()` should be called only in the `uv_shutdown()` callback.
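In POSIX-socket terms, the proposed ordering is "shutdown first, close only after the shutdown has completed and the peer's FIN has been seen". A hypothetical helper sketching this in Python (the names are illustrative, not Node or libuv API):

```python
import socket

def graceful_close(sock: socket.socket) -> None:
    """Close `sock` actively: FIN first, drain the peer, close last.

    Mirrors the proposed ordering: the uv_close() step must not run
    until the uv_shutdown() step has completed.
    """
    sock.shutdown(socket.SHUT_WR)   # step 1: send our FIN (uv_shutdown analogue)
    while sock.recv(4096):          # step 2: drain any late data until the
        pass                        #         peer's FIN arrives (recv -> b"")
    sock.close()                    # step 3: only now destroy the socket

# Usage over a real loopback TCP connection:
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()

conn.sendall(b"late data")  # data still in flight toward the client
conn.close()                # remote side goes away
graceful_close(cli)         # client still picks up the late data, no RST
print("closed without reset")
srv.close()
```

Because the close is deferred until the drain loop has seen EOF, the late data packet is consumed instead of hitting a destroyed socket and provoking an RST.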
Here is the full exchange with all the layers:
| Client JS | Client Node/libuv | Client Kernel | Remote Server |
|---|---|---|---|
| res.end() | | | |
| | shutdown(SD_SEND) | | |
| | | TCP FIN -> | |
| | | kernel socket goes into FIN_WAIT1 state | |
| | close() | | |
| | | kernel socket goes into TIME_WAIT state | |
| | | | <- TCP ACK |
| | | | <- data |
| | | because the kernel socket is in TIME_WAIT, TCP RST -> | |
And it should be:

| Client JS | Client Node/libuv | Client Kernel | Remote Server |
|---|---|---|---|
| res.end() | | | |
| | shutdown(SD_SEND) | | |
| | | TCP FIN -> | |
| | | kernel socket goes into FIN_WAIT1 state | |
| | | | <- TCP ACK |
| | | | <- data |
| | recv() | | |
| | | TCP ACK -> | |
| | | | <- TCP FIN |
| | | kernel socket goes into FIN_WAIT2 state | |
| | recv() | | |
| | | TCP ACK -> | |
| | close() | | |
| | | kernel socket goes into TIME_WAIT state | |