Description
What is the issue?
I was using tonic (which depends on h2 for http2) for a bidi-streaming gRPC service. With some packet sniffing, I noticed that when one side sends about 1KB of data to the other side through the gRPC stream (which is based on an underlying http2 stream), the data is always transmitted as 2 TCP segments: the 1st one carries the http2 data frame's head, which is 9 bytes, while the 2nd one carries the real payload.
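For reference, the 9-byte head is the fixed HTTP/2 frame header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier (RFC 9113, section 4.1). A minimal sketch of that layout follows; `encode_frame_head` is a hypothetical helper written for illustration, not h2's API:

```rust
/// Size of the fixed HTTP/2 frame head in bytes (RFC 9113, section 4.1).
const FRAME_HEAD_LEN: usize = 9;

/// Frame type code for DATA frames.
const FRAME_TYPE_DATA: u8 = 0x0;

/// Hypothetical helper (not h2's API): builds the 9-byte head that
/// precedes a frame payload on the wire.
fn encode_frame_head(
    payload_len: u32,
    kind: u8,
    flags: u8,
    stream_id: u32,
) -> [u8; FRAME_HEAD_LEN] {
    // The length field is 24 bits wide, so larger values cannot be encoded.
    assert!(payload_len < (1 << 24), "payload length must fit in 24 bits");
    let len = payload_len.to_be_bytes();
    // The most significant bit of the stream identifier is reserved.
    let id = (stream_id & 0x7FFF_FFFF).to_be_bytes();
    [len[1], len[2], len[3], kind, flags, id[0], id[1], id[2], id[3]]
}
```

For a 1024-byte DATA payload on stream 1 with no flags set, `encode_frame_head(1024, FRAME_TYPE_DATA, 0, 1)` yields the bytes `00 04 00 00 00 00 00 00 01`, so the first of the two segments carries only these 9 bytes.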
Since tonic (and many other libraries) disables Nagle's algorithm by default, this issue is likely to affect a relatively wide range of users. Our application keeps transmitting messages of around 1KB, and this issue leads to a lot of extra TCP segments being transmitted, which is a big overhead and hurts our application's performance.
What is the root cause?
Yes, h2 is agnostic to the underlying transport protocol, but I assume that most use cases are built on top of TCP. The problematic code is here:
Lines 243 to 251 in da38b1c
if len >= CHAIN_THRESHOLD {
    let head = v.head();

    // Encode the frame head to the buffer
    head.encode(len, self.buf.get_mut());

    // Save the data frame
    self.next = Some(Next::Data(v));
} else {
When sending a data frame whose payload size is at least CHAIN_THRESHOLD (which is 256 now), the frame head (9 bytes) is written to the buffer, while the entire payload is left in self.next; then, when the data is being transmitted over the transport,
Lines 125 to 135 in da38b1c
match self.encoder.next {
    Some(Next::Data(ref mut frame)) => {
        tracing::trace!(queued_data_frame = true);
        let mut buf = (&mut self.encoder.buf).chain(frame.payload_mut());
        ready!(write(
            &mut self.inner,
            self.encoder.is_write_vectored,
            &mut buf,
            cx,
        ))?
    }
the buffer is chained with the pending payload, and the chain is finally written into the transport via
Line 186 in da38b1c
ready!(Pin::new(writer).poll_write(cx, buf.chunk()))?
Here buf is of type bytes::buf::Chain from the bytes crate, whose chunk() method is implemented as follows:
fn chunk(&self) -> &[u8] {
    if self.a.has_remaining() {
        self.a.chunk()
    } else {
        self.b.chunk()
    }
}
Therefore, the buffer, which contains only the data frame's head, is written to the transport first (TCP in most use cases), and the payload is written separately afterwards. When Nagle's algorithm is disabled, the data frame's head and payload end up in separate TCP segments, which adds overhead.
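The effect can be reproduced in a small std-only model. This is a sketch under simplifying assumptions, not h2's code: `CountingWriter` is a hypothetical stand-in for a socket with Nagle's algorithm disabled, where each write call is treated as one segment, and `chunk` mirrors the first-non-empty-part logic of `Chain::chunk()` quoted above. The second function shows what a single vectored write looks like in the same model.

```rust
use std::io::{self, IoSlice, Write};

/// Hypothetical stand-in for a TCP socket with Nagle's algorithm disabled:
/// every write call is counted as one transmitted segment (a simplification).
#[derive(Default)]
struct CountingWriter {
    writes: usize,
    bytes: Vec<u8>,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn write_vectored(&mut self, bufs: &[IoSlice<'_>]) -> io::Result<usize> {
        // All slices are accepted in a single call.
        self.writes += 1;
        let mut written = 0;
        for b in bufs {
            self.bytes.extend_from_slice(b);
            written += b.len();
        }
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Mirrors the chunk() logic quoted above: the first non-empty part wins.
fn chunk<'a>(head: &'a [u8], payload: &'a [u8]) -> &'a [u8] {
    if !head.is_empty() {
        head
    } else {
        payload
    }
}

/// Writes one chunk at a time, as the non-vectored path does.
fn write_chunked(head: &[u8], payload: &[u8]) -> io::Result<CountingWriter> {
    let mut w = CountingWriter::default();
    let (mut head, mut payload) = (head, payload);
    while !head.is_empty() || !payload.is_empty() {
        let n = w.write(chunk(head, payload))?;
        // Advance whichever part the chunk came from.
        if !head.is_empty() {
            head = &head[n..];
        } else {
            payload = &payload[n..];
        }
    }
    Ok(w)
}

/// Hands head and payload to the writer in a single vectored call.
fn write_vectored_once(head: &[u8], payload: &[u8]) -> io::Result<CountingWriter> {
    let mut w = CountingWriter::default();
    w.write_vectored(&[IoSlice::new(head), IoSlice::new(payload)])?;
    Ok(w)
}
```

With a 9-byte head and a 1024-byte payload, `write_chunked` issues two write calls (9 bytes, then 1024 bytes), while `write_vectored_once` issues one call carrying the same 1033 bytes. Copying the head and payload into one contiguous buffer before writing would have the same effect on the call count, at the cost of an extra copy.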