Sending data frame results in transmission of more tcp segments than needed

## What is the issue?

I was using `tonic` (which depends on `h2` for HTTP/2) for a bidi-streaming gRPC service. While sniffing packets, I noticed that when one side sends around 1 KB of data to the other through the gRPC stream (which is backed by an underlying HTTP/2 stream), the data is always transmitted as 2 TCP segments: the 1st carries the HTTP/2 data frame's head, which is 9 bytes, while the 2nd carries the actual payload.

Since `tonic` (and many other libraries) disables Nagle's algorithm by default, this issue affects a relatively wide range of users. Our application continuously transmits messages of around 1 KB, so this issue leads to many extra TCP segments being transmitted, which is a significant overhead and hurts our application's performance.

## What is the root cause?

Yes, `h2` is agnostic to the underlying transport protocol, but I assume that most use cases are built on top of TCP. The problematic code is here:

https://github.com/hyperium/h2/blob/da38b1c49c39ccf7e2e0dca51ebf8505d6905597/src/codec/framed_write.rs#L243-L251

When sending a data frame whose payload size is at least `CHAIN_THRESHOLD` (currently 256), the frame head (9 bytes) is written to the buffer, while the entire payload is left in `self.next`. Then, when the data is transmitted over the transport,

https://github.com/hyperium/h2/blob/da38b1c49c39ccf7e2e0dca51ebf8505d6905597/src/codec/framed_write.rs#L125-L135

the buffer is chained with the queued frame's payload, and the chain is finally written to the transport via

https://github.com/hyperium/h2/blob/da38b1c49c39ccf7e2e0dca51ebf8505d6905597/src/codec/framed_write.rs#L186-L186

Here `buf` is of type `bytes::buf::Chain` from the `bytes` crate, whose `chunk()` method is implemented [as follows](https://github.com/tokio-rs/bytes/-/blob/64c4fa286771ad9e522ffbefc576bcf7b76933d0/src/buf/chain.rs#L141-147):

```rust
    fn chunk(&self) -> &[u8] {
        if self.a.has_remaining() {
            self.a.chunk()
        } else {
            self.b.chunk()
        }
    }
```

Therefore, the buffer, which contains only the data frame's head, is written to the transport first (TCP in most use cases), and the payload is then written separately. When Nagle's algorithm is disabled, the data frame's head and payload end up in separate TCP segments, which causes some overhead.
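To make the effect concrete, here is a minimal, std-only sketch. It does not use `h2` or `bytes`; `RecordingTransport`, `write_chunked`, and `write_coalesced` are hypothetical names invented for this illustration. It assumes a transport where every `write` call may become its own TCP segment (as can happen with Nagle's algorithm disabled), and compares writing the head and payload chunk by chunk against copying them into one contiguous buffer first.

```rust
use std::io::{self, Write};

/// Size of an HTTP/2 frame head in bytes.
const FRAME_HEAD_LEN: usize = 9;

/// Hypothetical stand-in for a transport: records the size of every write call.
/// On a real TCP socket with Nagle's algorithm disabled, each call may end up
/// in its own segment.
#[derive(Default)]
struct RecordingTransport {
    writes: Vec<usize>,
}

impl Write for RecordingTransport {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Mimics writing a chained buffer one chunk at a time:
/// the frame head goes out first, then the payload.
fn write_chunked(head: &[u8], payload: &[u8]) -> Vec<usize> {
    let mut transport = RecordingTransport::default();
    transport.write_all(head).unwrap();
    transport.write_all(payload).unwrap();
    transport.writes
}

/// Copies head and payload into one contiguous buffer, then writes once.
fn write_coalesced(head: &[u8], payload: &[u8]) -> Vec<usize> {
    let mut transport = RecordingTransport::default();
    let mut buf = Vec::with_capacity(head.len() + payload.len());
    buf.extend_from_slice(head);
    buf.extend_from_slice(payload);
    transport.write_all(&buf).unwrap();
    transport.writes
}

fn main() {
    let head = [0u8; FRAME_HEAD_LEN];
    let payload = vec![0u8; 1024];
    println!("chunked:   {:?}", write_chunked(&head, &payload)); // [9, 1024]
    println!("coalesced: {:?}", write_coalesced(&head, &payload)); // [1033]
}
```

The coalesced variant pays for an extra copy of the payload, so this only illustrates the trade-off and is not a proposal for how `h2` should be changed.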


The code behind the first link above (`framed_write.rs#L243-L251`):

```rust
if len >= CHAIN_THRESHOLD {
    let head = v.head();

    // Encode the frame head to the buffer
    head.encode(len, self.buf.get_mut());

    // Save the data frame
    self.next = Some(Next::Data(v));
} else {
```

The code behind the second link above (`framed_write.rs#L125-L135`):

```rust
match self.encoder.next {
    Some(Next::Data(ref mut frame)) => {
        tracing::trace!(queued_data_frame = true);
        let mut buf = (&mut self.encoder.buf).chain(frame.payload_mut());
        ready!(write(
            &mut self.inner,
            self.encoder.is_write_vectored,
            &mut buf,
            cx,
        ))?
    }
```
