Data read with end of SSL handshake is discarded #665

@cstratton

Description

Essentially, the bug is that

CONCLUDE SSL HANDSHAKE
PLEASE CAN WE DO WEBSOCKETS

works, while

CONCLUDE SSL HANDSHAKE PLEASE CAN WE DO WEBSOCKETS

does not

Expected Behavior

Data following the conclusion of the SSL handshake should be processed, regardless of how quickly it follows the handshake.

Websockets operating over SSL are ultimately using TCP, which is a streaming protocol, meaning that a given read() call will generally return all unclaimed data which has arrived, regardless of any logical boundaries, origin in distinct write() calls, or breaks between packets.

When operating as a server, the final read() call in processHandshake() in SSLSocketChannel2.java often ends up claiming not only the concluding 45 or so bytes of the SSL handshake exchange, but also 300-something bytes of immediately following HTTP traffic - typically the initial Websocket upgrade request.
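The streaming behaviour described above can be seen without a network. The following is a minimal sketch that uses a java.nio.channels.Pipe as a stand-in for the TCP connection (an assumption made so the example runs anywhere; a pipe is also a plain byte stream with no message boundaries). The class and method names are hypothetical and not part of Java-WebSocket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; a Pipe stands in for the TCP connection.
final class CoalescedReadDemo {

    // Performs two separate writes, then ONE read, and returns what that read claimed.
    static String readOnceAfterTwoWrites() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                writeFully(sink, "CONCLUDE SSL HANDSHAKE ");
                writeFully(sink, "PLEASE CAN WE DO WEBSOCKETS");

                ByteBuffer buf = ByteBuffer.allocate(1024);
                source.read(buf); // a single read() call
                buf.flip();       // switch the buffer from filling to draining
                return StandardCharsets.US_ASCII.decode(buf).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the whole string, looping in case a write is partial.
    private static void writeFully(Pipe.SinkChannel sink, String text) throws IOException {
        ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.US_ASCII));
        while (out.hasRemaining()) {
            sink.write(out);
        }
    }

    public static void main(String[] args) {
        System.out.println(readOnceAfterTwoWrites());
    }
}
```

Because both writes have completed before the read is issued, the single read will typically return the bytes of both, which is the same situation the server is in when the end of the handshake and the upgrade request arrive together.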

Current Behavior

Unfortunately, at least as of Java-WebSocket-1.3.0-334-g105e2e0 and preceding recent versions, any remaining data in the inCrypt buffer is dropped when processHandshake calls createBuffers(). This means that the websocket upgrade request contained within the dropped data buffer is never seen, and application programs on both ends fail to receive a connection callback.
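To make the data loss concrete, here is a small sketch of what a rewind() followed by a flip() does to a ByteBuffer that is in read mode and still holds unread bytes. This is not the library's code; the class name, the buffer contents and the sizes are made up for illustration, and it only assumes that inCrypt is a java.nio.ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Java-WebSocket.
final class RewindFlipDemo {

    // Applies rewind() then flip() and reports how many bytes are still readable.
    static int unreadAfterReset(ByteBuffer buf) {
        buf.rewind(); // position = 0
        buf.flip();   // limit = position (0), position = 0
        return buf.remaining();
    }

    public static void main(String[] args) {
        ByteBuffer inCrypt = ByteBuffer.allocate(64);
        // Made-up stand-ins for the end of the handshake and the upgrade request.
        inCrypt.put("HANDSHAKE-TAIL".getBytes(StandardCharsets.US_ASCII));
        inCrypt.put("GET /chat HTTP/1.1".getBytes(StandardCharsets.US_ASCII));
        inCrypt.flip();

        // Pretend the handshake consumed only its own 14 bytes.
        inCrypt.position(14);
        System.out.println("unread before reset: " + inCrypt.remaining());
        System.out.println("unread after reset:  " + unreadAfterReset(inCrypt));
    }
}
```

After the reset the limit is zero, so the bytes that followed the handshake are still physically in the backing array but are no longer reachable through the buffer.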

Possible Solution

We've had initial success modifying createBuffers() so that it does not rewind/flip inCrypt when it still contains unused data. However, it's not clear whether this is entirely safe, or how it interacts with past bugfixes such as the one for #190. It would also seem that if the buffer size changes, the unused data would need to be copied from the old buffer to the new one.

We are also unsure if this should be restricted to the createBuffers() call resulting from the SSL handshake's read() operation. We only operate the library in the server role, and suspect that the SSL handshake completing during a write() would be more likely to occur in the complementary client role?
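The copy-on-resize idea mentioned above could look roughly like the following. This is only a sketch under assumptions: the helper name is hypothetical, it assumes the buffer is kept in read mode (position to limit is the unread data), and it has not been checked against the fix for #190 or against the rest of SSLSocketChannel2.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Java-WebSocket.
final class BufferCarryOver {

    // Returns a read-mode buffer of the required capacity that still holds
    // whatever unread bytes the old buffer had (old may be null).
    static ByteBuffer resizeKeepingUnread(ByteBuffer old, int requiredCapacity) {
        if (old != null && old.capacity() == requiredCapacity) {
            return old; // same size: leave position/limit alone, nothing is lost
        }
        int unread = (old == null) ? 0 : old.remaining();
        // Never allocate less than what must be carried over.
        ByteBuffer fresh = ByteBuffer.allocate(Math.max(requiredCapacity, unread));
        if (old != null) {
            fresh.put(old); // copies the bytes between position and limit
        }
        fresh.flip();       // back to read mode; limit = number of bytes carried over
        return fresh;
    }

    public static void main(String[] args) {
        ByteBuffer old = ByteBuffer.allocate(16);
        old.put("HANDSHAKEGET /".getBytes(StandardCharsets.US_ASCII));
        old.flip();
        old.position(9); // "HANDSHAKE" already consumed

        ByteBuffer resized = resizeKeepingUnread(old, 32);
        System.out.println(StandardCharsets.US_ASCII.decode(resized));
    }
}
```

Whether the same treatment is appropriate for the other buffers, or for a createBuffers() call triggered from write(), is exactly the open question raised above; the sketch only covers the incoming encrypted buffer.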

Steps to Reproduce (for bugs)

The problem is most obvious when a client (or the TCP stack of the client's operating system) combines both the final piece of the SSL handshake and the following initial protocol traffic within the same packet. Using daltoniam/Starscream on an iPhone, for example, initial connections seem to send these in distinct packets and the connection works. But if the websocket is disconnected and a new connection attempt is made soon thereafter, the final piece of the new connection's SSL handshake and its initial data seem to reliably end up in the same packet. As a result of this bug, the portion following the SSL handshake is dropped by createBuffers(), and the connection fails to establish a Websocket. (Packet sniffing shows that the client's source port does increment, so this is not a TCP connection confusion. Additionally, distinct write() calls are actually being used for the SSL handshake vs. application data, but unless a few hundred milliseconds of delay is inserted in between, the operating system/TCP stack merges them into a single packet for stream efficiency, as the design of TCP encourages.)

The problem can also theoretically occur if the client manages to fire off two distinct packets in between the server's read() calls, such as might happen if the server is heavily loaded or experiencing scheduler latency, or in the case of network level issues, especially as Nagle encourages sending an additional packet even with the ack of the previous still unreceived.

The situation could probably also be artificially caused by introducing a half-second sleep() before each read() call, such that all of the data the client intends to send at that point in the conversation will probably arrive before the read() can claim any of it.

Ultimately, communication over TCP needs to handle both the case of a read() call retrieving less than a logical chunk of data, and also the case of it retrieving the end of one logical chunk and all or part of another. Because of the buffer clearing, the latter case is not handled at the specific transition which occurs at the end of the SSL handshake.
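Both cases can be illustrated with a small reassembly sketch. It uses a newline as the boundary between logical chunks purely for illustration (an assumption echoing the two-line example at the top of this report; real TLS records are length-framed, and SSLEngine.unwrap() reports how many bytes it consumed). The class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reassembles newline-delimited chunks from arbitrary reads.
final class ChunkAssembler {

    // Bytes of a chunk whose terminating newline has not arrived yet.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Accepts whatever one read() returned and yields every chunk completed by it.
    List<String> feed(byte[] readResult) {
        List<String> complete = new ArrayList<>();
        for (byte b : readResult) {
            if (b == '\n') {
                complete.add(new String(pending.toByteArray(), StandardCharsets.US_ASCII));
                pending.reset();
            } else {
                pending.write(b); // keep leftovers for the next read instead of dropping them
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        ChunkAssembler assembler = new ChunkAssembler();
        // First read: less than one logical chunk.
        System.out.println(assembler.feed(
                "CONCLUDE SSL HAND".getBytes(StandardCharsets.US_ASCII)));
        // Second read: the end of one chunk plus all of the next.
        System.out.println(assembler.feed(
                "SHAKE\nPLEASE CAN WE DO WEBSOCKETS\n".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The key point is that nothing read past a boundary is discarded: it either completes a chunk now or waits in the pending buffer for the next read.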

Context

An Android-based secure websocket server is unable to reliably accept websocket connection attempts from clients behaving in accordance with the TCP specification.

Your Environment

Android 6.1 with Java-WebSocket-1.3.0-334-g105e2e0 operating in server role
Secure Websockets
Client using https://github.com/daltoniam/Starscream v 3.0.4 on iOS 11.0.2
both server and client on the same local wifi network
