very slow pong responses and incorrect amount of bytes for pong response #8308

Open
@Roasbeef

In lnd we have some logic to measure how long it takes for our peer to respond to a ping. If it takes too long, we'll disconnect them. We also have logic to make sure that they send back exactly the amount of pong bytes that we asked for.

Initially we suspected some false positives here, so we added a flag to disable the disconnect, and just log instead.
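The log-instead-of-disconnect behaviour could be sketched roughly as follows. This is an assumption-laden illustration, not lnd's implementation: `handlePongFailure`, `pongFailureAction`, and the `noDisconnect` parameter are hypothetical names, and the real config flag is not named here. Only the "not disconnecting due to config" suffix is taken from the logs in this issue; the "disconnecting" message is invented for the example.

```go
package main

import "fmt"

// pongFailureAction is a hypothetical type describing what to do
// after a pong failure.
type pongFailureAction int

const (
	actionDisconnect pongFailureAction = iota
	actionLogOnly
)

// handlePongFailure decides how to react to a failed pong check.
// When noDisconnect is set (standing in for the config flag), the
// failure is only logged and the peer stays connected.
func handlePongFailure(reason string, noDisconnect bool) (pongFailureAction, string) {
	if noDisconnect {
		return actionLogOnly, fmt.Sprintf(
			"pong response failure: %s -- not disconnecting due to config",
			reason)
	}
	// Hypothetical message for the default (disconnect) path.
	return actionDisconnect, fmt.Sprintf(
		"pong response failure: %s -- disconnecting", reason)
}
```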

My node is running with that flag active (don't disconnect, just log), but it seems that one of my CLN peers is persistently very slow (tens of seconds) when responding to pings:

2025-05-24 01:20:01.946 [WRN] PEER: Peer(X): pong response failure for X@X:9736: pong response does not match expected size. Expected: 3788, Got: 1144. Time waited for this pong: 25.4752982s. Last successful RTT: 592.678988ms. -- not disconnecting due to config
2025-05-24 01:20:18.488 [WRN] GRDB: Channel=939076388748591104 not found in graph cache
2025-05-24 01:20:18.488 [WRN] GRDB: Channel=939076388748591104 not found in graph cache
2025-05-24 01:20:47.910 [WRN] PEER: Peer(X): pong response failure for X@X:9736: pong response does not match expected size. Expected: 3487, Got: 3788. Time waited for this pong: 11.439536257s. Last successful RTT: 592.678988ms. -- not disconnecting due to config

Here we see it taking 25, then 11 seconds to respond to a ping. Both times, it sends back the wrong number of pong bytes.

It doesn't always return the wrong number of bytes, but it is frequently very slow when responding to pings:

2025-05-24 01:19:06.470 [WRN] PEER: Peer(X): pong response failure for X@X:9736: timeout while waiting for pong response. Time waited for this pong: 30.00005563s. Last successful RTT: 592.678988ms. -- not disconnecting due to config
