Thread BLOCKED issues of DestinationCache in DefaultSubscriptionRegistry #24395

Closed
@liheyuan

We are using spring-messaging to implement a STOMP server (using SimpleBrokerMessageHandler).
Each client subscribes to 5 channels, and everything works when there are only a few users.
However, once the number of online users exceeds ~700, the WebSocket channel stops responding.

After analysis, I found that many threads are BLOCKED on DestinationCache, as follows:

"pk-ws-worker-100-thread-78" #560 prio=5 os_prio=0 tid=0x00007ff19c182000 nid=0x1252 waiting for monitor entry [0x00007ff0a8d9f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.springframework.messaging.simp.broker.DefaultSubscriptionRegistry$DestinationCache.getSubscriptions(DefaultSubscriptionRegistry.java:269)
        - waiting to lock <0x00000004c007ec20> (a org.springframework.messaging.simp.broker.DefaultSubscriptionRegistry$DestinationCache$1)

The relevant part of the code is as follows:

public LinkedMultiValueMap<String, String> getSubscriptions(String destination, Message<?> message) {
  LinkedMultiValueMap<String, String> result = this.accessCache.get(destination);
  if (result == null) {
    synchronized (this.updateCache) {
      result = new LinkedMultiValueMap<>();
      for (SessionSubscriptionInfo info : subscriptionRegistry.getAllSubscriptions()) {
        for (String destinationPattern : info.getDestinations()) {
          if (getPathMatcher().match(destinationPattern, destination)) {
            for (Subscription sub : info.getSubscriptions(destinationPattern)) {
              result.add(info.sessionId, sub.getId());
            }
          }
        }
      }
      if (!result.isEmpty()) {
        this.updateCache.put(destination, result.deepCopy());
        this.accessCache.put(destination, result);
      }
    }
  }
  return result;
}

As you can see, the code inside the synchronized block traverses all subscriptions, which takes too long and blocks other threads.

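Also, accessCache / updateCache does not help when the client has not yet successfully subscribed: an empty result is never cached, so every message to such a destination repeats the full traversal under the lock, which makes the situation worse.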

We tried increasing the cache limit, but it doesn't help in our case.

To solve the problem, we removed the DestinationCache and
reimplemented it as a Map of destination -> <sessionId -> subsId> inside SessionSubscriptionRegistry
(in our own codebase, of course).
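
To make the idea concrete, here is a minimal sketch of such an index. It is not our actual code and not part of spring-messaging: the class name DestinationIndex and its methods are hypothetical, and it assumes subscriptions use exact destinations (no pattern matching), so a lookup is a plain map access instead of a traversal inside a synchronized block.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not a Spring class.
// Assumes exact-match destinations (no Ant-style patterns).
final class DestinationIndex {

    // destination -> (sessionId -> subscription ids)
    private final Map<String, Map<String, Set<String>>> index = new ConcurrentHashMap<>();

    // Called on SUBSCRIBE: updates only the entry for this destination.
    void add(String destination, String sessionId, String subscriptionId) {
        index.computeIfAbsent(destination, d -> new ConcurrentHashMap<>())
             .computeIfAbsent(sessionId, s -> ConcurrentHashMap.newKeySet())
             .add(subscriptionId);
    }

    // Called on UNSUBSCRIBE. Empty sets are left in place for simplicity;
    // find() skips them.
    void remove(String destination, String sessionId, String subscriptionId) {
        Map<String, Set<String>> sessions = index.get(destination);
        if (sessions == null) {
            return;
        }
        Set<String> ids = sessions.get(sessionId);
        if (ids != null) {
            ids.remove(subscriptionId);
        }
    }

    // Called for each message: a direct lookup without a global lock.
    // Returns a snapshot copy (sessionId -> subscription ids).
    Map<String, List<String>> find(String destination) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Map<String, Set<String>> sessions = index.get(destination);
        if (sessions != null) {
            sessions.forEach((sessionId, ids) -> {
                if (!ids.isEmpty()) {
                    result.put(sessionId, new ArrayList<>(ids));
                }
            });
        }
        return result;
    }
}
```

The trade-off in this sketch is that the index must be kept up to date on every subscribe and unsubscribe, and pattern destinations would still need separate handling.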

After these changes, the server can handle more than 5K online users with no problem.

Meanwhile, I noticed that DefaultSubscriptionRegistry and DestinationCache have been there for many years.

So I wonder: would it be OK to submit a PR?
Or is the existing DestinationCache designed this way for some other reason?

Labels

in: messaging (Issues in messaging modules (jms, messaging))
status: superseded (An issue that has been superseded by another)