InMemoryWebSessionStore indirectly causing infinite loop inside tomcat-native OpenSSL under load #26407

Closed
@philsttr

This is a fun one...

When load testing a new app using...

  • Spring Boot 2.4.2
  • WebFlux 5.3.3
  • Spring Security 5.4.2
  • Tomcat 9.0.41
  • tomcat-native 1.2.25
  • APR 1.6.5
  • OpenSSL 1.1.1f

...CPU utilization will max out, and stay maxed out even after the load test completes.

When investigating, I found that two threads in the global boundedElastic Scheduler are each pinning a CPU core at close to 100%, as seen in top (broken out by threads)...

top - 17:56:42 up 12 min,  0 users,  load average: 0.61, 0.24, 0.15
Threads: 112 total,   3 running, 109 sleeping,   0 stopped,   0 zombie
%Cpu(s): 53.8 us,  0.3 sy,  0.0 ni, 45.7 id,  0.0 wa,  0.0 hi,  0.1 si,  0.1 st
MiB Mem :  11852.9 total,   8478.8 free,   1267.9 used,   2106.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  10278.6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  212 bogus     20   0 6635460 493008  18064 R  99.9   4.1   0:07.44 boundedElastic-  <--- CPU maxed out
  170 bogus     20   0 6635460 493008  18064 R  99.7   4.1   0:21.02 boundedElastic-  <--- CPU maxed out
   17 bogus     20   0 6635460 493008  18064 S   9.6   4.1   0:26.19 C2 CompilerThre
  235 bogus     20   0 6635460 493008  18064 S   1.3   4.1   0:00.13 boundedElastic-
   18 bogus     20   0 6635460 493008  18064 S   0.7   4.1   0:04.61 C1 CompilerThre
  234 bogus     20   0 6635460 493008  18064 S   0.7   4.1   0:00.79 boundedElastic-
   85 bogus     20   0    9416   2340   1468 R   0.7   0.0   0:00.43 top
   36 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.39 https-openssl-n
   47 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:01.70 https-openssl-n
  166 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.77 parallel-4
  167 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.15 https-openssl-n
  175 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.11 https-openssl-n
  179 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.09 https-openssl-n
  184 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.11 https-openssl-n
  192 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.09 https-openssl-n
  204 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.09 https-openssl-n
  210 bogus     20   0 6635460 493008  18064 S   0.3   4.1   0:00.10 https-openssl-n
...

Taking a thread dump of the process reveals that the two threads are inside Tomcat's OpenSSLEngine...
(note that nid=0xd4 is hexadecimal for 212, the thread ID shown in the PID column above)

"boundedElastic-8" #86 daemon prio=5 os_prio=0 cpu=128128.31ms elapsed=215.53s allocated=28746K defined_classes=1 tid=0x00007fee00035800 nid=0xd4 runnable  [0x00007fed91cbe000]
   java.lang.Thread.State: RUNNABLE
	at org.apache.tomcat.util.net.openssl.OpenSSLEngine.unwrap(OpenSSLEngine.java:603)
	- locked <0x00000007576d7df8> (a org.apache.tomcat.util.net.openssl.OpenSSLEngine)
	at javax.net.ssl.SSLEngine.unwrap(java.base/SSLEngine.java:637)
	at org.apache.tomcat.util.net.SecureNioChannel.read(SecureNioChannel.java:617)
	at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.fillReadBuffer(NioEndpoint.java:1229)
	at org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper.read(NioEndpoint.java:1141)
	at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:795)
	at org.apache.coyote.http11.Http11InputBuffer.available(Http11InputBuffer.java:675)
	at org.apache.coyote.http11.Http11Processor.available(Http11Processor.java:1201)
	at org.apache.coyote.AbstractProcessor.isReadyForRead(AbstractProcessor.java:838)
	at org.apache.coyote.AbstractProcessor.action(AbstractProcessor.java:577)
	at org.apache.coyote.Request.action(Request.java:432)
	at org.apache.catalina.connector.InputBuffer.isReady(InputBuffer.java:305)
	at org.apache.catalina.connector.CoyoteInputStream.isReady(CoyoteInputStream.java:201)
	at org.springframework.http.server.reactive.ServletServerHttpRequest$RequestBodyPublisher.checkOnDataAvailable(ServletServerHttpRequest.java:295)
	at org.springframework.http.server.reactive.AbstractListenerReadPublisher.changeToDemandState(AbstractListenerReadPublisher.java:222)
	at org.springframework.http.server.reactive.AbstractListenerReadPublisher.access$1000(AbstractListenerReadPublisher.java:48)
	at org.springframework.http.server.reactive.AbstractListenerReadPublisher$State$2.request(AbstractListenerReadPublisher.java:333)
	at org.springframework.http.server.reactive.AbstractListenerReadPublisher$ReadSubscription.request(AbstractListenerReadPublisher.java:260)
	...
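
For reference, the nid-to-thread correlation noted above is just a hexadecimal-to-decimal conversion: the `nid` in a thread dump is the native thread ID in hex, while top shows it in decimal. A minimal sketch in Java (the class and method names are hypothetical, chosen only for illustration):

```java
// Hypothetical helper: converts a thread-dump nid such as "0xd4"
// into the decimal thread ID that top displays in its PID column.
final class NidToTid {

    static int toDecimal(String nid) {
        // Integer.decode understands the "0x" prefix used in thread dumps.
        return Integer.decode(nid);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("0xd4"));
    }
}
```

Calling `NidToTid.toDecimal("0xd4")` yields 212, which matches the first boundedElastic row in the top output.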

In a debug session, I discovered that an infinite loop is executing in Tomcat's OpenSSLEngine.unwrap, where:

  • pendingApp = 2
  • idx = 1
  • endOffset = 1
  • capacity = 16384
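
To show why values like these can be a problem, here is a purely illustrative model of a drain loop that stops making progress. It is not Tomcat's code: the loop structure, the class name, and the `maxPasses` guard are all assumptions made for the sketch. The point is only that if a loop keeps running while `pendingApp > 0`, but can only consume data while `idx < endOffset`, then `idx == endOffset` with `pendingApp == 2` leaves nothing that can change between passes.

```java
// Hypothetical model, NOT Tomcat's OpenSSLEngine.unwrap implementation.
final class StalledLoopSketch {

    /**
     * Tries to drain pendingApp bytes into destination slots idx..endOffset-1.
     * Returns the number of outer-loop passes taken; maxPasses is an
     * artificial cap so the sketch terminates instead of spinning forever.
     */
    static int passesUntilDrained(int pendingApp, int idx, int endOffset,
                                  int capacity, int maxPasses) {
        int passes = 0;
        while (pendingApp > 0 && passes < maxPasses) {
            passes++;
            // Only destinations in [idx, endOffset) can accept data.
            while (idx < endOffset && pendingApp > 0) {
                pendingApp -= Math.min(pendingApp, capacity);
                idx++;
            }
            // If idx == endOffset and pendingApp > 0 here, the next pass
            // sees exactly the same state, so the loop never finishes.
        }
        return passes;
    }

    public static void main(String[] args) {
        // Values observed in the debug session: hits the artificial cap.
        System.out.println(passesUntilDrained(2, 1, 1, 16384, 1000));
        // Same data with one free destination slot: drains in a single pass.
        System.out.println(passesUntilDrained(2, 0, 1, 16384, 1000));
    }
}
```

With the observed values the model only stops because of the cap, whereas giving it one usable destination slot lets it finish in a single pass. Whether the real loop stalls for this reason is not established here.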

I would have expected the OpenSSL I/O code to execute on one of the https-openssl-nio-* threads, not the boundedElastic Scheduler. Therefore, I started investigating why this code was executing on the boundedElastic Scheduler.

After more debugging I narrowed it down to InMemoryWebSessionStore.createWebSession().
This is the only location in this particular app that uses the boundedElastic Scheduler.

The WebSession is being created because Spring Security's WebSessionServerRequestCache is being used, which persists requests in the WebSession.

If I disable the request cache, the WebSession is no longer used, so InMemoryWebSessionStore.createWebSession() is never called and nothing runs on boundedElastic. All I/O is then performed on the https-openssl-nio-* threads, and the infinite loop does not occur.

I haven't fully investigated why the infinite loop occurs, but I assume there is a thread-safety bug somewhere in Tomcat's OpenSSLEngine. (Either that, or it was never intended to be used from multiple threads.) Having said that, I don't think the I/O should be occurring on a boundedElastic thread in the first place, so I did not investigate further.

In other words, in my opinion, using InMemoryWebSessionStore should not cause the OpenSSL I/O to occur on a boundedElastic thread.
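
The kind of thread hop described here can be illustrated without Reactor. The sketch below is an analogy built on `java.util.concurrent.CompletableFuture` from the JDK, not the actual WebFlux or InMemoryWebSessionStore code; the pool name `session-pool-1` and all class and method names are hypothetical. It shows that once one asynchronous stage completes on a worker pool, a stage chained after it without an explicit executor runs on that same worker thread, which resembles request-body reads ending up on a boundedElastic thread after session creation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical analogy using plain JDK classes, not Reactor or Spring.
final class ThreadHopSketch {

    /** Returns the name of the thread that executed the downstream stage. */
    static String downstreamThreadName() {
        // Stand-in for the scheduler that session creation is offloaded to.
        ExecutorService sessionPool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "session-pool-1");
            t.setDaemon(true);
            return t;
        });
        try {
            CountDownLatch downstreamAttached = new CountDownLatch(1);

            // "Create the session" on the worker pool, but only finish once
            // the downstream stage has been attached (keeps this deterministic).
            CompletableFuture<String> session = CompletableFuture.supplyAsync(() -> {
                try {
                    downstreamAttached.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "session";
            }, sessionPool);

            // Downstream work (think: reading the request body). No executor
            // is given, so it runs on whichever thread completes 'session'.
            CompletableFuture<String> downstream =
                    session.thenApply(s -> Thread.currentThread().getName());

            downstreamAttached.countDown();
            return downstream.join();
        } finally {
            sessionPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("downstream stage ran on: " + downstreamThreadName());
    }
}
```

In this sketch the downstream stage reports the worker thread's name rather than the caller's, because nothing moves execution back to the original thread after the offloaded step.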

I have attached a simple application that can be used to reproduce the problem.
After extracting it, run docker-compose up to build and start a container running the Spring Boot app with the above configuration.
Sending a lot of load (>= 2000 calls per second) to the /echo endpoint will reproduce the infinite loop.
However, you can see OpenSSL I/O occurring on the boundedElastic threads with any amount of load.

Labels: status: superseded (an issue that has been superseded by another)