Context refresh may deadlock when using Prometheus Exemplars #33070
Comments
Thanks for the report. For a deadlock to occur, two or more threads must be involved. Looking at the thread dump, there only appears to be a single thread involved. There's also no information about the locks that are held, which unfortunately makes it impossible to see what's going on. Can you please try reproducing the problem and, when it occurs, capture a thread dump using |
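To illustrate why lock information matters, here is a minimal sketch that asks the JVM which threads are deadlocked and which lock each one is waiting for, using the JDK's `ThreadMXBean`. The class `DeadlockProbe`, its methods, and the thread names `demo-1` and `demo-2` are hypothetical and exist only for this example. The two demo threads form a generic lock-ordering deadlock, not the one reported in this issue.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration; not part of Spring Boot or Micrometer.
final class DeadlockProbe {

    // Lists each thread the JVM reports as deadlocked, the lock it waits for, and that lock's owner.
    static List<String> describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is found
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        // true, true: also collect locked monitors and ownable synchronizers
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // null if the thread has ended in the meantime
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    // Starts two daemon threads that take the same two monitors in opposite order.
    static void startDeadlockedPair() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startWorker("demo-1", lockA, lockB, bothHoldFirst);
        startWorker("demo-2", lockB, lockA, bothHoldFirst);
    }

    private static void startWorker(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirst) {
        Thread worker = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other worker holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Never reached: the other worker holds this monitor.
                }
            }
        }, name);
        worker.setDaemon(true); // daemon threads do not keep the JVM alive
        worker.start();
    }

    // Polls until a deadlock is reported or the timeout elapses.
    static List<String> awaitDeadlockReport(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> report = describeDeadlocks();
        while (report.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            report = describeDeadlocks();
        }
        return report;
    }
}
```

Calling `DeadlockProbe.startDeadlockedPair()` followed by `DeadlockProbe.awaitDeadlockReport(5000)` should yield one entry per demo thread, each naming the other thread as the lock owner. A dump with only one thread and no lock owners cannot show that relationship.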
Spring Boot's issue tracker isn't really the right place for this. The Micrometer Slack would be better. |
@wilkinsona |
It does not seem to be reproducible on my faster MacBook Pro. |
Attached the kill -3 output. The same code runs with Spring Boot 2.x, taking into account that a lot of Spring's transitive libraries have of course changed. |
Thank you, @goafabric. This dump is much more useful as it shows the deadlock:
This deadlock is the same as reported in spring-projects/spring-framework#23501. I'm not sure what, if anything, we can do about it in Boot. |
I think one possible solution to this would be to start using the Just an example: what do you think about using the Since Exemplars are sampled and, in our case, using them means correlating logs, spans, etc., I think saying that sampling will start only after the application has reached a certain point in its lifecycle is OK. Technically, this is what we do today too, but to solve this issue we could start sampling at an even later point in the application's lifecycle. |
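The idea of starting sampling only after a certain lifecycle point can be sketched in plain Java. This is a rough illustration under stated assumptions, not the approach Spring Boot or Micrometer actually uses: `DeferredLookup`, `markReady`, and the supplier passed to it are hypothetical names. The wrapper returns an empty result and never touches the underlying lookup until the application signals that it is ready, so a metrics thread calling `get()` during bootstrap cannot trigger bean creation.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical wrapper for illustration; not a Spring Boot or Micrometer API.
final class DeferredLookup<T> {

    private final Supplier<T> lookup;
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private volatile T resolved;

    DeferredLookup(Supplier<T> lookup) {
        this.lookup = lookup;
    }

    // Called once the application has reached the chosen lifecycle point.
    void markReady() {
        ready.set(true);
    }

    // Returns empty before markReady(); afterwards resolves the value and caches it.
    Optional<T> get() {
        if (!ready.get()) {
            return Optional.empty(); // the underlying lookup is not invoked yet
        }
        T value = resolved;
        if (value == null) {
            // Concurrent callers may both invoke the lookup here; acceptable for a sketch.
            value = lookup.get();
            resolved = value;
        }
        return Optional.ofNullable(value);
    }
}
```

The trade-off is that no value is available before `markReady()` is called, which matches the point made above that skipping samples early in the lifecycle is acceptable.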
Deferring binding is something we've considered before in #30636 |
Thank you all for looking for a solution. |
Here's a minimal reproducer that uses
A thread dump shows the same deadlock as we saw above:
|
Description
I am experiencing a very strange problem
that's not easy to reproduce.
However, I was able to capture a thread dump via JProfiler.
With the combination of:
Spring Boot 3-RC1 (also latest Snapshot from 7th November)
Jaeger Tracing inside Docker or Kubernetes
Prometheus
Zipkin
Data JPA
The application will occasionally deadlock during bootstrap,
just after the initialization of "ServletWebServerApplicationContext".
Removing one of the dependencies will remedy the situation.
The stack trace below shows that the problem seems to occur when Prometheus triggers the auto-configuration of the tracer.
However, this only happens when combined with JPA.
The full stack trace is attached, as well as the Gradle build file.
Unfortunately, I was not able to create a downsized application that reproduces the error,
so that is all the information I can provide for now.
Question
I also noticed that the rebuilt Micrometer Tracing behaves very differently from the old Sleuth implementation.
The traces include a lot of low-level information (e.g. the Spring Security filter chain).
I was also not able to get complete spans across multiple applications, just separate ones.
Will there be a fix and/or updated documentation with the release of 3.0 GA?
Thanks in advance
thread-trace.zip
build.gradle.zip