Prevent concurrent access to local breaker in rerank (#128162) #128945
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When an async operator receives a response, we can't create new blocks on the responding thread because multiple threads may adjust the local breaker simultaneously, leading to a data race. To address this, we can either use the global breaker or delay block creation in getOutput. While using the global block factory is simpler, I prefer the second option to use the local breaker when possible. Therefore, I opted to keep the results in the queue and create new blocks in getOutput. Our tests didn't catch this issue because: (1) only one block is created in the test, and (2) there is no delay when simulating the inference service.
Closes #127638
Closes #127051