fix: prevent sub-batch 413's from blocking whole batch #972
Conversation
Force-pushed from 3f1bf7b to 13fd84b.
I was wondering if it's feasible to add a unit test that verifies the new behavior, overriding `TARGET_BULK_BYTES` and doubling the `LogStash::Outputs::ElasticSearch::HttpClient::Pool` client to return a 413 on the POST request.
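Something along these lines might work as a sketch of the suggested test (plain Ruby rather than the plugin's RSpec setup; `FakePool` and `bulk_send` are hypothetical stand-ins, not the plugin's real API):

```ruby
# Hypothetical sketch of the suggested unit test. FakePool stands in for
# LogStash::Outputs::ElasticSearch::HttpClient::Pool; bulk_send stands in
# for the client method under test. Neither name is the plugin's real API.
Response = Struct.new(:code, :body)

# Test double: records every POST body and always answers with HTTP 413.
class FakePool
  attr_reader :posts

  def initialize
    @posts = []
  end

  def post(path, params, body)
    @posts << body
    Response.new(413, "")
  end
end

# Stand-in for the behavior under test: on a 413, emulate a per-action
# error response instead of raising an exception through to the whole batch.
def bulk_send(pool, actions)
  response = pool.post("_bulk", {}, actions.join("\n"))
  if response.code == 413
    { "errors" => true,
      "items"  => actions.map { { "index" => { "status" => 413 } } } }
  else
    { "errors" => false, "items" => [] }
  end
end

pool   = FakePool.new
result = bulk_send(pool, ['{"index":{}}', '{"message":"hello"}'])

raise "expected emulated error response" unless result["errors"]
raise "expected one item per action"     unless result["items"].size == 2
```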
Force-pushed from c6a1006 to 2cd4843.
@andsel I've added specs and improved logging some more.
Force-pushed from 6475020 to 0f601cd.
I've noticed only a minor repetition in the changelog. The changes look OK to me, but I don't understand why it fails on Travis.
Force-pushed from 0f601cd to 9a43f31.
LGTM
The HTTP client breaks batches of actions into sub-batches of up to 20MB, sending larger actions as batches-of-one, and zips the responses together to emulate a single batch response from the Elasticsearch API. When an individual HTTP request is rejected by Elasticsearch (or by an intermediate proxy or load balancer) with an HTTP/1.1 413, we can emulate the error response instead of letting an exception fail the whole batch. This way only the offending events/actions are subject to retry logic. Along the way, we improve `debug`-level logging for sub-batches, and emit clear `warn`-level logs with payload sizes when we hit HTTP 413 rejections.
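The sub-batching described above can be sketched roughly as follows (simplified; `TARGET_BULK_BYTES` and the helper name are illustrative stand-ins, not the plugin's exact internals):

```ruby
# Illustrative sketch of the sub-batching described above, not the plugin's
# exact implementation.
TARGET_BULK_BYTES = 20 * 1024 * 1024 # ~20MB per sub-batch

# Split serialized actions into sub-batches whose combined byte size stays
# at or under the target; an action that alone exceeds the target still goes
# out, as a batch-of-one.
def partition_into_sub_batches(serialized_actions, target_bytes = TARGET_BULK_BYTES)
  batches = []
  current, current_bytes = [], 0
  serialized_actions.each do |action|
    size = action.bytesize
    if current_bytes + size > target_bytes && !current.empty?
      batches << current
      current, current_bytes = [], 0
    end
    current << action
    current_bytes += size
  end
  batches << current unless current.empty?
  batches
end

small = "x" * 10
huge  = "y" * 30
# With a toy 25-byte target, the oversize action lands in its own batch.
batches = partition_into_sub_batches([small, small, huge, small], 25)
```

Each sub-batch is then sent as its own `_bulk` request, and the per-request responses are zipped back together into one batch-shaped response.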
Force-pushed from d970253 to b2ef074.
Reviving, and expanding the scope to address a situation where we could send a <20MB payload that expanded beyond Elasticsearch's 100MB buffers (#823). Bulk grouping is now determined entirely by decompressed size.
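A small illustration of why grouping by decompressed size matters (toy numbers, not the plugin's):

```ruby
require "zlib"

# Repetitive JSON compresses extremely well, so a compressed payload far
# under the wire limit can decompress to many times that size on the
# Elasticsearch side, overrunning its buffers.
doc   = '{"message":"' + ("a" * 1024) + "\"}\n"
batch = doc * 1000 # roughly 1MB decompressed

compressed = Zlib::Deflate.deflate(batch)

# The compression ratio here is large, so sizing sub-batches by compressed
# bytes would badly underestimate what Elasticsearch must buffer; the fix
# is to size sub-batches by decompressed bytes.
ratio = batch.bytesize.to_f / compressed.bytesize
```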
Failing tests on 8.0-SNAPSHOT are unrelated -- it looks like the integration tests do some cleanup of ILM policies without first cleaning up the indices governed by those policies, which will be an error in 8.x. I'll see if the fix is trivial, or file a separate issue to chase that down outside this PR.
LGTM
The default value of Elasticsearch's `action.destructive_requires_name` has changed to `true` in elastic/elasticsearch#66908, which causes our integration tests' wildcard deletes to fail. By specifying this config explicitly, we ensure the desired behaviour is selected.
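For illustration, pinning the behaviour amounts to a single Elasticsearch setting (the exact mechanism for passing it in the integration-test harness may differ):

```yaml
# Illustrative elasticsearch.yml fragment: restore the pre-8.0 default so
# the integration tests' wildcard index deletes keep working.
action.destructive_requires_name: false
```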
Theoretically have a fix in place for the failing integration tests against the ES 8 snapshot -- they recently made a change to the default value of `action.destructive_requires_name`. Once CI goes green, I plan to merge.