-
Notifications
You must be signed in to change notification settings - Fork 33
eliminate re-use of codecs across identities #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Some Logstash codecs (like CSV) are stateful and cannot be reused across multiple input streams. This change removes a minor optimization, in which the base codec is repeatedly reused for the first concurrent accessor, despite the identities for the identity map not matching previous usages. Instead, when an identity is provided, we ensure that a _clone_ of the base codec is provided for each distinct identity up and until eviction, and that the clone is not reused after its eviction flush.
karenzone
reviewed
Sep 28, 2021
karenzone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add link to PR in changelog entry
andsel
approved these changes
Sep 28, 2021
andsel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good catch!
robbavey
added a commit
to robbavey/logstash-input-file
that referenced
this pull request
Oct 18, 2021
Following logstash-plugins/logstash-codec-multiline#70, clones of the codec are used, rather than the original codec, causing the codec tracer not to be able to trace events that occurred after the clone. This commit removes the physical clone on the CodecTracer, rather treating it as another function to be traced, returning the same Tracer instance, but adding a clone event instead.
andsel
added a commit
to andsel/logstash-input-file
that referenced
this pull request
Apr 29, 2022
With the fix on multiline codec (logstash-plugins/logstash-codec-multiline#70) introduced to avoid sharing of codec across multiple threads, it started to make fail this test that leveraged that shared instance to tracing the codec calls. The multithreading nature of the test and the lack of synchornizaion to access the IdentityCodecMap led to multiple threads overwrite of the codec instances. This overwrite cleans the history of previous tracked calls making it failing. This commit fixes it giving to che #clone a method a simil-singleton functionality, to don't loose the list of tracked calls.
2 tasks
andsel
added a commit
to logstash-plugins/logstash-input-file
that referenced
this pull request
May 2, 2022
There are two failures that this PR fixes:
- tests that referred host or path attribute running against Logstash 8, which by default has ECS enabled.
The fix consisted in surrounding the test with the ECS harness and use the ecs_select to refer attributes with proper name.
- another fix regards the test having timed_out, the codec is auto flushed. It presented 2 problems:
- the first is that to request the subject.stop to trigger the auto_flush call (called by exit_flush method).
- the second regard a regression introduced with eliminate PR logstash-plugins/logstash-codec-multiline#70. The test worked because the the IdentityMapCodec used a shared codec. When this behavior was removed the test, that engage multiple threads, exposed a problem of overwriting of codec reference. This overwriting drove to the lost of data, used in test logic. The fix consists in clone redefinition to make it almost a singleton.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Some Logstash codecs (like CSV) are stateful and cannot be reused across
multiple input streams. This change removes a minor optimization, in which
the base codec is repeatedly reused for the first concurrent accessor, despite
the identities for the identity map not matching previous usages. Instead, when
an identity is provided, we ensure that a clone of the base codec is provided
for each distinct identity up and until eviction, and that the clone is not
reused after its eviction flush.