Consumer poll() does not always return #18
Comments
This seems to be a problem with the underlying librdkafka library.
Is there a possibility that, with this issue, the consumer is skipping messages? I'm trying to match up the message count from one topic to the next, when all that should occur is a simple transform and re-publish to the next topic in the chain. The counts don't add up. I scoured the logs and compared the number of occurrences of a log statement to the topic offsets, and they don't match. I also see a lot of those "no message object" occurrences in the log, and odd (negative) offsets even though I deleted the topics and restarted the pull from new ones. The total logsize is over 1600 messages for this topic, however.
When I check the log file and count occurrences of the string in the message handler, it is about 600 records short (evident in my subsequent topic), and I see over 900 occurrences of the "No message object" error noted above.
Could this issue be causing my consumer to miss messages and also make the
On a whim, I wiped the topics and changed them to a single partition before creating them again, and then re-ran my program. Now the counts match exactly as expected! It appears there is also an issue in the routing of messages if you have many partitions and a single consumer?
The "No message object" issue still occurs, but now my consumer consumed every message and didn't lose any. After changing to a single partition, I have 0 lag, no more negative lag counts, and offset and logsize match. There must also be a bug in partition routing/distribution.
Are you using log compaction on this topic? Can you check the offset consistency of the consumed messages to see if messages are dropped or delivered out of order?
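One way to run the offset consistency check asked about above is to record (topic, partition, offset) for each consumed message and scan the sequence per partition. The following is a minimal sketch, not code from this thread; the function name `check_offsets` and the tuple layout are hypothetical. Note that on a compacted topic, gaps between offsets are expected, because compaction removes older records with the same key, so only out-of-order or repeated offsets would point at a delivery problem.

```python
def check_offsets(records):
    """Scan consumed offsets per (topic, partition) and report anomalies.

    `records` is an iterable of (topic, partition, offset) tuples in the
    order the messages were consumed (hypothetical layout for this sketch).
    Returns a list of (kind, topic, partition, previous_offset, offset).
    """
    highest = {}   # (topic, partition) -> highest offset seen so far
    findings = []
    for topic, partition, offset in records:
        key = (topic, partition)
        prev = highest.get(key)
        if prev is not None:
            if offset <= prev:
                # Offset went backwards or repeated: reordering or redelivery.
                findings.append(
                    ("out_of_order_or_duplicate", topic, partition, prev, offset))
            elif offset > prev + 1:
                # Offsets were skipped: expected with compaction, suspicious otherwise.
                findings.append(("gap", topic, partition, prev, offset))
        if prev is None or offset > prev:
            highest[key] = offset
    return findings
```

In a consumer loop, the tuples could be collected from `msg.topic()`, `msg.partition()` and `msg.offset()` for each successfully consumed message and passed to the function after the run.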
Yes, I have compaction enabled. I created the topic with the following while testing so I can monitor compaction of same-key topics:
@edenhill I haven't checked offset consistency, but I could output the offsets to the log to figure out whether they are non-sequential. I'm wrapping up something else at the moment. I did have to switch to a single partition to fix the data "loss" issue reported alongside the librdkafka bug you confirmed, where the consumer sometimes returns no msg object. I've now played around with the consumer config settings and queues. I have no errors in the logs, and with a single partition I have a 1:1 record count as expected. When I add partitions I get loss (with a single consumer across many partitions).
Are you still seeing this?
I'm running into an odd scenario with the Consumer class implementation where sometimes the poll() return is null or empty. I would expect this to be either an object with a message or an object with an error (sometimes just EOF for a partition), but never empty or null?
Here is my code. Out of 25,000 messages in the topic, there are just under 2,000 warnings in the log that poll() returned no msg object. Under what conditions would this occur, or did I implement incorrectly below?
In logs:
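
For comparison, here is a minimal sketch of a poll loop that keeps the three possible outcomes apart. It assumes the behaviour documented for current confluent-kafka-python releases, where `poll(timeout)` returns `None` when the timeout expires without a message and otherwise returns an object whose `error()` is either `None` or an error. Whether that accounts for the warnings reported in this thread is not established here. The names `consume`, `handle` and `max_polls` are hypothetical, and the consumer is passed in so the sketch does not depend on a running broker.

```python
def consume(consumer, handle, max_polls, timeout=1.0):
    """Poll `consumer` up to `max_polls` times and count each outcome.

    `consumer` is any object with a poll(timeout) method that returns None
    or a message object exposing error() (assumed interface for this sketch).
    `handle` is called once for every message without an error.
    """
    counts = {"messages": 0, "errors": 0, "empty": 0}
    for _ in range(max_polls):
        msg = consumer.poll(timeout)
        # Compare against None explicitly instead of testing truthiness,
        # so a valid message object is never mistaken for "no message".
        if msg is None:
            counts["empty"] += 1
            continue
        if msg.error() is not None:
            # Error events (for example partition EOF) carry no payload.
            counts["errors"] += 1
            continue
        handle(msg)
        counts["messages"] += 1
    return counts
```

Comparing the returned counters against the topic's logsize gives a quick indication of whether empty returns coincide with missing messages or are just idle polls.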