-
Notifications
You must be signed in to change notification settings - Fork 933
AvroConsumer for handling schema registry #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AvroConsumer for handling schema registry #80
Conversation
926e55e to
550688d
Compare
README.md
Outdated
| ``` | ||
| from confluent_kafka.avro import AvroConsumer | ||
| c = AvroConsumer({'bootstrap.servers': 'mybroker,mybroker2', 'group.id': 'greoupid', 'schema.registry.url': 'http://127.0.0.1:9002'}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: greoupid
the schema registry url port almost looks like 9092 (kafka port), while the default port is 8081, so might want to use that instead
README.md
Outdated
| msg = c.poll(10) | ||
| if msg: | ||
| if not msg.error(): | ||
| print(msg.value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want an example that actually uses the deserialized type? This looks just like a standard non-avro consumer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edenhill How do we show the type of msg? msg value is going to be a dict object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea, but it is boring if the example is no different than the vanilla example, right? :)
confluent_kafka/avro/__init__.py
Outdated
| message = super(AvroConsumer, self).poll(timeout) | ||
| if not message: | ||
| return message | ||
| print(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove print
| } | ||
|
|
||
| static PyObject *Message_set_value (Message *self, PyObject *new_val) { | ||
| Py_DECREF(self->value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe both value and key can be NULL, do:
if (self->value)
Py_DECREF(self->value);
```
and same for key
| "\n" | ||
| }, | ||
| { "set_value", (PyCFunction)Message_set_value, METH_O, | ||
| " :returns: None.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing docstring saying what the method actually does
README.md
Outdated
| msg = c.poll(10) | ||
| if msg: | ||
| if not msg.error(): | ||
| if (msg.value() && isinstance(msg.value(), dict)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this ever not true?
From what I understand the deserializer will raise an exception if it can't deserialize a message.
Speaking of which, this example should probably have a proper try:..except block catching such errors, right?
| Py_INCREF(self->value); | ||
| if (self->value) { | ||
| Py_DECREF(self->value); | ||
| self->value = new_val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only sets a new value if there was an old one, which isnt right.
Look at my initial comment.
| "\n" | ||
| }, | ||
| { "set_value", (PyCFunction)Message_set_value, METH_O, | ||
| " Set the field 'Message.value' with new value.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document param and type, see an example here:
https://github.com/confluentinc/confluent-kafka-python/blob/master/confluent_kafka/src/confluent_kafka.c#L741
README.md
Outdated
| elif msg.error().code() != KafkaError._PARTITION_EOF: | ||
| print(msg.error()) | ||
| running = False | ||
| except SerializerError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is SerializerError defined? I can't seem to find it
README.md
Outdated
| print(msg.error()) | ||
| running = False | ||
| except SerializerError: | ||
| print("Unable to decode the message") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print the exception message
| }, | ||
| { "set_value", (PyCFunction)Message_set_value, METH_O, | ||
| " Set the field 'Message.value' with new value.\n" | ||
| " :param: PyObject value: Message.value.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyObject is a C thing, right? Maybe you 'object' or 'anyobject' or such
confluent_kafka/avro/__init__.py
Outdated
| if not message: | ||
| return message | ||
| if not message.error(): | ||
| if message.value(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to check if message.value() is not None here. An empty bytes can be a valid message but wouldn't be deserialized here. (In particular this can be an issue if the serialization format does not encode fields that contain their default values, resulting in an empty, but non-null, value.) Same for the check of the key below.
README.md
Outdated
| except SerializerError: | ||
| print("Unable to decode the message") | ||
| except SerializerError as e: | ||
| exc_type, exc_value, exc_traceback = sys.exc_info() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice, but maybe a bit excessive for an example, I just ment to print the exception string, something like:
print("Message deserialization failed for %s: %s" % (msg, e))
|
I'm happy with this. @ewencp ? |
|
LGTM |
…nt-kafka-python into schema_registry_producer
|
Thanks again, great work! |
Fixes #36