Serialization of message payload (Confluent Avro format)

Hi,

I am integrating HiveMQ with Confluent Cloud and writing Confluent Flink SQL queries on messages ingested into Kafka topics.

After researching, I found that HiveMQ sends the MQTT topic name as the Kafka record key and the message payload as the value. Hence, I’ve configured the Kafka topic’s key schema as “raw” and its value schema as “confluent avro”.

When I send messages from the HiveMQ client, they are received by the Kafka topic; however, when I run queries in Flink, I get a deserialization error:

Failed to deserialize consumer record due to Caused by: Failed to deserialize Avro record. Caused by: Unknown data format. Magic number does not match

Can you please help with how to send properly serialized data? Also, I have configured the Schema Registry, but it looks like it is only enabled for kafka-to-mqtt deserialization.

Hello @Safina ,

Thank you for reaching out, and let me first welcome you to the HiveMQ Community!

I would first like to clarify how HiveMQ, and MQTT as a whole, handles message payloads. Essentially, when a message is published to a topic, the payload is delivered as opaque binary data, and its formatting is intentionally left ‘as is’ from the original publish request. This is true of data that is published to Kafka via the Kafka extension as well.
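To make that concrete, here is a minimal sketch using the HiveMQ MQTT Java client: whatever bytes you pass to payload() are exactly the bytes the broker and the Kafka extension forward to the Kafka topic, with no re-serialization along the way. The host name, client identifier, MQTT topic, and payload contents below are placeholders, not values from your setup.

```java
// Minimal sketch (HiveMQ MQTT Java client): the bytes passed to payload()
// arrive at the Kafka topic unchanged; HiveMQ does not wrap or re-encode them.
import com.hivemq.client.mqtt.MqttClient;
import com.hivemq.client.mqtt.mqtt5.Mqtt5BlockingClient;

import java.nio.charset.StandardCharsets;

public class RawPublishExample {

    public static void main(String[] args) {
        Mqtt5BlockingClient client = MqttClient.builder()
                .identifier("payload-demo-publisher")      // placeholder client ID
                .serverHost("your-broker.example.com")     // placeholder broker host
                .serverPort(1883)
                .useMqttVersion5()
                .buildBlocking();

        client.connect();

        // These exact bytes become the Kafka record value. If they are plain text
        // (as here), Flink's avro-confluent format will not be able to read them.
        byte[] payload = "plain text, not Confluent Avro".getBytes(StandardCharsets.UTF_8);

        client.publishWith()
                .topic("sensors/demo")                     // placeholder MQTT topic
                .payload(payload)
                .send();

        client.disconnect();
    }
}
```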

With that in mind, I would not expect the message payload itself to be modified in any way on its way to Confluent Cloud. Reviewing the Avro and Confluent documentation, it seems that in order to work with serialized data in Confluent Cloud, an independent Schema Registry is used to manage the schema of the incoming data. Most often, this ‘magic number’ error indicates that the initial header data, which tells the deserializer how to interpret the bytes that follow, is missing or cannot be interpreted successfully. This blog article provides some examples of where this may occur, and how to reconcile these errors.
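In practice, for Flink’s avro-confluent deserializer to accept the record, the bytes you publish over MQTT already need to be in the Confluent Schema Registry wire format: a single 0x00 magic byte, the 4-byte big-endian schema ID from your registry, and then the Avro binary encoding of the record. Below is a rough sketch of that encoding using the Apache Avro library; the schema, field names, and schema ID are illustrative placeholders, and in a real setup you would look the ID up from (or register it with) your Schema Registry rather than hard-coding it.

```java
// Sketch: building a payload in the Confluent Schema Registry wire format
// (magic byte 0x00 + 4-byte schema ID + Avro binary body) before publishing over MQTT.
// Assumes the Apache Avro library is on the classpath; SCHEMA_ID and the schema
// definition are placeholders for whatever is registered in your Schema Registry.
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class ConfluentAvroPayload {

    // Hypothetical schema ID and schema, shown for illustration only.
    private static final int SCHEMA_ID = 100001;
    private static final Schema SCHEMA = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"SensorReading\",\"fields\":["
          + "{\"name\":\"deviceId\",\"type\":\"string\"},"
          + "{\"name\":\"temperature\",\"type\":\"double\"}]}");

    public static byte[] encode(String deviceId, double temperature) throws Exception {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("deviceId", deviceId);
        record.put("temperature", temperature);

        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Confluent wire-format header: magic byte 0x00, then the schema ID
        // as a 4-byte big-endian integer.
        out.write(0);
        out.write(ByteBuffer.allocate(4).putInt(SCHEMA_ID).array());

        // Avro binary body (no embedded schema, which is why the registry ID is required).
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
        encoder.flush();

        return out.toByteArray();
    }
}
```

The resulting byte array is what you would pass to payload() in the publish sketch above, so that the value landing in the Kafka topic matches what the “confluent avro” value schema and Flink expect.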

In addition, HiveMQ offers its own policy and schema enforcement utility, Data Hub, which could serve as a HiveMQ-side alternative to a schema registry implementation, though this does require a self-hosted HiveMQ broker at this time.

Let us know if there is any further assistance we can provide, or if you have any further questions!

Best,
Aaron from the HiveMQ Team