HiveMQ

AsyncClient Consumer parallel execution?

Hello all,

we are currently switching from Eclipse Paho to the HiveMQ client.
In our Paho version of a subscriber we used shared subscriptions to be able to handle a high load of many parallel messages coming in.

However, since the HiveMQ client offers an async API with a consumer as the callback implementation, we were wondering whether we still need shared subscriptions.

So what actually happens when the consumer that processes “message1” is still at work and a new “message2” for the same subscriber comes in? Will it be processed in parallel by a new consumer instance? Or will it be blocked until processing of “message1” has finished?

And if it does work in parallel, what limits exist (how many threads)? We would expect some kind of thread pool configuration for that.

Thank you!
Cheers
Johannes

Hi @JohnM

Usually shared subscriptions are only needed when scaling an application to multiple machines.
If your processing of the messages can be parallelized, then there should be no reason to use a shared subscription on a single machine.
If the message load will increase in the future, you can still choose shared subscriptions to scale out to multiple machines later.

So what actually happens when the consumer that processes “message1” is still at work and a new “message2” for the same subscriber comes in? Will it be processed in parallel by a new consumer instance? Or will it be blocked until processing of “message1” has finished?

As MQTT provides ordering guarantees, the HiveMQ MQTT Client calls the same callback serially. Callbacks for different subscriptions are executed in parallel.
For a single callback only your application can choose to break up the ordering.
To do this, you can just hand over the messages from the callback to parallel workers.
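The hand-over described above could look roughly like the following sketch. It uses only the JDK: the `Consumer<String>` stands in for the subscription callback, and the class name `ParallelHandoff`, the method `processAll`, and the upper-casing "processing" step are hypothetical placeholders, not part of the HiveMQ MQTT Client API. The point is only that the callback returns immediately and the real work runs on a pool whose size you choose.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class ParallelHandoff {

    /**
     * Feeds the payloads serially into a callback that only hands each one
     * over to a worker pool, then waits for the workers and returns the results.
     */
    public static List<String> processAll(List<String> payloads, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<String> processed = new CopyOnWriteArrayList<>();

        // Stand-in for the subscription callback: called serially, returns at once.
        Consumer<String> callback = payload ->
                pool.execute(() -> {
                    // Hypothetical processing step; ordering is no longer guaranteed here.
                    processed.add(payload.toUpperCase());
                });

        for (String payload : payloads) {
            callback.accept(payload);
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed;
    }
}
```

With this pattern the thread pool size is your own configuration, and the order in which results appear is no longer the order of arrival.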

Which QoS are you using?
When using QoS 1 or 2, you might use manual acknowledgement after processing the messages.
It does not matter in which order you acknowledge messages that are processed in parallel; the HiveMQ MQTT Client will ensure the right order of acknowledgements to the server.
One thing you might have to configure when using a single client: a single connection has a limit on the number of concurrent unacknowledged messages, the so-called receive maximum.
By default this value is set to the maximum (65,535), which should be right for your high-throughput use case.
With MQTT 5 you can adjust the receive maximum to your needs, for example to limit overload of the subscriber.
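To illustrate the point about acknowledgement order, here is a minimal model of the idea, not the client's actual implementation: the application may acknowledge in any order, and acknowledgements are only released once every earlier message has been acknowledged too. The class `OrderedAcks` and its methods are hypothetical names used for this sketch only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OrderedAcks {
    // Message ids in the order they arrived from the server.
    private final Deque<Integer> inFlight = new ArrayDeque<>();
    // Ids the application has acknowledged but that are not yet released.
    private final Set<Integer> done = new HashSet<>();
    // Ids in the order the acknowledgements would go out to the server.
    private final List<Integer> sent = new ArrayList<>();

    /** Records that a message with the given id arrived. */
    public synchronized void received(int id) {
        inFlight.addLast(id);
    }

    /** Called by a worker, in any order, once it has finished a message. */
    public synchronized void acknowledge(int id) {
        done.add(id);
        // Release acknowledgements only from the head of the arrival queue.
        while (!inFlight.isEmpty() && done.remove(inFlight.peekFirst())) {
            sent.add(inFlight.pollFirst());
        }
    }

    /** Returns a copy of the acknowledgements released so far, in order. */
    public synchronized List<Integer> sentAcks() {
        return new ArrayList<>(sent);
    }
}
```

In this model the number of entries in `inFlight` is what a receive maximum would cap: a slow worker holding the oldest message delays the release of all later acknowledgements.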

To answer your question: you should not need shared subscriptions on a single machine, but it depends on your use case: what is “high load” in numbers and how can the processing be parallelized.

Hope this helps
Silvio from the HiveMQ Team


Thanks a lot @SgtSilvio, that clarifies an important issue (sequential delivery per topic).

We are using QoS 1.
We have thousands of completely independent devices sending signals to a handful of topics like:

  • myTopic/+/signalA
  • myTopic/+/signalB

where ‘+’ stands for the publishing device's ID (e.g. ‘1234’).
Our goal is to enable quick acknowledgement of these messages, which the callback puts onto a separate service bus.
Messages from the service bus are later consumed and processed into a key-value store by another app.

We only need correct ordering within the same deviceId+topic combination, so that obsolete information from old messages does not overwrite newer information for a particular device and signal.
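One way to get parallelism while keeping order per deviceId+topic combination is to route each message to a fixed "lane" chosen from its full topic name, so messages for the same device and signal always run on the same single thread. The following is a sketch under that assumption; `KeyedDispatcher`, `dispatch`, and `shutdownAndWait` are hypothetical names, and only JDK classes are used.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class KeyedDispatcher {
    // Each lane is a single-threaded executor, so tasks within a lane run in submission order.
    private final ExecutorService[] lanes;

    public KeyedDispatcher(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    /**
     * Routes the task by the concrete topic (e.g. "myTopic/1234/signalA"),
     * which already contains both the device ID and the signal name.
     */
    public void dispatch(String topic, Runnable task) {
        int lane = Math.floorMod(topic.hashCode(), lanes.length);
        lanes[lane].execute(task);
    }

    /** Stops accepting tasks and waits for all queued tasks to finish. */
    public void shutdownAndWait() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("lane did not finish in time");
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lanes", e);
        }
    }
}
```

Different devices may share a lane, which costs some parallelism but never reorders messages for the same topic, as long as `dispatch` itself is called in arrival order.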

If we get your information right it means that for a particular subscription/topic the server will hold back the next message until the client has acknowledged the current message.
Furthermore each client instance uses a thread pool and each subscription uses exactly one dedicated thread of this pool to process the messages of that same subscription (using the callback function provided)

If true, this raises other questions:

  1. When does the client acknowledge the message? Right after queueing the message to the callback or just after the callback execution of the previous message has finished? We assume the latter (i.e. acknowledgment will wait until just after the callback has finished so there is NO client-side queue of already acknowledged messages waiting for processing).

  2. Suppose we have several clients (client1, client2, client3) sharing a subscription on a wildcard topic ("$share/myGroup/myTopic/+/signal"). How will the server handle this? Say client1 is currently working on message A for “myTopic/AAA/signal” while message B for “myTopic/BBB/signal” (same shared subscription!) is waiting for delivery. When will the server decide (randomly) where to send message B? Will message B be blocked until message A has been acknowledged, even though the topics differ (…/AAA/… versus …/BBB/…)?
    That’s unclear because the callback and initial subscription are the same but the topic is actually different.

So is it all actually about the subscription (myTopic/+/signal) or the topic (myTopic/AAA/signal)?

To enable the most efficient and still consistent processing of the signals we would need to know how to subscribe them in the best way.

Thank you!