Hello! We have observed strange behaviour in a HiveMQ cluster regarding client session invalidation.
We have a HiveMQ cluster of 2 brokers deployed in Kubernetes. We use a custom Docker image based on hivemq3:dns-3.4.6. The base image includes the HiveMQ DNS Discovery Plugin. On top of it we add our own plugin, which provides a REST API to invalidate client sessions. Internally, the plugin calls com.hivemq.spi.services.AsyncClientService#invalidateSession(String clientID) to invalidate the session for the given client.
In our case we have 2 roles: a Client, which uses the Paho MQTT v3 library and subscribes to a certain topic, and a Service, which uses the HiveMQ MQTT library and publishes to that topic. Here is the situation:
- Client connects to HiveMQ with cleanSession=false and subscribes to a certain topic.
- Service is also connected and publishes a message with QoS=1 (Message 1) to the same topic.
- Client receives the message and handles it successfully. Paho automatically sends PUBACK right after the handler method completes.
- Service sends Message 2 (QoS=1) to the topic.
- Client receives it, but gets disconnected before the handler method completes → the broker receives no PUBACK.
- Service sends Message 3 (QoS=1) to the topic.
- Client is disconnected at this moment, so it does not receive it.
- Session for the client is invalidated via our custom plugin.
- Client connects to the broker again and immediately receives Message 2 with the DUP flag. Message 3 is not received, though.
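To make our expectation concrete, here is a minimal toy model of the steps above in plain Java. It is only our mental model of a persistent (cleanSession=false) session, not HiveMQ's actual implementation: the class `SessionModel` and all of its methods are hypothetical names made up for illustration, and it assumes that invalidation drops both the unacknowledged in-flight messages and the messages queued while the client was offline.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of one persistent session on the broker side (hypothetical, not HiveMQ code). */
class SessionModel {
    // Messages sent to the client but not yet acknowledged with PUBACK.
    private final Deque<String> inFlight = new ArrayDeque<>();
    // Messages published while the client was offline, not yet sent.
    private final Deque<String> queued = new ArrayDeque<>();
    private boolean connected = false;

    /** Client connects; returns what is delivered: in-flight messages first (as DUP), then queued ones. */
    List<String> connect() {
        connected = true;
        List<String> delivered = new ArrayList<>();
        for (String m : inFlight) {
            delivered.add(m + " (DUP)");
        }
        while (!queued.isEmpty()) {
            String m = queued.poll();
            inFlight.add(m); // now waiting for its PUBACK
            delivered.add(m);
        }
        return delivered;
    }

    void disconnect() {
        connected = false;
    }

    /** QoS 1 publish to the topic the client is subscribed to. */
    void publish(String message) {
        if (connected) {
            inFlight.add(message);
        } else {
            queued.add(message);
        }
    }

    /** Client acknowledged the message. */
    void puback(String message) {
        inFlight.remove(message);
    }

    /** Assumed effect of session invalidation: all stored session state is dropped. */
    void invalidate() {
        inFlight.clear();
        queued.clear();
    }

    /** Replays the steps from the list above and returns what the client gets on reconnect. */
    static List<String> replayScenario(boolean invalidateBeforeReconnect) {
        SessionModel session = new SessionModel();
        session.connect();
        session.publish("Message 1");
        session.puback("Message 1");   // handled successfully
        session.publish("Message 2");  // delivered, but no PUBACK
        session.disconnect();
        session.publish("Message 3");  // client offline, message is queued
        if (invalidateBeforeReconnect) {
            session.invalidate();
        }
        return session.connect();
    }
}
```

In this model, `SessionModel.replayScenario(true)` delivers nothing on reconnect, which matches what we see with a single broker. With 2 brokers we get Message 2 with the DUP flag instead, as if only the queued messages had been dropped.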
We expected that once the session is invalidated, the client would not receive any of the messages sent before the invalidation when it connects again. But it still receives Message 2, which was sent by the broker but never acknowledged by the client.
The interesting part is that the behaviour is different if we use a single broker instead of the cluster of 2 brokers. If we scale HiveMQ in Kubernetes down to just 1 instance and go through the same steps, the client does not receive any message when it connects to the broker after session invalidation.
Has anybody faced similar issues? It seems to be related to the communication between HiveMQ cluster nodes. Maybe this unacknowledged message stays somewhere internally on one of the nodes even though it should not?
Looking forward to hearing your thoughts on this issue.
Thanks in advance!