Since last week, we have had clients complaining that they see disconnects every 30 minutes. When we look at the logs on one of our MQTT nodes, we can see:
2023-09-25 14:09:05,259 INFO - Limiting the connect-rate of listener 'tls-tcp-listener-8883' to '68' CONNECT/s, because 'MEDIUM' global overload protection was activated.
2023-09-25 14:09:10,061 INFO - Limiting the connect-rate of listener 'tls-tcp-listener-8883' to '35' CONNECT/s, because 'MEDIUM_HIGH' global overload protection was activated.
2023-09-25 14:09:14,835 INFO - 'HIGH' local overload protection activated because of '67114' total running tasks (SingleWriter: 67093, SingleWriterCallbacks: 21, Extension: 0, Persistence: 0, Netty: 0).
2023-09-25 14:09:14,836 INFO - Limiting the connect-rate of listener 'tls-tcp-listener-8883' to '19' CONNECT/s, because 'HIGH' global overload protection was activated.
2023-09-25 14:09:19,659 INFO - 'HIGHER' local overload protection activated because of '78075' total running tasks (SingleWriter: 78022, SingleWriterCallbacks: 53, Extension: 0, Persistence: 0, Netty: 0).
2023-09-25 14:09:19,663 INFO - Limiting the connect-rate of listener 'tls-tcp-listener-8883' to '11' CONNECT/s, because 'HIGHER' global overload protection was activated.
2023-09-25 14:09:24,632 INFO - 'HIGHEST' local overload protection activated because of '89314' total running tasks (SingleWriter: 89304, SingleWriterCallbacks: 10, Extension: 0, Persistence: 0, Netty: 0).
2023-09-25 14:09:24,632 INFO - Limiting the connect-rate of listener 'tls-tcp-listener-8883' to '7' CONNECT/s, because 'HIGHEST' global overload protection was activated.
2023-09-25 14:09:28,059 INFO - Local overload protection deactivated
2023-09-25 14:10:09,331 INFO - Stopped limiting the connect-rate of listener 'tls-tcp-listener-8883', because the global overload protection was deactivated.
I’m wondering what could cause this issue.
We suspect it can occur if our Queued Messages count increases drastically. Could the Shared Subscription Queue Size in that case also start to go up as a result of the Queued Messages increasing drastically?
How, if at all, can we adjust our MQTT configuration to prevent this from happening in the future? Can we set some limitations beyond what we currently have in the following settings (one idea is sketched below the snippets)?
<client-event-history>
    <lifetime>604800</lifetime> <!-- 7 days -->
</client-event-history>
<queued-messages>
    <max-queue-size>259200000</max-queue-size> <!-- 1/s x 3600 x 24 x 3 x 1000 -->
    <strategy>discard-oldest</strategy>
</queued-messages>
<session-expiry>
    <max-interval>259200</max-interval> <!-- 3600 x 24 x 3 (3 days) -->
</session-expiry>
<packets>
    <max-packet-size>262144</max-packet-size> <!-- 256KB -->
</packets>
<internal>
    <option>
        <key>supervision.global.tasks.maximum</key>
        <value>100000</value>
    </option>
    <option>
        <key>supervision.global.tasks.minimum</key>
        <value>1000</value>
    </option>
    <option>
        <key>initial.client-credits.per-tick</key>
        <value>3000</value>
    </option>
    <option>
        <key>initial.client-credits.publish</key>
        <value>75000</value>
    </option>
</internal>
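For example, we were wondering whether a <restrictions> block along these lines would help take some load off the broker during a reconnect storm. The values below are just illustrative guesses on our side, not something we currently run:

<restrictions>
    <!-- cap concurrent connections per node; -1 means unlimited -->
    <max-connections>150000</max-connections>
    <!-- throttle total incoming traffic, in bytes/second; 0 means unlimited -->
    <incoming-bandwidth-throttling>10485760</incoming-bandwidth-throttling>
    <!-- disconnect sockets that send no CONNECT within this many milliseconds -->
    <no-connect-idle-timeout>10000</no-connect-idle-timeout>
</restrictions>

Or would limits like these just move the problem elsewhere?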
To sum up:
1. Can you let us know why overload protection was activated?
2. Is there a direct correlation between Queued Messages and Shared Subscription Queue Size, i.e. when the former starts to grow rapidly, does the latter also start to grow?
3. Can we adjust our config to prevent this from happening in the future, other than the obvious step of decreasing the max-queue-size?
Best regards
Ash