MQTT disconnects after a few mins with CSF firewall enabled

ntvbs · January 22, 2024, 1:37pm

Hi all,

We have migrated our MQTT server to a new VPS with the CSF/LFD firewall enabled. With the firewall enabled, our MQTT clients stop communicating after a few minutes. With the firewall turned off, communication resumes instantly.

We cannot pinpoint where the problem comes from. It looks like some TCP traffic is blocked, but only after a certain amount of time, not immediately.

Our MQTT clients connect over an internal network (Strongswan VPN). Any idea what we can tweak in our firewall to allow all MQTT/TCP traffic?

Thanks

Diego · January 22, 2024, 4:49pm

Hello @ntvbs

Welcome to the HiveMQ Community! I’m not very familiar with the CSF firewall, but I can try to help you identify what is happening.

Can you share the CSF/LFD logs from around the time the MQTT clients disconnect?
Is there any Connection Limit Protection enabled?
Is the “ICMP_IN” parameter set to 1?

Kind regards,
Diego from HiveMQ Team

ntvbs · January 26, 2024, 10:30am

Hey Diego,

Thank you for your response.

The weird thing is that the connection doesn’t actually disconnect. Both sides believe there is a working TCP connection going. But after a while the client realizes that the messages aren’t being received by the HiveMQ broker, and then restarts the connection. And then it works again for a while.

So the only thing we see in the broker’s event.log is this:

2024-01-18 15:36:00,112 - Client ID: 356441112180092, IP: 172.17.29.112 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:00,129 - Client ID: 356441112180092, IP: 172.17.29.112, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:00,513 - Client ID: 359215105836358, IP: 172.17.116.48 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:00,525 - Client ID: 359215105836358, IP: 172.17.116.48, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:00,835 - Client ID: 358750177923610, IP: 172.17.124.245 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:00,849 - Client ID: 358750177923610, IP: 172.17.124.245, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:01,067 - Client ID: 356441112179045, IP: 172.17.74.198 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:01,085 - Client ID: 356441112179045, IP: 172.17.74.198, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:01,475 - Client ID: 350857124776898, IP: 172.17.99.154 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:01,495 - Client ID: 350857124776898, IP: 172.17.99.154, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:01,798 - Client ID: 350857124776997, IP: 172.17.99.148 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:01,828 - Client ID: 350857124776997, IP: 172.17.99.148, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:02,134 - Client ID: 356441112179391, IP: 172.17.74.119 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:02,174 - Client ID: 356441112179391, IP: 172.17.74.119, Clean Start: true, Session Expiry: 0 connected.
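A minimal sketch for summarizing such an excerpt, assuming the event.log line format is exactly as shown above. The function name `takeovers` and the regular expression are illustrative, not part of HiveMQ. It pairs each "Another client connected with the same client id" disconnect with the following connect for the same client ID and reports the gap in milliseconds, which makes it easier to see that each entry is a session takeover by a reconnecting client rather than a broker-initiated drop.

```python
import re
from datetime import datetime, timedelta

# Sample lines copied from the event.log excerpt above.
SAMPLE = """\
2024-01-18 15:36:00,112 - Client ID: 356441112180092, IP: 172.17.29.112 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:00,129 - Client ID: 356441112180092, IP: 172.17.29.112, Clean Start: true, Session Expiry: 0 connected.
2024-01-18 15:36:00,513 - Client ID: 359215105836358, IP: 172.17.116.48 was disconnected. reason: Another client connected with the same client id..
2024-01-18 15:36:00,525 - Client ID: 359215105836358, IP: 172.17.116.48, Clean Start: true, Session Expiry: 0 connected.
"""

# Assumed line layout: "<timestamp> - Client ID: <id>, IP: <ip><rest of line>"
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"Client ID: (?P<cid>[^,]+), IP: (?P<ip>[\d.]+)(?P<rest>.*)$"
)
TAKEOVER = "Another client connected with the same client id"


def takeovers(log_text):
    """Return (client_id, ip, gap_ms) for each takeover followed by a connect."""
    pending = {}  # client id -> timestamp of the takeover disconnect
    result = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip lines that do not follow the assumed layout
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        cid, rest = m["cid"], m["rest"]
        if TAKEOVER in rest:
            pending[cid] = ts
        elif rest.rstrip().endswith("connected.") and cid in pending:
            gap = ts - pending.pop(cid)
            result.append((cid, m["ip"], gap // timedelta(milliseconds=1)))
    return result


if __name__ == "__main__":
    for cid, ip, gap_ms in takeovers(SAMPLE):
        print(f"{cid} ({ip}) reconnected {gap_ms} ms after the takeover")
```

Running the same function over a longer stretch of the log would show how often each client is forced to reconnect.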

This is the config we’re using. We haven’t set any connection limit, so ‘max-connections’ should be at its default of -1.

<?xml version="1.0"?>
<!--

    Copyright 2019 dc-square GmbH

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

-->
<hivemq>

    <listeners>
        <tcp-listener>
            <port>1883</port>
            <bind-address>31.14.99.216</bind-address>
        </tcp-listener>

        <tcp-listener>
            <port>1883</port>
            <bind-address>172.16.255.254</bind-address>
            <name>open-world-listener</name>
        </tcp-listener>
    </listeners>

    <anonymous-usage-statistics>
        <enabled>true</enabled>
    </anonymous-usage-statistics>

    <mqtt>
        <no-connect-packet-idle-timeout-millis>30000</no-connect-packet-idle-timeout-millis>
        <session-expiry>
            <max-interval>3600</max-interval>
        </session-expiry>
        <message-expiry>
            <max-interval>3600</max-interval>
        </message-expiry>
        <queued-messages>
            <max-queue-size>100</max-queue-size>
        </queued-messages>
        <retained-publish-ttl>3600</retained-publish-ttl>
        <publish-ttl>3600</publish-ttl>
        <client-session-ttl>3600</client-session-ttl>
        <receive-maximum>
            <server-receive-maximum>10</server-receive-maximum>
        </receive-maximum>
    </mqtt>

</hivemq>

I am not sure how and where you can set the “ICMP_IN” parameter, could you elaborate please?

Diego · January 26, 2024, 11:37am

Hello @ntvbs

You’re welcome. You mentioned, “With the CSF/LFD firewall turned off communication is back instantly”, so most probably the firewall is filtering or dropping the connection. It seems the relevant firewall information is located in /var/log/lfd.log and /etc/csf/csf.deny. Can you share the entries from those files around the same timestamps?

ICMP_IN is a specific configuration value for the CSF firewall.

Which HiveMQ version is being used?

Is the CSF/LFD firewall installed directly on Linux, or is it running as a plug-in?

Kind regards,
Diego from HiveMQ Team

ntvbs · February 7, 2024, 10:35am

Yes, it has to be the firewall, but I’ve gone through its configuration and the raw iptables rules 20 times and I just can’t pinpoint what could cause the problem.

I looked at the ICMP_IN option and it is set, as is ICMP_OUT:

# Allow incoming PING. Disabling PING will likely break external uptime
# monitoring
ICMP_IN = "1"

# Set the per IP address incoming ICMP packet rate for PING requests. This
# ratelimits PING requests which if exceeded results in silently rejected
# packets. Disable or increase this value if you are seeing PING drops that you
# do not want
#
# To disable rate limiting set to "0", otherwise set according to the iptables
# documentation for the limit module. For example, "1/s" will limit to one
# packet per second
ICMP_IN_RATE = "1/s"

Could this rate have anything to do with it?

We first used the latest version, hivemq-ce-2024.1, via Docker. When we noticed the problems we downgraded to the version that was (successfully) running on the old server, hivemq-ce-2020.2, which is what we’ve got running now. This version is running directly on Linux.

We purposely added the whole IP range to csf.allow:
172.17.0.0/16 # MQTT clients
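A quick sanity check for an entry like this, as a sketch using Python's standard `ipaddress` module. It assumes plain `IP-or-CIDR # comment` lines only; the helper names `parse_allow_entry` and `is_allowed` are hypothetical and this is not how CSF itself evaluates the file. The client IPs are taken from the event.log excerpt earlier in the thread.

```python
import ipaddress


def parse_allow_entry(line):
    """Return the network from a plain 'IP-or-CIDR # comment' line, or None."""
    entry = line.split("#", 1)[0].strip()
    return ipaddress.ip_network(entry, strict=False) if entry else None


def is_allowed(ip, allow_lines):
    """True if ip falls inside any network listed in allow_lines."""
    addr = ipaddress.ip_address(ip)
    nets = (parse_allow_entry(line) for line in allow_lines)
    return any(net is not None and addr in net for net in nets)


ALLOW_LINES = ["172.17.0.0/16 # MQTT clients"]
CLIENT_IPS = ["172.17.29.112", "172.17.116.48", "172.17.124.245", "172.17.74.198"]

if __name__ == "__main__":
    for ip in CLIENT_IPS + ["172.16.255.254"]:
        print(ip, "covered" if is_allowed(ip, ALLOW_LINES) else "not covered")
```

All client addresses seen in the log fall inside 172.17.0.0/16. The internal listener's bind address from the config, 172.16.255.254, does not, although that is the server's own address and may not need an entry.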

We also added the MQTT port (1883) to the incoming and outgoing TCP and UDP port lists:

# Allow incoming TCP ports
TCP_IN = "35000:35999,20,21,22,25,53,853,80,110,143,443,465,587,993,995,2222,1883,1884"

# Allow outgoing TCP ports
TCP_OUT = "143,20,21,22,25,53,853,80,110,113,443,587,993,995,2222,1883,1884,3050"

# Allow incoming UDP ports
UDP_IN = "20,21,53,853,80,443,1883,1884"

# Allow outgoing UDP ports
# To allow outgoing traceroute add 33434:33523 to this list 
UDP_OUT = "11335,443,20,21,53,853,113,123,1883,1884"
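To double-check lists like these without reading them by eye, here is a small sketch that expands a comma-separated port list with `low:high` ranges, as seen in the TCP_IN value above, and tests whether a port is covered. The helpers `parse_ports` and `allows` are hypothetical names, and the parsing only assumes the syntax visible in these four settings.

```python
def parse_ports(spec):
    """Expand a list such as "35000:35999,20,21" into (low, high) tuples."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        low, _, high = item.partition(":")
        # A single port is treated as a range of length one.
        ranges.append((int(low), int(high or low)))
    return ranges


def allows(spec, port):
    """True if port is listed in spec, either directly or inside a range."""
    return any(low <= port <= high for low, high in parse_ports(spec))


# Values copied from the settings above.
TCP_IN = "35000:35999,20,21,22,25,53,853,80,110,143,443,465,587,993,995,2222,1883,1884"
TCP_OUT = "143,20,21,22,25,53,853,80,110,113,443,587,993,995,2222,1883,1884,3050"

if __name__ == "__main__":
    for port in (1883, 1884):
        print(port, "in:", allows(TCP_IN, port), "out:", allows(TCP_OUT, port))
```

Both 1883 and 1884 are present in the TCP lists, so the port lists themselves do not appear to be the problem. MQTT runs over TCP, so the UDP entries for these ports are unlikely to play a role here.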

Connection tracking is turned off.
It checks processes but only warns and doesn’t kill them.

You can see the LFD logs from around that time here:
labs.netvibes.nl/2024/lfd-hivemq.log

But it doesn’t seem to show anything except some SSH blocks and some potentially suspicious processes. As I said before, it doesn’t kill any of them.

Diego · February 7, 2024, 11:14am

Hello @ntvbs

Unfortunately, I couldn’t access the LFD log file due to a lack of permissions. Based on the information you have shared, I suggest the following.

Make a clean installation of our latest version (HiveMQ CE 2024.2).
In the config file, configure only the TCP listeners, using different ports (1883 and 1884), as in the sample below.

    <listeners>
        <tcp-listener>
            <port>1883</port>
            <bind-address>31.14.99.216</bind-address>
        </tcp-listener>

        <tcp-listener>
            <port>1884</port>
            <bind-address>172.16.255.254</bind-address>
            <name>open-world-listener</name>
        </tcp-listener>
    </listeners>

Set the “Clean Start” option to false for your MQTT clients.

Kind regards,
Diego from HiveMQ Team
