HiveMQ restart after ERROR - Client certificate authentication forced

Hi team,
We frequently see automatic restarts after the error below appears in the logs multiple times. I read that HiveMQ doesn’t have a built-in automatic restart option. Can you suggest any tips on how to prevent this error, or where to check for the automatic restart configuration? In some situations, various client connections go into a WAIT state instead of ESTABLISHED, which also increases the load on HiveMQ. How can we close these TIME_WAIT connections? Any help would be appreciated. Thanks!
ERROR - Client certificate authentication forced but no client certificate was provided. Disconnecting.
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
    at java.base/sun.security.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:1033)
    at com.hivemq.security.ssl.SslClientCertificateHandler.userEventTriggered(SslClientCertificateHandler.java:72)
    at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332)
    at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324)
    at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117)
    at com.hivemq.security.ssl.SslParameterHandler.userEventTriggered(SslParameterHandler.java:56)
    at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346)
    at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332)
    at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324)
    at io.netty.handler.ssl.SslHandler.setHandshakeSuccess(SslHandler.java:1838)
    at io.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:973)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1409)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.ByteToMessageDecoder.handlerRemoved(ByteToMessageDecoder.java:255)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:517)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Thread.java:1583)

Hello @divyasd, and welcome to our community!

The error you’re seeing, “Client certificate authentication forced but no client certificate was provided”, usually occurs when the client-authentication-mode in your config.xml is set to REQUIRED but a client does not present a certificate during the SSL/TLS handshake. With REQUIRED, every client must present a certificate (and hold the matching private key); any client that cannot do so is rejected and disconnected with exactly this error.

Check your config.xml to make sure the client-authentication-mode setting matches how your clients are actually configured. If you don’t want to enforce client certificate authentication, you can change the setting to OPTIONAL or NONE.
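For reference, here is a minimal sketch of the relevant part of a TLS listener in config.xml; the port, paths, and passwords below are placeholders, not values from your installation:

    <!-- config.xml excerpt: inside <hivemq> ... <listeners> -->
    <tls-tcp-listener>
        <port>8883</port>
        <bind-address>0.0.0.0</bind-address>
        <tls>
            <keystore>
                <path>/opt/hivemq/conf/broker-keystore.jks</path>
                <password>keystore-password</password>
                <private-key-password>key-password</private-key-password>
            </keystore>
            <truststore>
                <path>/opt/hivemq/conf/client-truststore.jks</path>
                <password>truststore-password</password>
            </truststore>
            <!-- NONE, OPTIONAL or REQUIRED -->
            <client-authentication-mode>REQUIRED</client-authentication-mode>
        </tls>
    </tls-tcp-listener>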

Hope this helps!

Best regards,
Dasha from The HiveMQ Team

Hi @Daria_H, thank you for your response. That helps somewhat. We want client certificate authentication to remain enforced, but we suspect the load on HiveMQ is increasing because of one or two clients that don’t present certificates, or present invalid ones. We’re not able to see the IP addresses of those clients. Is there a way to find them?

Can you also help me understand whether multiple clients connecting with the same client ID can also increase the load?
We’re trying to debug the constant overload on HiveMQ and its possible causes.

Hi @divyasd,

Thank you for your follow-up question regarding your HiveMQ server performance issues.

Regarding clients connecting with the same client ID: yes, this can definitely contribute to increased load on your HiveMQ broker. When multiple clients use the same client ID, HiveMQ must perform a session takeover for every new connection, disconnecting the existing connection before accepting the new one. If two live clients keep reusing the same ID (for example, both with automatic reconnect enabled), they repeatedly kick each other off, and the resulting connect/disconnect churn consumes additional resources and degrades performance, especially at scale.
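To make that concrete, here is a small hypothetical illustration using the open-source HiveMQ MQTT Client for Java; the host and the identifier "shared-id" are placeholders:

    import com.hivemq.client.mqtt.mqtt5.Mqtt5BlockingClient;
    import com.hivemq.client.mqtt.mqtt5.Mqtt5Client;

    public class TakeoverExample {
        public static void main(String[] args) {
            // First connection claims the client ID "shared-id".
            Mqtt5BlockingClient first = Mqtt5Client.builder()
                    .identifier("shared-id")            // same ID on purpose
                    .serverHost("broker.example.com")   // placeholder host
                    .buildBlocking();
            first.connect();

            // Second connection with the same ID: the broker disconnects "first"
            // and hands the session over to "second". If both sides auto-reconnect,
            // this becomes a constant connect/disconnect loop and extra broker load.
            Mqtt5BlockingClient second = Mqtt5Client.builder()
                    .identifier("shared-id")
                    .serverHost("broker.example.com")
                    .buildBlocking();
            second.connect();
        }
    }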

To identify the clients failing certificate authentication, I recommend enabling more detailed logging for your TLS connections. By adjusting the logging configuration you can capture client IP addresses during failed SSL/TLS handshakes, which should help you pinpoint the problematic clients. This article describes how to enable more detailed logging for TLS connections: https://hivemq.atlassian.net/wiki/x/wAPWMQ?atlOrigin=eyJpIjoiNzMxNjY1MjU4OGY5NGU1ZDkxNDljMGRjMjQ4ZGNlMmQiLCJwIjoiYyJ9
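As a rough sketch only (assuming a standard conf/logback.xml, and taking the package name from the stack trace above; the exact logger and whether the remote IP appears at DEBUG level depend on your HiveMQ version, so please follow the linked article for the recommended approach):

    <!-- conf/logback.xml: add inside the existing <configuration> element -->
    <logger name="com.hivemq.security.ssl" level="DEBUG"/>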

Regarding the TIME_WAIT connections: TIME_WAIT is a normal TCP state that the operating system clears on its own after a short timeout, and these sockets accumulate on the side that closes the connection first, so clients that are repeatedly rejected and disconnected by the broker (such as the ones failing certificate authentication) can leave many of them behind. Consider adjusting your TCP keepalive settings and checking whether any clients are disconnecting abruptly or reconnecting in tight loops rather than keeping a single connection open and closing it cleanly.

  • In the MQTT protocol, the client informs the broker about its keep-alive interval during the connection process. Specifically:
  1. The keep-alive value is sent as part of the CONNECT packet when a client establishes a connection with the MQTT broker.
  2. The keep-alive interval is specified in seconds and defines the maximum time permitted to elapse between two consecutive control packets sent by the client.
  3. If no other messages are being exchanged, the client must send a PINGREQ packet (ping request) within the keep-alive interval, and the broker responds with a PINGRESP (ping response) to maintain the connection.
  4. A keep-alive value of zero (0) means the keep-alive mechanism is disabled (no timeout checking).
  5. If the broker doesn’t receive any message from the client within 1.5 times the keep-alive period, it assumes the connection is broken and will typically close the connection.

This keep-alive mechanism helps detect network issues and half-open connections, which is particularly important in unreliable network environments; a minimal client-side sketch of where this value is set follows below.
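The sketch uses the open-source HiveMQ MQTT Client for Java; the host, port, and client identifier are placeholders, and 30 seconds is just an example value:

    import com.hivemq.client.mqtt.mqtt5.Mqtt5BlockingClient;
    import com.hivemq.client.mqtt.mqtt5.Mqtt5Client;

    public class KeepAliveExample {
        public static void main(String[] args) {
            Mqtt5BlockingClient client = Mqtt5Client.builder()
                    .identifier("sensor-1")             // placeholder client ID
                    .serverHost("broker.example.com")   // placeholder host
                    .serverPort(8883)
                    .sslWithDefaultConfig()             // TLS listener, as in your setup
                    .buildBlocking();

            // The keep-alive interval (in seconds) is carried in the CONNECT packet;
            // the client library then sends PINGREQ packets automatically when idle.
            client.connectWith()
                    .keepAlive(30)
                    .send();
        }
    }

What matters for troubleshooting is that the broker only times a client out after roughly 1.5 times whatever keep-alive value that client sent in its CONNECT packet.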

To help understand what’s causing the constant overload on your HiveMQ broker, I recommend implementing monitoring with tools like Prometheus and Grafana to track detailed metrics of your MQTT and cluster internals. This would provide valuable insights into resource utilization patterns.
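For example, once a metrics endpoint is exposed (for instance via the HiveMQ Prometheus extension), a Prometheus scrape job could look like the sketch below; the host name, port, and metrics path are assumptions that you would adapt to your extension’s configuration:

    # prometheus.yml (excerpt) - values are placeholders
    scrape_configs:
      - job_name: "hivemq"
        metrics_path: /metrics              # path exposed by the metrics extension (check its configuration)
        static_configs:
          - targets: ["hivemq-host:9399"]   # placeholder host and port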

You can find more information about HiveMQ monitoring here:

I hope these suggestions help with your troubleshooting efforts. Please let us know if you need any further assistance.

Best regards,
Dasha from The HiveMQ Team