Kubernetes Operator Tutorial not working

Hi everyone, thanks for having a look at my issue!

I am trying to get through the tutorial for using the HiveMQ Kubernetes Operator (link), however I can’t get through it.

My first issue was that I could only get the helm upgrade command to run when using the workaround from this thread.

After that, when I am executing kubectl get hivemq-clusters it does not give me the complete result that would be expected in the tutorial:

% kubectl get hivemq-clusters
NAME     SIZE   IMAGE            VERSION     STATUS   ENDPOINT   MESSAGE
hivemq   1      hivemq/hivemq4   k8s-4.8.2

The fields STATUS, ENDPOINT and MESSAGE are always empty. I also don’t see any pods that are created by the operator:

kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
hivemq-hivemq-operator-operator-5479d9888f-f6xdb   0/1     Pending   0          11m
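
A pod that stays in Pending has usually not been scheduled onto a node yet, so the scheduler events are the first place to look. The sketch below picks the Pending pods out of the `kubectl get pods` table so each one can be described. The function name `pending_pods` is made up for illustration and is not part of kubectl.

```shell
#!/bin/sh
# Hypothetical helper: print the names of pods whose STATUS column is Pending.
# Reads the table printed by `kubectl get pods` on stdin and skips the header row.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Usage (assumes kubectl points at the affected cluster):
#   kubectl get pods | pending_pods | while read -r pod; do
#     kubectl describe pod "$pod"   # the Events section lists scheduling failures
#   done
#   kubectl get events --sort-by=.lastTimestamp
```

If the Events section mentions insufficient CPU or memory, the node is too small for the requested resources.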

I have seen this behaviour locally on minikube and on my managed Kubernetes cluster in the cloud as well. I hope that you can help me debug the issue. Thanks in advance!

Greetings
Florian

Update: I could get the operator running by increasing the memory limit. The 512M from the default values.yaml does not seem to be enough to run the operator.
I added this to myCustomValues.yaml (I am pretty sure that less than 4G of memory would also be enough):

operator:
  resources:
    limits:
      memory: 4096M
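
For a rough idea of what the two limits mean for the operator's JVM: the operator log further down shows `-XX:MaxRAMPercentage=80` in JAVA_TOOL_OPTIONS. Assuming the JVM in the image is container-aware and sizes its heap from the container memory limit, the heap ceiling can be estimated as below. This is only arithmetic, not a measurement, and `max_heap_mb` is a made-up helper name.

```shell
#!/bin/sh
# Rough estimate: heap ceiling = container memory limit * MaxRAMPercentage / 100.
# Assumption: the JVM derives its maximum heap from the container limit.
max_heap_mb() {
  limit_mb=$1
  percent=$2
  echo $(( limit_mb * percent / 100 ))   # integer division, result in M
}

max_heap_mb 512 80    # default limit from values.yaml
max_heap_mb 4096 80   # the raised limit
```

Everything outside the heap (metaspace, thread stacks, native buffers) has to fit into the remaining 20 percent of the limit.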

However, the operator still does not provision my nodes. Some further information:

% kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
hivemq-hivemq-operator-operator-8585c48d97-9qqgv   1/1     Running   0          2m46s
% kubectl logs hivemq-hivemq-operator-operator-8585c48d97-9qqgv
Picked up JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:InitialRAMPercentage=30 -XX:MaxRAMPercentage=80 -XX:MinRAMPercentage=30
12:56:47.330 [main] INFO  com.hivemq.Application - Preparing SSL files
12:56:55.652 [main] INFO  com.hivemq.Application - Wrote converted key store to /tmp/store.p12
 __  __ _                                  _
|  \/  (_) ___ _ __ ___  _ __   __ _ _   _| |_
| |\/| | |/ __| '__/ _ \| '_ \ / _` | | | | __|
| |  | | | (__| | | (_) | | | | (_| | |_| | |_
|_|  |_|_|\___|_|  \___/|_| |_|\__,_|\__,_|\__|
  Micronaut (v2.4.2)

12:57:23.730 [main] INFO  io.micronaut.runtime.Micronaut - Startup completed in 28000ms. Server Running: https://hivemq-hivemq-operator-operator-8585c48d97-9qqgv:8443
12:57:29.759 [main] INFO  com.hivemq.Operator - Operating from namespace 'default'
12:57:29.760 [main] INFO  com.hivemq.Operator - Initializing HiveMQ operator
12:57:40.535 [main] INFO  com.hivemq.Operator - Operator started in 10775ms
12:57:40.619 [pool-1-thread-1] INFO  com.hivemq.AbstractWatcher - CustomResource watcher running for kinds HiveMQCluster
12:57:49.528 [pool-1-thread-2] INFO  com.hivemq.Operator - Syncing state for cluster hivemq
% kubectl get hivemq-cluster
NAME     SIZE   IMAGE            VERSION     STATUS    ENDPOINT   MESSAGE
hivemq   2      hivemq/hivemq4   k8s-4.8.2   Pending              Initial status

Content of myCustomValues.yml:

hivemq:
  nodeCount: "2"
  cpu: "2"
  memory: "2Gi"
operator:
  resources:
    limits:
      memory: 4096M

Hello @fluediger ,

Your hivemq pods are still below Minimum Hardware Requirements and the operator fails to start them. Try:

hivemq:
  nodeCount: '2'
  cpu: '4'
  memory: 4Gi
  ephemeralStorage: 15Gi
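
A quick sanity check for whether the cluster can fit these values at all is sketched below. It assumes that each HiveMQ pod requests the full `cpu` and `memory` values, which I have not verified against the chart templates, and `hivemq_request_total` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: total CPU and memory requested by the HiveMQ pods,
# given nodeCount, cpu per node and memory per node (in Gi).
hivemq_request_total() {
  node_count=$1
  cpu=$2
  mem_gi=$3
  echo "$(( node_count * cpu )) CPU, $(( node_count * mem_gi ))Gi memory"
}

hivemq_request_total 2 4 4

# Compare the result with what the Kubernetes nodes can actually allocate:
#   kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
```

If the allocatable values are lower than the totals, the broker pods cannot be scheduled regardless of the operator settings.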

Btw, my operator configuration:

operator:
  resources:
    limits:
      cpu: 800m
      memory: 512M
    requests:
      cpu: 200m
      memory: 512M

I hope this helps,
Dasha from HiveMQ team

Hi Dasha and thanks for your reply!

The hivemq configuration you provided is very close to the default configuration. I have tried various combinations of resource configurations and the one I posted above was just the latest one that I have tried when I wrote the question here. I have tried the settings you provided again today but I still don’t have any luck.

Here are some more logs that may help debug the issue:

The result of kubectl describe hivemq-cluster hivemq can be found here: hivemq-cluster-describe.yml · GitHub

The operator logs with log level DEBUG can be found here: operator-logs · GitHub

Is the operator code available somewhere? I couldn’t find it on GitHub.

Thanks a lot

Hi @fluediger ,

hivemq-operator code is not publicly available. Helm charts are available in this repo: GitHub - hivemq/helm-charts: HiveMQ Helm charts

I have tested again, now locally with a Kind cluster. For me it works without editing the operator resources. Here is how I run it:

  • Install kind (https://kind.sigs.k8s.io), for example with brew install kind.
  • Create a cluster: kind create cluster
  • Switch the kubectl context to the Kind cluster!
  • Add the Helm repo: helm repo add hivemq https://hivemq.github.io/helm-charts
  • Get the values.yaml: helm show values hivemq/hivemq-operator > hivemq-values.yaml
  • Edit the hivemq-values.yaml:
hivemq:
  admissionWebhooks:
    enabled: false
  nodeCount: "1"
  cpu: "4"
  memory: "4Gi"
  clusterReplicaCount: "1"
monitoring:
  enabled: true
  dedicated: false

Note: I did not edit any operator values, only hivemq values. I decreased hivemq.nodeCount and adjusted hivemq.clusterReplicaCount accordingly, since there are not enough resources on my laptop to run a cluster with 2 or 3 nodes.

Final steps:

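  • Create a new namespace: kubectl create ns hivemq-demo
  • Switch to the new namespace: kubectl ns hivemq-demo (this needs the ns plugin for kubectl; without it, use kubectl config set-context --current --namespace=hivemq-demo)
  • Install HiveMQ: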
helm upgrade hivemq-test --install hivemq/hivemq-operator --values hivemq-values.yaml
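
The steps above can be collected into one script, sketched here with a dry-run switch so the commands can be reviewed before anything is changed. The helper names `run` and `install_hivemq_demo` and the `DRY_RUN` variable are made up for this sketch. It assumes kind, helm and kubectl are installed and that hivemq-values.yaml has already been generated and edited as described. The namespace switch uses the built-in `kubectl config set-context` so that no plugin is needed.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the steps from this post.
install_hivemq_demo() {
  run kind create cluster &&
  run helm repo add hivemq https://hivemq.github.io/helm-charts &&
  run kubectl create ns hivemq-demo &&
  run kubectl config set-context --current --namespace=hivemq-demo &&
  run helm upgrade hivemq-test --install hivemq/hivemq-operator --values hivemq-values.yaml
}

# Usage:
#   DRY_RUN=1 install_hivemq_demo   # only print the commands
#   install_hivemq_demo             # actually run them
```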

This starts a hivemq cluster with 1 node. There are 2 pods: the hivemq-operator pod and the hivemq-broker pod.

You could check whether it works the same way on your machine.

Kind regards,
Dasha

Thanks for your thorough reply. However, after following the steps, I see the same behavior as before. Switching to kind does not seem to change anything either.

Is there an option to increase the log level of the operator? Maybe that way we can find out why it’s not working properly.

@fluediger , sure, you can adjust the logLevel of the operator to DEBUG or TRACE to get more diagnostic info:

operator:
    logLevel: DEBUG

Options are: OFF INFO WARN ERROR DEBUG TRACE.
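
One way to apply the new level without editing the values file is sketched below. It assumes the release name hivemq-test and the hivemq-values.yaml file from the install step earlier in this thread. The function names are hypothetical, and `<operator-pod-name>` is a placeholder for the name shown by kubectl get pods.

```shell
#!/bin/sh
# Accept only the log levels listed above.
valid_log_level() {
  case "$1" in
    OFF|INFO|WARN|ERROR|DEBUG|TRACE) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: re-apply the release with a different operator log level.
set_operator_log_level() {
  valid_log_level "$1" || { echo "unknown log level: $1" >&2; return 1; }
  helm upgrade hivemq-test --install hivemq/hivemq-operator \
    --values hivemq-values.yaml --set operator.logLevel="$1"
}

# Usage:
#   set_operator_log_level TRACE
#   kubectl logs -f <operator-pod-name>
```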

I have set the log level to TRACE but I can’t really find anything unusual in the log. Could you maybe take a look?

Hi @fluediger ,

Thanks for sharing your log. I can see neither an error nor an exception in it. Please run the following against the operator pod and see if there are any errors indicated:

kubectl describe pod <operator-pod-name>

Please also specify:

  • on what OS your Kubernetes is running
  • hardware spec of the machine that Kubernetes is running on
  • Kubernetes version
  • Java version
  • Helm chart version of hivemq-operator
  • HiveMQ version (in case you are building a custom image)

Thanks,
Dasha from HiveMQ team

Hi Dasha,

kubectl describe pod <operator_id> contains no errors. Here is the “Events” section of the output:

Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  74s   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          73s   kubelet  Container image "hivemq/hivemq-operator:4.7.1" already present on machine
  Normal  Created         73s   kubelet  Created container operator
  Normal  Started         73s   kubelet  Started container operator

OS: MacOS 12.5.1
Hardware: MacBook Pro 2021, M1 Max, 32GB RAM
kubectl version --short output:

Client Version: v1.24.4
Kustomize Version: v4.5.4
Server Version: v1.24.0

Java version shouldn’t really matter because everything is running in containers anyways, right?
HiveMQ operator image version: hivemq/hivemq-operator:4.7.1

Thanks for looking into it

Hi @fluediger , thank you for providing more data regarding your issue. Unfortunately I have bad news for you: the M1 processor is not officially supported, because an image it uses is not built for M1.
Sorry I could not help here.
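
For anyone who wants to check this on their own machine, a minimal sketch follows. It assumes the docker CLI is installed and can reach the registry, and that the manifest JSON it prints contains `"architecture"` entries. The helper names `to_image_arch` and `manifest_has_arch` are made up for illustration.

```shell
#!/bin/sh
# Map `uname -m` output to the architecture names used in image manifests.
to_image_arch() {
  case "$1" in
    x86_64|amd64) echo amd64 ;;
    arm64|aarch64) echo arm64 ;;
    *) echo "$1" ;;
  esac
}

# Read manifest JSON on stdin and succeed if it lists the given architecture.
manifest_has_arch() {
  grep -Eq "\"architecture\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Usage:
#   docker manifest inspect --verbose hivemq/hivemq-operator:4.7.1 \
#     | manifest_has_arch "$(to_image_arch "$(uname -m)")" \
#     && echo "image has a build for this machine" \
#     || echo "no build for this architecture"
#
# Architecture reported by the Kubernetes nodes themselves:
#   kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
```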
Kind regards,
Dasha from HiveMQ team

P.S.: Here is one more hint, in case you continue troubleshooting your deployment. Adding -w or --watch to a kubectl get [resource] command keeps it running and prints a line for each creation, modification, or deletion of that resource. So you can run:
kubectl get pods --watch
and look for errors from the pods that cannot get initialized.

Hi Dasha,

alright thanks for clarifying. I will see what I can do about another development machine.
Have a great day and thanks again for looking into the issue.

Cheers
Florian