
Deploying TBMQ PE cluster on Azure using Kubernetes

This guide will help you set up TBMQ PE on AKS (Azure Kubernetes Service).

Prerequisites

Install and configure tools

Before deploying TBMQ on an AKS cluster, you need to install the kubectl, helm, and az tools.

After installation, log in to the Azure CLI using the following command:

az login

Clone TBMQ PE K8S repository

git clone -b release-2.2.0 https://github.com/thingsboard/tbmq-pe-k8s.git
cd tbmq-pe-k8s/azure

Define environment variables

Define the environment variables that will be used in the commands later in this guide.

We assume you are using Linux. Execute the following commands:

export AKS_RESOURCE_GROUP=TBMQResources
export AKS_LOCATION=eastus
export AKS_GATEWAY=tbmq-gateway
export TB_CLUSTER_NAME=tbmq-cluster
export TB_DATABASE_NAME=tbmq-db
echo "Your variables are ready to create resource group $AKS_RESOURCE_GROUP in location $AKS_LOCATION 
and cluster in it $TB_CLUSTER_NAME with database $TB_DATABASE_NAME"

Where:

  • TBMQResources — the logical group in which your Azure resources are deployed and managed. We will refer to it later in this guide as AKS_RESOURCE_GROUP;
  • eastus — the location in which the resource group is created. We will refer to it as AKS_LOCATION. Execute az account list-locations to list all available locations;
  • tbmq-gateway — the name of the Azure Application Gateway;
  • tbmq-cluster — the cluster name. We will refer to it as TB_CLUSTER_NAME;
  • tbmq-db — the name of the database server. You may input a different name. We will refer to it as TB_DATABASE_NAME.
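Before running the commands in the following sections, it can save time to verify that every variable is actually set. A minimal sanity-check sketch (the check function is ours, not part of the TBMQ scripts):

```shell
# Repeated exports from the block above so this snippet is self-contained
export AKS_RESOURCE_GROUP=TBMQResources
export AKS_LOCATION=eastus
export AKS_GATEWAY=tbmq-gateway
export TB_CLUSTER_NAME=tbmq-cluster
export TB_DATABASE_NAME=tbmq-db

# Fail fast if a variable used later in this guide is unset or empty
check() { [ -n "$2" ] || { echo "ERROR: $1 is not set" >&2; exit 1; }; }
check AKS_RESOURCE_GROUP "$AKS_RESOURCE_GROUP"
check AKS_LOCATION "$AKS_LOCATION"
check AKS_GATEWAY "$AKS_GATEWAY"
check TB_CLUSTER_NAME "$TB_CLUSTER_NAME"
check TB_DATABASE_NAME "$TB_DATABASE_NAME"
echo "All variables are set"
```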

Configure and create AKS cluster

Before creating the AKS cluster, you need to create an Azure resource group. Use the Azure CLI:

az group create --name $AKS_RESOURCE_GROUP --location $AKS_LOCATION

You can find more information about az group at this link.

Once the resource group is created, you can create the AKS cluster with the following command:

az aks create --resource-group $AKS_RESOURCE_GROUP \
    --name $TB_CLUSTER_NAME \
    --generate-ssh-keys \
    --enable-addons ingress-appgw \
    --appgw-name $AKS_GATEWAY \
    --appgw-subnet-cidr "10.225.0.0/24" \
    --node-vm-size Standard_D4s_v6 \
    --node-count 3

az aks create has two required parameters: name and resource-group (we use the variables set earlier), and a number of optional parameters (defaults are used when they are not set). Some of them are:

  • node-count - the number of nodes in the Kubernetes node pool. After creating the cluster, you can change the node pool size with az aks scale (default: 3);
  • enable-addons - a comma-separated list of Kubernetes add-ons (use az aks addon list to see the available add-ons);
  • node-osdisk-type - the OS disk type of the machines in a given agent pool: ephemeral or managed. Defaults to "Ephemeral" when the VM size and OS disk size allow it. May not be changeable after creation;
  • node-vm-size (or -s) - the size of the virtual machines created as Kubernetes nodes (default: Standard_DS2_v2);
  • generate-ssh-keys - generate SSH public and private key files if they are missing. The keys are stored in the ~/.ssh directory.

In the command above, we add the AKS add-on for Application Gateway. We will use this gateway as a path-based load balancer for TBMQ.

You can see the full list of az aks create options here.

You may also refer to this guide for custom cluster configuration.

Update kubectl context

Once the cluster is created, connect kubectl to it using the following command:

az aks get-credentials --resource-group $AKS_RESOURCE_GROUP --name $TB_CLUSTER_NAME

To verify the connection, execute the following command:

kubectl get nodes

You should see the list of cluster nodes.

Provision PostgreSQL DB

You’ll need to set up PostgreSQL on Azure. You may follow this guide, but take into account the following requirements:

  • Keep your PostgreSQL password in a safe place. We will refer to it later in this guide as YOUR_AZURE_POSTGRES_PASSWORD;
  • Make sure your Azure Database for PostgreSQL version is 17.x;
  • Make sure your Azure Database for PostgreSQL instance is accessible from the TBMQ cluster;
  • Make sure you use “thingsboard_mqtt_broker” as the initial database name.

Note: We recommend enabling "High availability". It turns on a lot of useful settings by default.

Alternatively, you can create the Azure Database for PostgreSQL using the az tool (don't forget to replace 'POSTGRESS_USER' and 'POSTGRESS_PASS' with your username and password):

az postgres flexible-server create --location $AKS_LOCATION --resource-group $AKS_RESOURCE_GROUP \
  --name $TB_DATABASE_NAME --admin-user POSTGRESS_USER --admin-password POSTGRESS_PASS \
  --public-access 0.0.0.0 --storage-size 32 \
  --version 17 -d thingsboard_mqtt_broker

az postgres flexible-server create has a lot of parameters; a few of them are:

  • location — Location. Values from: az account list-locations;
  • resource-group (or -g) — Name of the resource group;
  • name — Name of the server. The name can contain only lowercase letters, numbers, and the hyphen (-) character. Minimum 3 characters and maximum 63 characters;
  • admin-user — Administrator username for the server. Once set, it cannot be changed;
  • admin-password — The password of the administrator. Minimum 8 characters and maximum 128 characters. Password must contain characters from three of the following categories: English uppercase letters, English lowercase letters, numbers, and non-alphanumeric characters;
  • public-access — Determines the public access. Enter single or range of IP addresses to be included in the allowed list of IPs. IP address ranges must be dash-separated and not contain any spaces. Specifying 0.0.0.0 allows public access from any resources deployed within Azure to access your server. Setting it to “None” sets the server in public access mode but does not create a firewall rule;
  • storage-size — The storage capacity of the server. Minimum is 32 GiB and maximum is 16 TiB;
  • version — Server major version;
  • high-availability — Enable or disable the high-availability feature. High availability can only be set during flexible server creation (accepted values: Disabled, Enabled; default: Disabled);
  • database-name (or -d) — The name of the database to be created when provisioning the database server.

You can see the full parameters list here.

Example of response:

{
  "connectionString": "postgresql://postgres:postgres@tbmq-db.postgres.database.azure.com/postgres?sslmode=require",
  "databaseName": "thingsboard_mqtt_broker",
  "firewallName": "AllowAllAzureServicesAndResourcesWithinAzureIps_2021-11-17_15-45-6",
  "host": "tbmq-db.postgres.database.azure.com",
  "id": "/subscriptions/daff3288-1d5d-47c7-abf0-bfb7b738a18c/resourceGroups/myResourceGroup/providers/Microsoft.DBforPostgreSQL/flexibleServers/thingsboard_mqtt_broker",
  "location": "East US",
  "password": "postgres",
  "resourceGroup": "TBMQResources",
  "skuname": "Standard_D2s_v3",
  "username": "postgres",
  "version": "17"
}

Note the host value from the command output (in this example, tbmq-db.postgres.database.azure.com) as well as the username and password (postgres).
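If you saved the JSON response to a file (here a hypothetical db.json), the host value can also be pulled out in the shell, for example:

```shell
# Hypothetical: a trimmed copy of the az output saved to db.json
cat > db.json <<'EOF'
{
  "host": "tbmq-db.postgres.database.azure.com",
  "username": "postgres",
  "version": "17"
}
EOF

# Extract the "host" value (sed keeps this dependency-free; use jq if available)
DB_HOST=$(sed -n 's/.*"host": *"\([^"]*\)".*/\1/p' db.json)
echo "$DB_HOST"   # tbmq-db.postgres.database.azure.com
```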

Edit the database configuration file, replacing YOUR_AZURE_POSTGRES_ENDPOINT_URL with the host value, and YOUR_AZURE_POSTGRES_USER and YOUR_AZURE_POSTGRES_PASSWORD with the correct credentials:

nano tbmq-db-configmap.yml
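After editing, the datasource part of tbmq-db-configmap.yml might look roughly like this. Treat this as a sketch: the exact key names are an assumption and may differ in your version of the scripts, so keep the keys shipped in the file and only replace the placeholder values.

```yaml
# Sketch only - verify against the actual keys in tbmq-db-configmap.yml
SPRING_DATASOURCE_URL: "jdbc:postgresql://tbmq-db.postgres.database.azure.com:5432/thingsboard_mqtt_broker"
SPRING_DATASOURCE_USERNAME: "postgres"
SPRING_DATASOURCE_PASSWORD: "YOUR_AZURE_POSTGRES_PASSWORD"
```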

Create Namespace

Let’s create a dedicated namespace for our TBMQ cluster deployment to ensure better resource isolation and management.

kubectl apply -f tbmq-namespace.yml
kubectl config set-context $(kubectl config current-context) --namespace=thingsboard-mqtt-broker

Azure Cache for Valkey

TBMQ PE relies on Valkey to store messages for DEVICE persistent clients. The cache also reduces direct database reads and improves performance, especially when authentication is enabled and many clients connect concurrently. Without a cache, every new connection triggers a database query to validate the MQTT client credentials, which can create unnecessary load at high connection rates.


Note: Starting from TBMQ PE v2.2.0, Valkey 8.0 is officially supported. Azure currently does not provide a managed Valkey service. However, Valkey is fully compatible with Redis 7.2.x, which is supported on Azure Cache for Redis Enterprise and Enterprise Flash SKUs. The Basic, Standard, and Premium SKUs only support up to Redis 6.x, and are therefore not recommended for TBMQ deployments. To ensure compatibility with TBMQ PE v2.2.0 and later, deploy your own Valkey cluster or use an Enterprise-tier SKU.

Depending on your environment, choose one of the following paths:

Once your Azure Cache is ready, fill in the correct endpoint configuration in tbmq-cache-configmap.yml:

  • Standalone Redis: uncomment and set the values below. Make sure REDIS_HOST includes the port (:6379).

    REDIS_CONNECTION_TYPE: "standalone"
    REDIS_HOST: "YOUR_VALKEY_ENDPOINT_URL_WITHOUT_PORT"
    #REDIS_PASSWORD: "YOUR_REDIS_PASSWORD"
    
  • Valkey cluster: provide a comma-separated list of "host:port" node endpoints for bootstrapping.

    REDIS_CONNECTION_TYPE: "cluster"
    REDIS_NODES: "COMMA_SEPARATED_LIST_OF_NODES"
    #REDIS_PASSWORD: "YOUR_REDIS_PASSWORD"
    # Recommended in Kubernetes for handling dynamic IPs and failover:
    #REDIS_LETTUCE_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
    #REDIS_JEDIS_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
    

Tips for creating the Valkey cluster

The official Azure documentation assumes the Valkey cluster is created from scratch. Since you have already created the earlier resources by following this guide, adjust the steps as follows:

  • Skip infrastructure creation: the Resource Group and AKS cluster already exist, so you can skip the az group create and az aks create steps.
  • Optional services: you may skip creating Azure Key Vault (AKV) and Azure Container Registry (ACR) to simplify the deployment.
  • Node pools: creating a dedicated node pool for Valkey is optional. A dedicated pool provides better isolation, but you can also use the existing node pool.
  • Namespace: we recommend deploying the Valkey cluster into the same namespace as TBMQ (e.g., thingsboard-mqtt-broker) rather than a separate valkey namespace, for easier management.

Creating the Secret

If you are not using Azure Key Vault, you need to create a generic Kubernetes secret manually. The format must match what the Valkey container expects (specific keys and newlines).

Example: Manual Secret Creation
# 1. Generate a random password (or set your own)
VALKEY_PASSWORD=$(openssl rand -base64 32)
echo "Generated Password: $VALKEY_PASSWORD"

# 2. Create the secret directly in Kubernetes
# We format it exactly how the container expects: 'requirepass' on line 1, 'primaryauth' on line 2
kubectl create secret generic valkey-password \
  --namespace thingsboard-mqtt-broker \
  --from-literal=valkey-password-file.conf=$'requirepass '"$VALKEY_PASSWORD"$'\nprimaryauth '"$VALKEY_PASSWORD"
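Because the container reads this file positionally ('requirepass' on line 1, 'primaryauth' on line 2), it is worth checking the payload shape locally before creating the secret. A small sketch:

```shell
# Build the same two-line payload locally and verify its shape
VALKEY_PASSWORD=$(openssl rand -base64 32)
PAYLOAD=$'requirepass '"$VALKEY_PASSWORD"$'\nprimaryauth '"$VALKEY_PASSWORD"

# Expect exactly two lines, with the right directive starting each one
printf '%s\n' "$PAYLOAD" | wc -l
printf '%s\n' "$PAYLOAD" | sed -n '1p' | cut -d' ' -f1   # requirepass
printf '%s\n' "$PAYLOAD" | sed -n '2p' | cut -d' ' -f1   # primaryauth
```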

Deploying StatefulSets (Primaries and Replicas)

Proceed with creating the ConfigMap, Primary cluster pods, and Replica cluster pods. You will need to modify the Azure documentation examples to fit your environment:

  • Namespace: Ensure all resources point to your defined namespace (e.g., thingsboard-mqtt-broker).
  • Affinity: Update the affinity section. If you are using a shared node pool, remove the specific nodeSelector or nodeAffinity requirements. Instead, use podAntiAffinity to spread pods across nodes where possible.
  • Image: If skipping ACR, use the public Docker image: image: "valkey/valkey:8.0". Note: Avoid using the :latest tag for production stability; stick to a specific version.
  • Secret Volume: Update the volume configuration to use the standard Kubernetes secret created in the previous step, replacing the CSI/Key Vault driver configuration.
Example: Modified StatefulSet for Primary pods
kubectl apply -f - <<EOF
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: valkey-masters
  namespace: thingsboard-mqtt-broker
spec:
  serviceName: "valkey-masters"
  replicas: 3
  selector:
    matchLabels:
      app: valkey
  template:
    metadata:
      labels:
        app: valkey
        appCluster: valkey-masters
    spec:
      terminationGracePeriodSeconds: 20
      affinity:
        # Removed nodeAffinity (dedicated pool requirement)
        # Soft Anti-Affinity to prefer spreading pods but allow scheduling on available nodes
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - valkey
              topologyKey: kubernetes.io/hostname
      containers:
      - name: role-master-checker
        image: "valkey/valkey:8.0"
        command:
        - "/bin/bash"
        - "-c"
        args:
          [
            "while true; do role=\$(valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') role | awk '{print \$1; exit}');     if [ \"\$role\" = \"slave\" ]; then valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') cluster failover; fi; sleep 30; done"
          ]
        volumeMounts:
        - name: valkey-password
          mountPath: /etc/valkey-password
          readOnly: true
      - name: valkey
        image: "valkey/valkey:8.0"
        env:
        - name: VALKEY_PASSWORD_FILE
          value: "/etc/valkey-password/valkey-password-file.conf"
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        command:
        - "valkey-server"
        args:
        - "/conf/valkey.conf"
        - "--cluster-announce-ip"
        - "\$(MY_POD_IP)"
        resources:
          requests:
            cpu: "100m"
            memory: "100Mi"
        ports:
        - name: valkey
          containerPort: 6379
          protocol: "TCP"
        - name: cluster
          containerPort: 16379
          protocol: "TCP"
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
        - name: valkey-password
          mountPath: /etc/valkey-password
          readOnly: true
      volumes:
      - name: valkey-password
        # Replaced CSI/KeyVault with standard Kubernetes Secret
        secret:
          secretName: valkey-password
      - name: conf
        configMap:
          name: valkey-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: managed-csi
      resources:
        requests:
          storage: 20Gi
EOF

Finalizing the Setup

  1. Services & PDB: Create the headless services and the Pod Disruption Budget (PDB) as outlined in the documentation.
  2. Initialization: Run the Valkey cluster creation commands to join the nodes.
  3. Verification: Verify the roles of the pods and the replication status to ensure the cluster is healthy.

TBMQ Configuration

Once the cluster is verified, update your TBMQ configuration values:

  • REDIS_NODES: Set this to the headless service DNS, e.g., valkey-cluster:6379.
  • REDIS_PASSWORD: Use the password you generated during secret creation (or the value of $VALKEY_PASSWORD).
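Putting the two values together, the cache part of tbmq-cache-configmap.yml would then look roughly like this (a sketch; valkey-cluster is the headless service name suggested above, so adjust it to your actual service, and keep the password in sync with the secret you created):

```yaml
REDIS_CONNECTION_TYPE: "cluster"
REDIS_NODES: "valkey-cluster:6379"        # headless service DNS from this section
REDIS_PASSWORD: "YOUR_VALKEY_PASSWORD"    # the generated $VALKEY_PASSWORD
# Recommended in Kubernetes for handling dynamic IPs and failover:
REDIS_LETTUCE_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
```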

Installation

Execute the following command to run the initial setup of the database. This command will launch a short-lived TBMQ pod to provision the necessary DB tables, indexes, etc.

./k8s-install-tbmq.sh

After this command is finished, you should see the following line in the console:

INFO  o.t.m.b.i.ThingsboardMqttBrokerInstallService - Installation finished successfully!

Otherwise, please check whether you set the PostgreSQL URL and PostgreSQL password in tbmq-db-configmap.yml correctly.

Obtain your license key

Before proceeding, make sure you have chosen a subscription plan or purchased a perpetual license. If you have not done so yet, visit the pricing page to compare the available options and obtain your license key.

Note: In this guide, we will refer to your license key as YOUR_LICENSE_KEY_HERE.

Configure license key

Create a k8s secret with your license key:

export TBMQ_LICENSE_KEY=YOUR_LICENSE_KEY_HERE 
kubectl create -n thingsboard-mqtt-broker secret generic tbmq-license --from-literal=license-key=$TBMQ_LICENSE_KEY
文档信息图标

Don't forget to replace YOUR_LICENSE_KEY_HERE with your actual license key.

Provision Kafka

TBMQ requires a running Kafka cluster. You can deploy Kafka in one of two ways:

  • Deploy a self-managed Apache Kafka cluster
  • Deploy a Kafka cluster managed by the Strimzi Operator

Choose the option that fits your environment and operational needs.

Option 1. Deploy Apache Kafka cluster

  • Runs as a StatefulSet with 3 pods in KRaft dual-role mode (each node acts as both controller and broker).
  • Suitable for lightweight, self-managed Kafka deployments.

  • See the full deployment guide here.

Quick steps:

kubectl apply -f kafka/tbmq-kafka.yml

Update the TBMQ configuration files (tbmq.yml and tbmq-ie.yml) and uncomment the sections marked with:

# Uncomment the following lines to connect to Apache Kafka

Option 2. Deploy Kafka cluster with Strimzi Operator

  • Uses the Strimzi Cluster Operator for Kubernetes to manage Kafka.
  • Simplifies upgrades, scaling, and operational management.

  • See the full deployment guide here.

Quick steps:

Install the Strimzi operator:

helm install tbmq-kafka -f kafka/operator/values-strimzi-kafka-operator.yaml oci://quay.io/strimzi-helm/strimzi-kafka-operator --version 0.47.0

Deploy the Kafka cluster:

kubectl apply -f kafka/operator/kafka-cluster.yaml

Update the TBMQ configuration files (tbmq.yml and tbmq-ie.yml) and uncomment the sections marked with:

# Uncomment the following lines to connect to Strimzi

Starting

Execute the following command to deploy the broker:

./k8s-deploy-tbmq.sh

After a few minutes, you may check the state of all pods with the following command:

kubectl get pods

If everything went fine, you should see the tbmq-0 and tbmq-1 pods, each in the READY state.

Configure Load Balancers

Configure HTTP(S) Load Balancer

Configure an HTTP(S) Load Balancer to access the web interface of your TBMQ PE instance. There are two possible configurations:

  • http — Load Balancer without HTTPS support. Recommended for development. Its only advantages are simple configuration and minimal cost. It may be a good option for a development server but is definitely not suitable for production.
  • https — Load Balancer with HTTPS support. Recommended for production. Acts as an SSL termination point. You may easily configure it to issue and maintain a valid SSL certificate. It automatically redirects all non-secure (HTTP) traffic to the secure (HTTPS) port.

See links/instructions below on how to configure each of the suggested options.

HTTP Load Balancer

Execute the following command to deploy a plain HTTP load balancer:

kubectl apply -f receipts/http-load-balancer.yml

Provisioning the load balancer may take some time. You can periodically check its state with the following command:

kubectl get ingress

Once provisioned, you should see output similar to this:

NAME                     CLASS    HOSTS   ADDRESS         PORTS   AGE
tbmq-http-loadbalancer   <none>   *       34.111.24.134   80      7m25s

HTTPS Load Balancer

To use SSL certificates, we can add our certificate directly to the Azure Application Gateway using the following command:

az network application-gateway ssl-cert create \
   --resource-group $(az aks show --name $TB_CLUSTER_NAME --resource-group $AKS_RESOURCE_GROUP --query nodeResourceGroup | tr -d '"') \
   --gateway-name $AKS_GATEWAY \
   --name TBMQHTTPSCert \
   --cert-file YOUR_CERT \
   --cert-password YOUR_CERT_PASS

Execute the following command to deploy the HTTPS load balancer:

kubectl apply -f receipts/https-load-balancer.yml

Configure MQTT Load Balancer

Configure the MQTT load balancer to allow devices to connect via the MQTT protocol.

Execute the following command to create the TCP load balancer:

kubectl apply -f receipts/mqtt-load-balancer.yml

This load balancer will forward all TCP traffic on ports 1883 and 8883.

MQTT over SSL

Follow this guide to create a .pem file with your SSL certificate. Save the file as server.pem in the working directory.
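For a quick test environment (not production), a self-signed certificate pair can be generated with openssl. This is a sketch; the file names match what the config map created in the next step expects, and the CN value is an arbitrary placeholder:

```shell
# Self-signed certificate for testing only - use a CA-issued cert in production.
# Produces server.pem (certificate) and mqttserver_key.pem (private key).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout mqttserver_key.pem -out server.pem \
  -days 365 -subj "/CN=tbmq-test"

# Inspect the result
openssl x509 -in server.pem -noout -subject
```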

You need to create a config map with the PEM files. You can do this with the following command:

kubectl create configmap tbmq-mqtts-config \
 --from-file=server.pem=YOUR_PEM_FILENAME \
 --from-file=mqttserver_key.pem=YOUR_PEM_KEY_FILENAME \
 -o yaml --dry-run=client | kubectl apply -f -
  • YOUR_PEM_FILENAME is the name of your server certificate file.
  • YOUR_PEM_KEY_FILENAME is the name of your server certificate private key file.

Then, uncomment all sections in the tbmq.yml file marked with "Uncomment the following lines to enable two-way MQTTS".

Execute the following command to apply the changes:

kubectl apply -f tbmq.yml

Validate the setup

Now you can open the TBMQ web interface in your browser using the DNS name of the load balancer.

You can get the DNS name of the load balancer with the following command:

kubectl get ingress

The output should look similar to this:

NAME                     CLASS    HOSTS   ADDRESS         PORTS   AGE
tbmq-http-loadbalancer   <none>   *       34.111.24.134   80      3d1h

Use the ADDRESS field of tbmq-http-loadbalancer to connect to the cluster.

You should see the TBMQ login page. Use the following default System Administrator credentials:

Username

sysadmin@thingsboard.org

Password

sysadmin

On the first login, you will be asked to change the default password to a custom one, and then log in again with the new credentials.

Validate MQTT access

To connect to the cluster via MQTT, you need to get the IP of the corresponding service. Execute the following command:

kubectl get services

The output should look similar to this:

NAME                     TYPE           CLUSTER-IP       EXTERNAL-IP              PORT(S)                         AGE
tbmq-mqtt-loadbalancer   LoadBalancer   10.100.119.170   *******                  1883:30308/TCP,8883:31609/TCP   6m58s

Use the EXTERNAL-IP field of the load balancer to connect to the cluster via the MQTT protocol.

Troubleshooting

If you encounter any issues, you can inspect the service logs for errors. For example, to view the TBMQ logs, execute:

kubectl logs -f tbmq-0

Use the following command to see the state of all StatefulSets:

kubectl get statefulsets

See the kubectl cheat sheet for more commands.

Upgrading

See the release notes and upgrade instructions for details about the latest changes.

If your upgrade scenario is not covered by the documentation, please contact us for further guidance.

Backup and restore (Optional)

While backing up your PostgreSQL database is highly recommended, it is optional before proceeding with the upgrade. For further guidance, follow these instructions.

Upgrading from TBMQ CE to TBMQ PE (v2.2.0)

To upgrade an existing TBMQ Community Edition (CE) installation to TBMQ Professional Edition (PE), make sure you are running the latest TBMQ CE 2.2.0 before you start. Merge your current configuration with the latest TBMQ PE K8S scripts. Don't forget to configure the license key.

Run the following commands (including the upgrade script) to migrate the PostgreSQL database data from CE to PE:

./k8s-delete-tbmq.sh
./k8s-upgrade-tbmq.sh --fromVersion=ce
./k8s-deploy-tbmq.sh

Cluster deletion

Execute the following command to delete TBMQ nodes:

./k8s-delete-tbmq.sh

Execute the following command to delete all TBMQ nodes and configmaps, load balancers, etc.:

./k8s-delete-all.sh

Execute the following command to delete the AKS cluster:

az aks delete --resource-group $AKS_RESOURCE_GROUP --name $TB_CLUSTER_NAME

Next steps