- Prerequisites
- Clone the TBMQ PE K8S repository
- Define environment variables
- Configure and create the AKS cluster
- Update kubectl context
- Provision PostgreSQL DB
- Create Namespace
- Azure Cache for Valkey
- Installation
- Obtain the license key
- Configure the license key
- Provision Kafka
- Starting
- Configure Load Balancers
- Validate the setup
- Upgrading
- Cluster deletion
- Next steps
This guide will help you deploy TBMQ PE on AKS.
Prerequisites
Install and configure the required tools
Before deploying TBMQ on an AKS cluster, you need to install the kubectl, helm, and az tools.
Once installed, log in to the CLI with the following command:
az login
Clone the TBMQ PE K8S repository
git clone -b release-2.2.0 https://github.com/thingsboard/tbmq-pe-k8s.git
cd tbmq-pe-k8s/azure
Define environment variables
Define the environment variables that will be used in the commands throughout this guide.
Assuming you are using Linux, execute the following commands:
export AKS_RESOURCE_GROUP=TBMQResources
export AKS_LOCATION=eastus
export AKS_GATEWAY=tbmq-gateway
export TB_CLUSTER_NAME=tbmq-cluster
export TB_DATABASE_NAME=tbmq-db
echo "Your variables are ready to create resource group $AKS_RESOURCE_GROUP in location $AKS_LOCATION
and cluster $TB_CLUSTER_NAME in it with database $TB_DATABASE_NAME"
Where:
- TBMQResources — a logical group in which Azure resources are deployed and managed. Referred to as AKS_RESOURCE_GROUP later in this guide;
- eastus — the location where the resource group is created. Referred to as AKS_LOCATION later in this guide. Run az account list-locations to see all available locations;
- tbmq-gateway — the name of the Azure Application Gateway;
- tbmq-cluster — the cluster name. Referred to as TB_CLUSTER_NAME later in this guide;
- tbmq-db — the database server name. You may input a different name. Referred to as TB_DATABASE_NAME later in this guide.
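Before moving on, it can help to confirm that every variable is actually set in the current shell. A minimal sketch (plain POSIX shell, using the variable names from above):

```shell
# Set the variables as in the guide
export AKS_RESOURCE_GROUP=TBMQResources
export AKS_LOCATION=eastus
export AKS_GATEWAY=tbmq-gateway
export TB_CLUSTER_NAME=tbmq-cluster
export TB_DATABASE_NAME=tbmq-db

# Fail fast if any of them is empty or unset
for v in AKS_RESOURCE_GROUP AKS_LOCATION AKS_GATEWAY TB_CLUSTER_NAME TB_DATABASE_NAME; do
  eval "val=\${$v}"
  if [ -z "$val" ]; then
    echo "Missing: $v"
    exit 1
  fi
done
echo "All variables set"
```

Running this in a fresh terminal (where the exports were forgotten) catches the problem before any az command fails with a confusing error.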
Configure and create the AKS cluster
Before creating the AKS cluster, you need to create an Azure resource group. Use the Azure CLI:
az group create --name $AKS_RESOURCE_GROUP --location $AKS_LOCATION
You can find more information about az group at this link.
Once the resource group is created, you can create the AKS cluster with the following command:
az aks create --resource-group $AKS_RESOURCE_GROUP \
--name $TB_CLUSTER_NAME \
--generate-ssh-keys \
--enable-addons ingress-appgw \
--appgw-name $AKS_GATEWAY \
--appgw-subnet-cidr "10.225.0.0/24" \
--node-vm-size Standard_D4s_v6 \
--node-count 3
az aks create has two mandatory parameters: name and resource-group (we use the variables set earlier), plus a number of optional parameters (defaults are used when unset). Some of them are:
- node-count — the number of nodes in the Kubernetes node pool. After creating a cluster, you can resize the node pool with az aks scale (default: 3);
- enable-addons — a comma-separated list of Kubernetes add-ons to enable (run az aks addon list to see the available add-ons);
- node-osdisk-type — the OS disk type for machines in a given agent pool: Ephemeral or Managed. Defaults to "Ephemeral" when the VM size and OS disk size allow it, and may not be changeable after creation;
- node-vm-size (or -s) — the size of the virtual machines created as Kubernetes nodes (default: Standard_DS2_v2);
- generate-ssh-keys — generate SSH public and private key files if they are missing. The keys will be stored in the ~/.ssh directory.
The command above also enables the ApplicationGateway AKS add-on. This gateway will be used as a path-based load balancer for TBMQ.
See the full list of az aks create options here.
You may also use this guide for a custom cluster setup.
Update kubectl context
Once the cluster is created, connect kubectl to it with the following command:
az aks get-credentials --resource-group $AKS_RESOURCE_GROUP --name $TB_CLUSTER_NAME
To verify the connection, execute the following command:
kubectl get nodes
You should see the list of cluster nodes.
Provision PostgreSQL DB
You’ll need to set up PostgreSQL on Azure. You may follow this guide, but take into account the following requirements:
- Keep your PostgreSQL password in a safe place. We will refer to it later in this guide as YOUR_AZURE_POSTGRES_PASSWORD;
- Make sure your Azure Database for PostgreSQL version is 17.x;
- Make sure your Azure Database for PostgreSQL instance is accessible from the TBMQ cluster;
- Make sure you use “thingsboard_mqtt_broker” as the initial database name.
Note: enable "High availability". It turns on a number of useful settings by default.
Alternatively, you can create the Azure Database for PostgreSQL with the az tool (don't forget to replace 'POSTGRESS_USER' and 'POSTGRESS_PASS' with your username and password):
az postgres flexible-server create --location $AKS_LOCATION --resource-group $AKS_RESOURCE_GROUP \
--name $TB_DATABASE_NAME --admin-user POSTGRESS_USER --admin-password POSTGRESS_PASS \
--public-access 0.0.0.0 --storage-size 32 \
--version 17 -d thingsboard_mqtt_broker
az postgres flexible-server create has a lot of parameters; a few of them are:
- location — Location. Values from: az account list-locations;
- resource-group (or -g) — Name of the resource group;
- name — Name of the server. The name can contain only lowercase letters, numbers, and the hyphen (-) character. Minimum 3 characters and maximum 63 characters;
- admin-user — Administrator username for the server. Once set, it cannot be changed;
- admin-password — The password of the administrator. Minimum 8 characters and maximum 128 characters. Password must contain characters from three of the following categories: English uppercase letters, English lowercase letters, numbers, and non-alphanumeric characters;
- public-access — Determines the public access. Enter single or range of IP addresses to be included in the allowed list of IPs. IP address ranges must be dash-separated and not contain any spaces. Specifying 0.0.0.0 allows public access from any resources deployed within Azure to access your server. Setting it to “None” sets the server in public access mode but does not create a firewall rule;
- storage-size — The storage capacity of the server. Minimum is 32 GiB and maximum is 16 TiB;
- version — Server major version;
- high-availability — enable or disable high-availability feature. High availability can only be set during flexible server creation (accepted values: Disabled, Enabled. Default value: Disabled);
- database-name (or -d) — The name of the database to be created when provisioning the database server.
You can see the full parameters list here.
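Azure derives the server's fully qualified host name from the server name, which you will need for the TBMQ database configuration later. A small sketch of that derivation (assuming the TB_DATABASE_NAME=tbmq-db value from this guide):

```shell
# Hypothetical helper: derive the host name TBMQ will connect to
# from the server name used in this guide (TB_DATABASE_NAME=tbmq-db).
TB_DATABASE_NAME=tbmq-db
POSTGRES_HOST="${TB_DATABASE_NAME}.postgres.database.azure.com"
echo "$POSTGRES_HOST"
```

This matches the "host" field in the example response below; the actual value always comes from the az command output.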
Example of response:
{
  "connectionString": "postgresql://postgres:postgres@tbmq-db.postgres.database.azure.com/postgres?sslmode=require",
  "databaseName": "thingsboard_mqtt_broker",
  "firewallName": "AllowAllAzureServicesAndResourcesWithinAzureIps_2021-11-17_15-45-6",
  "host": "tbmq-db.postgres.database.azure.com",
  "id": "/subscriptions/daff3288-1d5d-47c7-abf0-bfb7b738a18c/resourceGroups/myResourceGroup/providers/Microsoft.DBforPostgreSQL/flexibleServers/thingsboard_mqtt_broker",
  "location": "East US",
  "password": "postgres",
  "resourceGroup": "TBMQResources",
  "skuname": "Standard_D2s_v3",
  "username": "postgres",
  "version": "17"
}
Note the host value from the command output (tbmq-db.postgres.database.azure.com in this example), as well as the username and password (postgres).
Edit the database configuration file, replacing YOUR_AZURE_POSTGRES_ENDPOINT_URL with the host value, and YOUR_AZURE_POSTGRES_USER and YOUR_AZURE_POSTGRES_PASSWORD with the correct credentials:
nano tbmq-db-configmap.yml
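For reference, the edited file might look like the following sketch. The key names (SPRING_DATASOURCE_*) and the ConfigMap metadata are assumptions based on the placeholders above — check the actual tbmq-db-configmap.yml in the repository for the exact keys:

```yaml
# Sketch of tbmq-db-configmap.yml after editing (key names and metadata assumed)
apiVersion: v1
kind: ConfigMap
metadata:
  name: tbmq-db-config
  namespace: thingsboard-mqtt-broker
data:
  SPRING_DATASOURCE_URL: "jdbc:postgresql://tbmq-db.postgres.database.azure.com:5432/thingsboard_mqtt_broker"
  SPRING_DATASOURCE_USERNAME: "postgres"
  SPRING_DATASOURCE_PASSWORD: "YOUR_AZURE_POSTGRES_PASSWORD"
```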
Create Namespace
Let’s create a dedicated namespace for our TBMQ cluster deployment to ensure better resource isolation and management.
kubectl apply -f tbmq-namespace.yml
kubectl config set-context $(kubectl config current-context) --namespace=thingsboard-mqtt-broker
Azure Cache for Valkey
TBMQ PE relies on Valkey to store messages for DEVICE persistent clients. The cache also reduces direct database reads and improves performance, especially when authentication is enabled and many clients connect concurrently. Without a cache, every new connection triggers a database query to validate the MQTT client credentials, which can create unnecessary load at high connection rates.
Choose one of the following paths depending on your environment:
Once your Azure Cache is ready, fill in the correct endpoint configuration in tbmq-cache-configmap.yml:
- Standalone Redis: uncomment and set the following values. Make sure REDIS_HOST does not include the port (:6379).

REDIS_CONNECTION_TYPE: "standalone"
REDIS_HOST: "YOUR_VALKEY_ENDPOINT_URL_WITHOUT_PORT"
#REDIS_PASSWORD: "YOUR_REDIS_PASSWORD"

- Valkey cluster: provide a comma-separated list of "host:port" node endpoints for bootstrapping.

REDIS_CONNECTION_TYPE: "cluster"
REDIS_NODES: "COMMA_SEPARATED_LIST_OF_NODES"
#REDIS_PASSWORD: "YOUR_REDIS_PASSWORD"
# Recommended in Kubernetes for handling dynamic IPs and failover:
#REDIS_LETTUCE_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
#REDIS_JEDIS_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
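A quick way to sanity-check the standalone REDIS_HOST value is to strip any accidental port suffix before pasting it into the file. A minimal sketch (the endpoint name here is a made-up example):

```shell
# Hypothetical endpoint that mistakenly includes the port
REDIS_HOST="my-valkey-cache.eastus.example.net:6379"
# Drop everything from the first ':' onward, leaving only the host name
REDIS_HOST="${REDIS_HOST%%:*}"
echo "$REDIS_HOST"
```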
Tips for Valkey cluster creation
The official Azure documentation assumes you are creating the Valkey cluster from scratch. Since you have already created the earlier resources by following this guide, adjust the steps as follows:
- Skip infrastructure creation: the Resource Group and AKS cluster already exist, so skip the az group create and az aks create steps.
- Optional services: you may skip creating Azure Key Vault (AKV) and Azure Container Registry (ACR) to simplify the deployment.
- Node pools: creating a dedicated node pool for Valkey is optional. A dedicated pool provides better isolation, but the existing node pool works as well.
- Namespace: consider deploying the Valkey cluster into the same namespace as TBMQ (e.g. thingsboard-mqtt-broker) instead of a separate valkey namespace, for easier management.
Creating the secret: if you are not using Azure Key Vault, you need to create a generic Kubernetes secret manually. Its format must match what the Valkey containers expect (specific keys and newlines).
Example: Manual Secret Creation
# 1. Generate a random password (or set your own)
VALKEY_PASSWORD=$(openssl rand -base64 32)
echo "Generated Password: $VALKEY_PASSWORD"
# 2. Create the secret directly in Kubernetes
# We format it exactly how the container expects: 'requirepass' on line 1, 'primaryauth' on line 2
kubectl create secret generic valkey-password \
--namespace thingsboard-mqtt-broker \
--from-literal=valkey-password-file.conf=$'requirepass '"$VALKEY_PASSWORD"$'\nprimaryauth '"$VALKEY_PASSWORD"
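The exact two-line layout matters because the role-master-checker container in the StatefulSet reads the password back with awk. You can reproduce that extraction locally (a sketch with a made-up password):

```shell
# Build the conf file exactly as the secret stores it (hypothetical password)
VALKEY_PASSWORD="example-pass"
printf 'requirepass %s\nprimaryauth %s\n' "$VALKEY_PASSWORD" "$VALKEY_PASSWORD" > /tmp/valkey-password-file.conf

# The sidecar extracts the second field of the first line
EXTRACTED=$(awk '{print $2; exit}' /tmp/valkey-password-file.conf)
echo "$EXTRACTED"
```

If the keys or the newline are missing, the awk expression returns the wrong field and the sidecar authenticates with a garbage password.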
Deploying StatefulSets (Primaries and Replicas)
Proceed with creating the ConfigMap, Primary cluster pods, and Replica cluster pods. You will need to modify the Azure documentation examples to fit your environment:
- Namespace: ensure all resources point to your defined namespace (e.g., thingsboard-mqtt-broker).
- Affinity: update the affinity section. If you are using a shared node pool, remove the specific nodeSelector or nodeAffinity requirements. Instead, use podAntiAffinity to spread pods across nodes where possible.
- Image: if skipping ACR, use the public Docker image: image: "valkey/valkey:8.0". Note: avoid using the :latest tag for production stability; stick to a specific version.
- Secret Volume: update the volume configuration to use the standard Kubernetes secret created in the previous step, replacing the CSI/Key Vault driver configuration.
Example: Modified StatefulSet for Primary pods
kubectl apply -f - <<EOF
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: valkey-masters
  namespace: thingsboard-mqtt-broker
spec:
  serviceName: "valkey-masters"
  replicas: 3
  selector:
    matchLabels:
      app: valkey
  template:
    metadata:
      labels:
        app: valkey
        appCluster: valkey-masters
    spec:
      terminationGracePeriodSeconds: 20
      affinity:
        # Removed nodeAffinity (dedicated pool requirement)
        # Soft Anti-Affinity to prefer spreading pods but allow scheduling on available nodes
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - valkey
              topologyKey: kubernetes.io/hostname
      containers:
      - name: role-master-checker
        image: "valkey/valkey:8.0"
        command:
        - "/bin/bash"
        - "-c"
        args:
          [
            "while true; do role=\$(valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') role | awk '{print \$1; exit}'); if [ \"\$role\" = \"slave\" ]; then valkey-cli --pass \$(cat /etc/valkey-password/valkey-password-file.conf | awk '{print \$2; exit}') cluster failover; fi; sleep 30; done"
          ]
        volumeMounts:
        - name: valkey-password
          mountPath: /etc/valkey-password
          readOnly: true
      - name: valkey
        image: "valkey/valkey:8.0"
        env:
        - name: VALKEY_PASSWORD_FILE
          value: "/etc/valkey-password/valkey-password-file.conf"
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        command:
        - "valkey-server"
        args:
        - "/conf/valkey.conf"
        - "--cluster-announce-ip"
        - "\$(MY_POD_IP)"
        resources:
          requests:
            cpu: "100m"
            memory: "100Mi"
        ports:
        - name: valkey
          containerPort: 6379
          protocol: "TCP"
        - name: cluster
          containerPort: 16379
          protocol: "TCP"
        volumeMounts:
        - name: conf
          mountPath: /conf
          readOnly: false
        - name: data
          mountPath: /data
          readOnly: false
        - name: valkey-password
          mountPath: /etc/valkey-password
          readOnly: true
      volumes:
      - name: valkey-password
        # Replaced CSI/KeyVault with standard Kubernetes Secret
        secret:
          secretName: valkey-password
      - name: conf
        configMap:
          name: valkey-cluster
          defaultMode: 0755
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: managed-csi
      resources:
        requests:
          storage: 20Gi
EOF
Finalizing the Setup
- Services & PDB: Create the headless services and the Pod Disruption Budget (PDB) as outlined in the documentation.
- Initialization: Run the Valkey cluster creation commands to join the nodes.
- Verification: Verify the roles of the pods and the replication status to ensure the cluster is healthy.
TBMQ Configuration
Once the cluster is verified, update your TBMQ configuration values:
- REDIS_NODES: set this to the headless service DNS, e.g., valkey-cluster:6379.
- REDIS_PASSWORD: use the password you generated during secret creation (or the value of $VALKEY_PASSWORD).
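Put together, the cache ConfigMap entries for this self-managed cluster might look like the following sketch (the service DNS name assumes the headless service is called valkey-cluster; substitute your own password):

```yaml
# Sketch of the relevant tbmq-cache-configmap.yml entries (values assumed)
REDIS_CONNECTION_TYPE: "cluster"
REDIS_NODES: "valkey-cluster:6379"
REDIS_PASSWORD: "YOUR_VALKEY_PASSWORD"
# Topology refresh helps the client follow pod IP changes and failovers
REDIS_LETTUCE_CLUSTER_TOPOLOGY_REFRESH_ENABLED: "true"
```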
Installation
Execute the following command to run the initial setup of the database. This command will launch a short-lived TBMQ pod to provision the necessary DB tables, indexes, etc.
./k8s-install-tbmq.sh
After this command finishes, you should see the following line in the console:
INFO o.t.m.b.i.ThingsboardMqttBrokerInstallService - Installation finished successfully!
Obtain the license key
Before proceeding, make sure you have chosen a subscription plan or purchased a perpetual license. If you have not done so yet, visit the pricing page to compare the available options and obtain your license key.
Note: throughout this guide we will refer to your license key as YOUR_LICENSE_KEY_HERE.
Configure the license key
Create a k8s secret with your license key:
export TBMQ_LICENSE_KEY=YOUR_LICENSE_KEY_HERE
kubectl create -n thingsboard-mqtt-broker secret generic tbmq-license --from-literal=license-key=$TBMQ_LICENSE_KEY
Provision Kafka
TBMQ requires a running Kafka cluster. You can deploy Kafka in one of two ways:
- Deploy a self-managed Apache Kafka cluster
- Deploy a managed Kafka cluster using the Strimzi Operator
Choose the option that fits your environment and operational needs.
Option 1. Deploy an Apache Kafka cluster
- Runs as a StatefulSet with 3 pods in KRaft combined mode (each node acts as both controller and broker).
- Suitable for lightweight, self-managed Kafka deployments.
- See the full deployment guide here.
Quick steps:
kubectl apply -f kafka/tbmq-kafka.yml
Update the TBMQ configuration files (tbmq.yml and tbmq-ie.yml) and uncomment the sections marked with:
# Uncomment the following lines to connect to Apache Kafka
Option 2. Deploy a Kafka cluster using the Strimzi Operator
- Uses the Strimzi Cluster Operator for Kubernetes to manage Kafka.
- Simplifies upgrades, scaling, and operational management.
- See the full deployment guide here.
Quick steps:
Install the Strimzi operator:
helm install tbmq-kafka -f kafka/operator/values-strimzi-kafka-operator.yaml oci://quay.io/strimzi-helm/strimzi-kafka-operator --version 0.47.0
Deploy the Kafka cluster:
kubectl apply -f kafka/operator/kafka-cluster.yaml
Update the TBMQ configuration files (tbmq.yml and tbmq-ie.yml) and uncomment the sections marked with:
# Uncomment the following lines to connect to Strimzi
Starting
Execute the following command to deploy the broker:
./k8s-deploy-tbmq.sh
After a few minutes, you can check the state of all pods with the following command:
kubectl get pods
If everything went fine, you should see the tbmq-0 and tbmq-1 pods, each in the READY state.
Configure Load Balancers
Configure HTTP(S) Load Balancer
Configure the HTTP(S) Load Balancer to access the web interface of your TBMQ PE instance. Basically, you have two possible configuration options:
- http — Load Balancer without HTTPS support. Recommended for development. The only advantages are simple configuration and minimal costs. May be a good option for a development server but definitely not suitable for production.
- https — Load Balancer with HTTPS support. Recommended for production. Acts as an SSL termination point. You may easily configure it to issue and maintain a valid SSL certificate. Automatically redirects all non-secure (HTTP) traffic to secure (HTTPS) port.
See links/instructions below on how to configure each of the suggested options.
HTTP Load Balancer
Execute the following command to deploy a plain HTTP load balancer:
kubectl apply -f receipts/http-load-balancer.yml
Configuring the load balancer may take some time. You can periodically check its status with the following command:
kubectl get ingress
Once configured, you should see output similar to this:
NAME CLASS HOSTS ADDRESS PORTS AGE
tbmq-http-loadbalancer <none> * 34.111.24.134 80 7m25s
HTTPS Load Balancer
To use SSL certificates, you can add your certificate directly to the Azure Application Gateway using the following command:
az network application-gateway ssl-cert create \
--resource-group $(az aks show --name $TB_CLUSTER_NAME --resource-group $AKS_RESOURCE_GROUP --query nodeResourceGroup | tr -d '"') \
--gateway-name $AKS_GATEWAY \
--name TBMQHTTPSCert \
--cert-file YOUR_CERT \
--cert-password YOUR_CERT_PASS
Execute the following command to deploy the HTTPS load balancer:
kubectl apply -f receipts/https-load-balancer.yml
Configure MQTT Load Balancer
Configure the MQTT load balancer to allow devices to connect over the MQTT protocol.
Execute the following command to create a TCP load balancer:
kubectl apply -f receipts/mqtt-load-balancer.yml
The load balancer will forward all TCP traffic on ports 1883 and 8883.
MQTT over SSL
Follow this guide to create a .pem file containing your SSL certificate. Save the file as server.pem in the working directory.
You need to create a config-map with the PEM files. You can do it by executing the following command:
kubectl create configmap tbmq-mqtts-config \
--from-file=server.pem=YOUR_PEM_FILENAME \
--from-file=mqttserver_key.pem=YOUR_PEM_KEY_FILENAME \
-o yaml --dry-run=client | kubectl apply -f -
- YOUR_PEM_FILENAME is the name of your server certificate file.
- YOUR_PEM_KEY_FILENAME is the name of your server certificate private key file.
Then, uncomment all sections in the tbmq.yml file marked with "Uncomment the following lines to enable two-way MQTTS".
Execute the following command to apply the changes:
kubectl apply -f tbmq.yml
Validate the setup
Now you can open the TBMQ web interface in your browser using the DNS name of the load balancer.
You can get the DNS name of the load balancer with the following command:
kubectl get ingress
The output should look similar to this:
NAME CLASS HOSTS ADDRESS PORTS AGE
tbmq-http-loadbalancer <none> * 34.111.24.134 80 3d1h
Use the ADDRESS field of tbmq-http-loadbalancer to connect to the cluster.
You should see the TBMQ login page. Use the following default System Administrator credentials:
Username:
sysadmin@thingsboard.org
Password:
sysadmin
On the first login, you will be asked to change the default password to a custom one, and then log in again with the new credentials.
Validate MQTT access
To connect to the cluster over MQTT, you need the IP of the corresponding service. You can get it by executing the following command:
kubectl get services
The output should look similar to this:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
tbmq-mqtt-loadbalancer LoadBalancer 10.100.119.170 ******* 1883:30308/TCP,8883:31609/TCP 6m58s
Use the EXTERNAL-IP field of the load balancer to connect to the cluster over the MQTT protocol.
Troubleshooting
If you run into issues, you can inspect the service logs for errors. For example, to view the TBMQ logs, execute the following command:
kubectl logs -f tbmq-0
Use the following command to see the state of all StatefulSets:
kubectl get statefulsets
See the kubectl cheat sheet for more commands.
Upgrading
Review the release notes and the upgrade instructions for details about the latest changes.
If the instructions do not cover your upgrade scenario, please contact us for further guidance.
Backup and restore (Optional)
While backing up your PostgreSQL database is highly recommended, it is optional before proceeding with the upgrade. For further guidance, follow these instructions.
Upgrading from TBMQ CE to TBMQ PE (v2.2.0)
To upgrade an existing TBMQ Community Edition (CE) installation to TBMQ Professional Edition (PE), make sure you are running the latest TBMQ CE 2.2.0 version before you start. Merge your current configuration with the latest TBMQ PE K8S scripts. Do not forget to configure the license key.
Run the following commands (which include the upgrade script) to migrate the PostgreSQL database data from CE to PE:
./k8s-delete-tbmq.sh
./k8s-upgrade-tbmq.sh --fromVersion=ce
./k8s-deploy-tbmq.sh
Cluster deletion
Execute the following command to delete TBMQ nodes:
./k8s-delete-tbmq.sh
Execute the following command to delete all TBMQ nodes and configmaps, load balancers, etc.:
./k8s-delete-all.sh
Execute the following command to delete the AKS cluster:
az aks delete --resource-group $AKS_RESOURCE_GROUP --name $TB_CLUSTER_NAME
Next steps
- Getting started guide — provides a quick overview of TBMQ.
- Security guide — learn how to enable authentication and authorization for MQTT clients.
- Configuration guide — learn about TBMQ configuration files and parameters.
- MQTT client types guide — learn about TBMQ client types.
- Integration with ThingsBoard — learn how to integrate TBMQ with ThingsBoard.