立即试用 商务报价
Edge PE
文档 > 故障排查

本页目录

Troubleshooting

Troubleshooting instruments and tips

Message Pack Processing Log

You can enable logging of the slowest and most frequently called rule-nodes. To do this you need to update your logging configuration with the following logger:

1
<logger name="org.thingsboard.server.service.queue.TbMsgPackProcessingContext" level="DEBUG" />

After this you can find the following messages in your logs:

1
2
3
4
5
6
7
8
9
2021-03-24 17:01:21,023 [tb-rule-engine-consumer-24-thread-3] DEBUG o.t.s.s.q.TbMsgPackProcessingContext - Top Rule Nodes by max execution time:
2021-03-24 17:01:21,024 [tb-rule-engine-consumer-24-thread-3] DEBUG o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f740670-8cc0-11eb-bcd9-d343878c0c7f] max execution time: 1102. [RuleChain: Thermostat|RuleNode: Device Profile Node(3f740670-8cc0-11eb-bcd9-d343878c0c7f)]
2021-03-24 17:01:21,024 [tb-rule-engine-consumer-24-thread-3] DEBUG o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f6debf0-8cc0-11eb-bcd9-d343878c0c7f] max execution time: 1. [RuleChain: Thermostat|RuleNode: Message Type Switch(3f6debf0-8cc0-11eb-bcd9-d343878c0c7f)]
2021-03-24 17:01:21,024 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - Top Rule Nodes by avg execution time:
2021-03-24 17:01:21,024 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f740670-8cc0-11eb-bcd9-d343878c0c7f] avg execution time: 604.0. [RuleChain: Thermostat|RuleNode: Device Profile Node(3f740670-8cc0-11eb-bcd9-d343878c0c7f)]
2021-03-24 17:01:21,025 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f6debf0-8cc0-11eb-bcd9-d343878c0c7f] avg execution time: 1.0. [RuleChain: Thermostat|RuleNode: Message Type Switch(3f6debf0-8cc0-11eb-bcd9-d343878c0c7f)]
2021-03-24 17:01:21,025 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - Top Rule Nodes by execution count:
2021-03-24 17:01:21,025 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f740670-8cc0-11eb-bcd9-d343878c0c7f] execution count: 2. [RuleChain: Thermostat|RuleNode: Device Profile Node(3f740670-8cc0-11eb-bcd9-d343878c0c7f)]
2021-03-24 17:01:21,028 [tb-rule-engine-consumer-24-thread-3] INFO  o.t.s.s.q.TbMsgPackProcessingContext - [Main][3f6debf0-8cc0-11eb-bcd9-d343878c0c7f] execution count: 1. [RuleChain: Thermostat|RuleNode: Message Type Switch(3f6debf0-8cc0-11eb-bcd9-d343878c0c7f)]

Logs

Read logs

Regardless of the deployment type, ThingsBoard Edge logs stored in the following directory:

1
/var/log/tb-edge

Different deployment tools provide different ways to view logs:

View last logs in runtime:

1
tail -f /var/log/tb-edge/tb-edge.log

You can use grep command to show only the output with desired string in it. For example, you can use the following command in order to check if there are any errors on the service side:

1
cat /var/log/tb-edge/tb-edge.log | grep ERROR

View last logs in runtime:

1
docker compose logs -f tb-edge

If you still rely on Docker Compose as docker-compose (with a hyphen) execute next command:

docker-compose logs -f tb-edge

You can use grep command to show only the output with desired string in it. For example, you can use the following command in order to check if there are any errors on the backend side:

1
docker compose logs tb-edge | grep ERROR

If you still rely on Docker Compose as docker-compose (with a hyphen) execute next command:

docker-compose logs tb-edge | grep ERROR

Tip: you can redirect logs to file and then analyze with any text editor:

1
docker compose logs -f tb-edge > tb-edge.log

If you still rely on Docker Compose as docker-compose (with a hyphen) execute next command:

docker-compose logs -f tb-edge > tb-edge.log

Note: you can always log into the ThingsBoard Edge container and view logs there:

1
2
docker ps
docker exec -it NAME_OF_THE_CONTAINER bash

Enable certain logs

ThingsBoard provides the ability to enable/disable logging for certain parts of the system depending on what information do you need for troubleshooting.

You can do this by modifying logback.xml file. As logs itself, it is stored in the following directory:

1
/usr/share/tb-edge/conf

Here’s an example of the logback.xml configuration:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
<!DOCTYPE configuration>
<configuration scan="true" scanPeriod="10 seconds">

    <appender name="fileLogAppender"
              class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>/var/log/tb-edge/tb-edge.log</file>
        <rollingPolicy
                class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
            <fileNamePattern>/var/log/tb-edge/tb-edge.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <maxFileSize>100MB</maxFileSize>
            <maxHistory>30</maxHistory>
            <totalSizeCap>3GB</totalSizeCap>
        </rollingPolicy>
        <encoder>
            <pattern>%d{ISO8601} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <logger name="org.thingsboard.server" level="INFO" />
    <logger name="org.thingsboard.js.api" level="TRACE" />
    <logger name="com.microsoft.azure.servicebus.primitives.CoreMessageReceiver" level="OFF" />

    <root level="INFO">
        <appender-ref ref="fileLogAppender"/>
    </root>
</configuration>

The most useful for the troubleshooting parts of the config files are loggers. They allow you to enable/disable logging for the certain class or group of classes. In the example above the default logging level is INFO (it means that logs will contain only general information, warnings and errors), but for the package org.thingsboard.js.api we enabled the most detailed level of logging. There’s also a possibility to completely disable logs for some part of the system, in the example above we did it to com.microsoft.azure.servicebus.primitives.CoreMessageReceiver class using OFF log-level.

To enable/disable logging for some part of the system you need to add proper </logger> configuration and wait up to 10 seconds.

Different deployment tools provide different ways to update logs:

For standalone deployment you need to update /usr/share/tb-edge/conf/logback.xml in order to change logging configuration.

For docker-compose deployment we are mapping /config folder to your local system (./tb-edge/conf folder). So in order to change logging configuration you need to update ./tb-edge/conf/logback.xml file.

Metrics

You may enable prometheus metrics by setting environment variables METRICS_ENABLED to value true and METRICS_ENDPOINTS_EXPOSE to value prometheus in the configuration file.

These metrics exposed at the path: https://<yourhostname>/actuator/prometheus which can be scraped by prometheus (No authentication required).

Prometheus metrics

Some internal state metrics can be exposed by the Spring Actuator using Prometheus.

Here’s the list of metrics ThingsBoard pushes to Prometheus.

tb-edge metrics:

  • attributes_queue_${index_of_queue} (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about writing attributes to the database. Note that there are several queues (threads) for persisting attributes in order to reach maximum performance.
  • ruleEngine_${name_of_queue} (statsNames - totalMsgs, failedMsgs, successfulMsgs, tmpFailed, failedIterations, successfulIterations, timeoutMsgs, tmpTimeout): stats about processing of the messages inside of the Rule Engine. They are persisted for each queue (e.g. Main, HighPriority, SequentialByOriginator etc). Some stats descriptions:
    • tmpFailed: number of messages that failed and got reprocessed later
    • tmpTimeout: number of messages that timed out and got reprocessed later
    • timeoutMsgs: number of messages that timed out and were discarded afterwards
    • failedIterations: iterations of processing messages pack where at least one message wasn’t processed successfully
  • ruleEngine_${name_of_queue}_seconds (for each present tenantId): stats about the time message processing took for different queues.
  • core (statsNames - totalMsgs, toDevRpc, coreNfs, sessionEvents, subInfo, subToAttr, subToRpc, deviceState, getAttr, claimDevice, subMsgs): stats about processing of the internal system messages. Some stats descriptions:
    • toDevRpc: number of processed RPC responses from Transport services
    • sessionEvents: number of session events from Transport services
    • subInfo: number of subscription infos from Transport services
    • subToAttr: number of subscribes to attribute updates from Transport services
    • subToRpc: number of subscribes to RPC from Transport services
    • getAttr: number of ‘get attributes’ requests from Transport services
    • claimDevice: number of Device claims from Transport services
    • deviceState: number of processed changes to Device State
    • subMsgs: number of processed subscriptions
    • coreNfs: number of processed specific ‘system’ messages
  • jsInvoke (statsNames - requests, responses, failures): stats about total, successful and failed requests to the JS executors
  • attributes_cache (results - hit, miss): stats about how much attribute requests went to the cache

transport metrics:

  • transport (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about requests received by Transport from Core
  • ruleEngine_producer (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about pushing messages from Transport to the Rule Engine.
  • core_producer (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about pushing messages from Transport to the TB node Device actor.
  • transport_producer (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about requests from Transport to the Core.

PostgreSQL-specific metrics:

  • ts_latest_queue_${index_of_queue} (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about writing latest telemetry to the database. Note that there are several queues (threads) in order to reach maximum performance.
  • ts_queue_${index_of_queue} (statsNames - totalMsgs, failedMsgs, successfulMsgs): stats about writing telemetry to the database. Note that there are several queues (threads) in order to reach maximum performance.

Getting help

If your problem isn’t answered by any of the guides above, feel free to contact ThingsBoard team.

Contact us