c++jsonmqttesp32arduinojson

ESP32, ArduinoJSON and MQTT: Stuck in process chain


I have some trouble for a while with my application and can not figure out why the ESP32 get stuck in the processing chain. The issue randomly occurs after 3 or 4 weeks in production mode. Sometimes after 1 week. It's hard to tell.


UPDATE It's probably some kind of a memory leak. I updated the code and don't expect any memory leak anymore.

So first, let me explain how the process chain looks like.

On ESP startup, the MQTT client (AsyncMQTT_Generic Library) connects to the MQTT broker Mosquitto, subscribes to the topic esp/config/<ESP32_WIFI_MAC_ADDRESS>, receives a message payload as a JSON object

{"idsl":900000000,"dsle":true,"ihttp":1800000,"time":1712487701726,"addr":["C4:7C:8D:6D:31:2A"],"syse":true}

that is deserialized by ArduinoJson (version 6.21.5) and stores it to the RTC storage rtcConfig of ESP32 as follows:

#define MAC_ADDR_LENGTH 18
#define MAX_BLE_NUMBER 5

const char* MQTT_TOPIC_CONFIG = "esp/config/";

char* topicConfigMac;

struct RtcData {
    unsigned int wakeupCnt;
    uint32_t channel;
    uint8_t bssid[6];
    uint8_t systemEnabled;
    uint8_t deepSleepEnabled;
    uint64_t deepSleepInterv;
    uint32_t httpRequestInterv;
    uint8_t hasConfig;
    uint64_t time;
    char bleAddr[MAX_BLE_NUMBER][MAC_ADDR_LENGTH];
};

// Real Time Clock memory
RTC_DATA_ATTR RtcData rtcConfig;

StaticJsonDocument<512> jsonConfig;

// Receiving MQTT message from topic
void onMqttMessage(char* topic, char* payload, const AsyncMqttClientMessageProperties& properties,
    const size_t& len, const size_t& index, const size_t& total) {

    Serial.printf("MQTT received message on topic '%s'\n", topic);
    Serial.printf("  QoS: %u", properties.qos);
    Serial.printf(", dup: %d", properties.dup);
    Serial.printf(", retain: %d", properties.retain);
    Serial.printf(", len: %u", len);
    Serial.printf(", index: %u", index);
    Serial.printf(", total: %u\n", total);

    if (strcmp(topic, topicConfigMac) == 0) {
        deserializePayload(payload);
    } else {
        Serial.printf("MQTT received message topic %s is unequal to %s\n", topic, topicConfigMac);
        enterSleepMode();
    }
}

void deserializePayload(char* payload) {
    Serial.print("Deserialize payload: ");
    jsonConfig.clear();
    DeserializationError error = deserializeJson(jsonConfig, payload);

    if (error) {
        Serial.printf("failed: %s", error.c_str());
        return;
    } else {
        size_t size = serializeJson(jsonConfig, Serial); // printing JSON to Serial for debugging
        Serial.printf(", %u bytes\n", size);

        // Returns false on initial boot as expected
        // Should return true after storing configuration to RTC
        if (isConfigUpToDate()) {
            Serial.println("Configuration is up-to-date");
            publishSensorData();
        } else {
            storeConfigToRtc();
            reboot = true;
            mqttClient.disconnect();
            // Rebooting ESP32; After reboot subscribe to topic again
            // Now, we use the configuration stored in RTC storage
            // isConfigUpToDate() should return true
        }
    }
}

bool isConfigUpToDate() {
    // Is this a valid statement to compare a uint64_t with jsonConfig?
    return rtcConfig.time > 0 && rtcConfig.time == (uint64_t)jsonConfig["time"];
}

void storeConfigToRtc() {
    Serial.println("Store payload to RTC");
    rtcConfig.hasConfig = true;
    rtcConfig.systemEnabled = jsonConfig["syse"];
    rtcConfig.deepSleepEnabled = jsonConfig["dsle"];
    rtcConfig.deepSleepInterv = jsonConfig["idsl"];
    rtcConfig.httpRequestInterv = jsonConfig["ihttp"];
    rtcConfig.time = jsonConfig["time"];
    // BLE addresses
    JsonArray bleAddresses = jsonConfig["addr"].as<JsonArray>();
    for (size_t i = 0; i < MAX_BLE_NUMBER; i++) {
        if (i < bleAddresses.size()) {
            String addr = bleAddresses[i].as<String>();
            strcpy(rtcConfig.bleAddr[i], addr.c_str());
        } else {
            strcpy(rtcConfig.bleAddr[i], "\0");
        }
    }
}

void resetRtc() {
    Serial.println("Reset RTC data");
    rtcConfig.hasConfig = false;
    rtcConfig.systemEnabled = false;
    rtcConfig.deepSleepEnabled = true;
    rtcConfig.deepSleepInterv = 60000000;
    rtcConfig.httpRequestInterv = 60000;
    rtcConfig.time = 0;
    rtcConfig.bleAddr[MAX_BLE_NUMBER][MAC_ADDR_LENGTH] = { 0 };
}

void setup() {
    ...
    if (rtcConfig.wakeupCnt == 0) {
        resetRtc();
    }
    ...
    topicConfigMac = (char*)malloc(strlen(MQTT_TOPIC_CONFIG) + MAC_ADDR_LENGTH);
    strcpy(topicConfigMac, MQTT_TOPIC_CONFIG);
    strcat(topicConfigMac, WiFi.macAddress().c_str());
    ...
}

void loop() {}

Because rtcConfig.time is 0 initially, isConfigUpToDate() returns false and stores the configuration into RTC after that. After a reboot, rtcConfig.time is not 0 anymore and should publish the sensor data to the MQTT server.

From the log files of Mosquitto and analyses, I can tell that the process get stuck on the above method isConfigUpToDate() which will return false after the first expected reboot. But I'm expecting true at this point.

Now, I'm not sure why it get stuck there. There can be several reasons:

Is the logical expression

rtcConfig.time > 0 && rtcConfig.time == (uint64_t)jsonConfig["time"]

technically valid at all time?

Is there any way to confirm without USB debugging that the received JSON attribute time is set as I expect it to be?

Unfortunately, I can not debug the issue via USB port in 'production mode' because it's powered by a battery and solar panel. If I debug it via USB port, the issue never occurs probably because I'm not patient enough to wait for the failing iteration. It only occurs after 3 or 4 weeks in production mode. Sometimes after 1 week. It's hard to tell.

Any idea or input is appreciated! I'm facing the issue over months and can not get rid of it...

Thanks in advance

EDIT

In the following, here is a valid Mosquitto log where everything just works fine:

2024-04-07 04:46:27: New connection from 192.168.178.36:61317 on port 1883.
2024-04-07 04:46:27: New client connected from 192.168.178.36:61317 as esp_dev (p2, c1, k15, u'esp32').
2024-04-07 04:46:27: No will message specified.
2024-04-07 04:46:27: Sending CONNACK to esp_dev (0, 0)
2024-04-07 04:46:27: Received SUBSCRIBE from esp_dev
2024-04-07 04:46:27:    esp/config/0C:B8:15:F4:1F:FC (QoS 1)
2024-04-07 04:46:27: esp_dev 1 esp/config/0C:B8:15:F4:1F:FC
2024-04-07 04:46:27: Sending SUBACK to esp_dev
2024-04-07 04:46:27: Sending PUBLISH to esp_dev (d0, q0, r1, m0, 'esp/config/0C:B8:15:F4:1F:FC', ... (108 bytes))
2024-04-07 04:46:27: Received PUBLISH from esp_dev (d0, q1, r0, m2, 'testTopic', ... (86 bytes))
2024-04-07 04:46:27: Sending PUBLISH to paho82806266301 (d0, q0, r0, m0, 'testTopic', ... (86 bytes))
2024-04-07 04:46:27: Sending PUBACK to esp_dev (m2, rc0)
2024-04-07 04:46:27: Received DISCONNECT from esp_dev
2024-04-07 04:46:27: Client esp_dev disconnected.

And this is the failing Mosquitto log where it gets stuck in an infinite loop. Here you can see that the ESP is actively disconnecting from the broker after receiving the published message. This must be the code in deserializePayload() if isConfigUpToDate() returns false. After 5-7 seconds, it will do it again and again.

2024-04-07 05:31:34: New connection from 192.168.178.36:51254 on port 1883.
2024-04-07 05:31:34: New client connected from 192.168.178.36:51254 as esp_dev (p2, c1, k15, u'esp32').
2024-04-07 05:31:34: No will message specified.
2024-04-07 05:31:34: Sending CONNACK to esp_dev (0, 0)
2024-04-07 05:31:34: Received SUBSCRIBE from esp_dev
2024-04-07 05:31:34:    esp/config/0C:B8:15:F4:1F:FC (QoS 1)
2024-04-07 05:31:34: esp_dev 1 esp/config/0C:B8:15:F4:1F:FC
2024-04-07 05:31:34: Sending SUBACK to esp_dev
2024-04-07 05:31:34: Sending PUBLISH to esp_dev (d0, q0, r1, m0, 'esp/config/0C:B8:15:F4:1F:FC', ... (108 bytes))
2024-04-07 05:31:34: Received DISCONNECT from esp_dev
2024-04-07 05:31:34: Client esp_dev disconnected.
2024-04-07 05:31:40: New connection from 192.168.178.36:60107 on port 1883.
2024-04-07 05:31:41: New client connected from 192.168.178.36:60107 as esp_dev (p2, c1, k15, u'esp32').
2024-04-07 05:31:41: No will message specified.
2024-04-07 05:31:41: Sending CONNACK to esp_dev (0, 0)
2024-04-07 05:31:41: Received SUBSCRIBE from esp_dev
2024-04-07 05:31:41:    esp/config/0C:B8:15:F4:1F:FC (QoS 1)
2024-04-07 05:31:41: esp_dev 1 esp/config/0C:B8:15:F4:1F:FC
2024-04-07 05:31:41: Sending SUBACK to esp_dev
2024-04-07 05:31:41: Sending PUBLISH to esp_dev (d0, q0, r1, m0, 'esp/config/0C:B8:15:F4:1F:FC', ... (108 bytes))
2024-04-07 05:31:41: Received DISCONNECT from esp_dev
...
infinite loop here until the battery is drained from WiFi connecting
...

mosquitto.conf

...
autosave_interval 1800
autosave_on_changes true
persistence true
persistence_location /var/lib/mosquitto/
...

Solution

  • Since, I updated the code due to the memory leak, no more errors occurred in this context. I removed the instruction memset because it corrupted the RTC memory.

    So I'm closing the thread. Thanks @Brits for helping me out, cheers!