Tags: docker, debian, node-red, azure-iot-edge, bullseye

Azure IoT Edge device hosting a Node-RED module stopped working: no space left on device


I installed the Azure IoT Edge runtime on a Raspberry Pi 3 (Debian bullseye) to run my edge modules. Everything worked well until I got this error:

A module runtime error occurred
        caused by: A module runtime error occurred
        caused by: connection error: Connection reset by peer (os error 104)
        caused by: Connection reset by peer (os error 104)

So I ran sudo iotedge check, which produced the following output:

Configuration checks (aziot-identity-service)
---------------------------------------------
√ keyd configuration is well-formed - OK
√ certd configuration is well-formed - OK
√ tpmd configuration is well-formed - OK
√ identityd configuration is well-formed - OK
√ daemon configurations up-to-date with config.toml - OK
√ identityd config toml file specifies a valid hostname - OK
√ aziot-identity-service package is up-to-date - OK
√ host time is close to reference time - OK
√ preloaded certificates are valid - OK
√ keyd is running - OK
√ certd is running - OK
√ identityd is running - OK
√ read all preloaded certificates from the Certificates Service - OK
√ read all preloaded key pairs from the Keys Service - OK
√ check all EST server URLs utilize HTTPS - OK
√ ensure all preloaded certificates match preloaded private keys with the same ID - OK

Connectivity checks (aziot-identity-service)
--------------------------------------------
√ host can connect to and perform TLS handshake with iothub AMQP port - OK
√ host can connect to and perform TLS handshake with iothub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with iothub MQTT port - OK

Configuration checks
--------------------
√ aziot-edged configuration is well-formed - OK
√ configuration up-to-date with config.toml - OK
√ container engine is installed and functional - OK
× configuration has correct URIs for daemon mgmt endpoint - Error
    docker: Error response from daemon: mkdir /var/lib/docker/overlay2/90aaeea51acd3c6e7d8281710a36b5b9ceddff484170687e9364688d06956d6a-init: no space left on device.
    See 'docker run --help'.
√ aziot-edge package is up-to-date - OK
× container time is close to host time - Error
    Could not query local time inside container
√ DNS server - OK
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.
‼ production readiness: Edge Agent's storage directory is persisted on the host filesystem - Warning
    The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
‼ production readiness: Edge Hub's storage directory is persisted on the host filesystem - Warning
    The edgeHub module is not configured to persist its /tmp/edgeHub directory on the host filesystem.
    Data might be lost if the module is deleted or updated.
    Please see https://aka.ms/iotedge-storage-host for best practices.
√ Agent image is valid and can be pulled from upstream - OK
√ proxy settings are consistent in aziot-edged, aziot-identityd, moby daemon and config.toml - OK

Connectivity checks
-------------------
× container on the default network can connect to upstream AMQP port - Error
    Container on the default network could not connect to dc-hub-rnd.azure-devices.net:5671
× container on the default network can connect to upstream HTTPS / WebSockets port - Error
    Container on the default network could not connect to dc-hub-rnd.azure-devices.net:443
× container on the IoT Edge module network can connect to upstream AMQP port - Error
    Container on the azure-iot-edge network could not connect to dc-hub-rnd.azure-devices.net:5671
× container on the IoT Edge module network can connect to upstream HTTPS / WebSockets port - Error
    Container on the azure-iot-edge network could not connect to dc-hub-rnd.azure-devices.net:443
26 check(s) succeeded.
3 check(s) raised warnings. Re-run with --verbose for more details.
6 check(s) raised errors. Re-run with --verbose for more details.
2 check(s) were skipped due to errors from other checks. Re-run with --verbose for more details.

I see that there is a problem with Docker. Why does this error occur? Some more context on my edge modules: I run a Node-RED module in a Docker container that connects to a PLC over Ethernet, collects data from it for delivery to IoT Hub, and also does some message processing. Everything worked fine, and then for some reason this error appeared. I operate several such IoT Edge devices and want to be sure this error won't happen again. Does anyone have a guess at the root cause?

I also wonder how I can resolve this problem.

UPDATE: Further investigation revealed that the disk was indeed full according to df -h. I tried to free up space using the first methods that came up when googling, but that didn't help. Other similar devices running the same modules use only about 10% of the disk. My guess is that the Node-RED module's logs flooded the disk, because I was sometimes seeing messages like 'the logs are too big', after which no logs were shown at all. Is there perhaps some automatic rotation of Node-RED's cached logs that I am missing?
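To narrow down what is actually eating the disk, a minimal sketch of diagnostic commands (assuming Docker's default data root `/var/lib/docker`; adjust if you changed `data-root`):

```shell
# How full is the root filesystem? (this device was at 100%)
df -h /

# Which directories under Docker's data root are largest?
# overlay2/ holds image and container layers; containers/ holds per-container logs.
du -xh --max-depth=1 /var/lib/docker 2>/dev/null | sort -h | tail -n 5

# If the docker CLI can reach the daemon, break usage down by images,
# containers, local volumes, and build cache.
# (Ignore a failure here -- the daemon itself may not start when the disk is full.)
if command -v docker >/dev/null 2>&1; then
    docker system df || true
fi
```

If the `containers/` directory dominates, oversized `*-json.log` files are the likely culprit, which would match the 'the logs are too big' symptom; if `overlay2/` dominates, `docker system prune` (optionally with `-a`, after reviewing what it will remove) reclaims stopped containers, dangling images, and build cache.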


Solution

  • I think all of these errors have the same root cause: your device is out of disk space. Can you verify whether this is the case?

    × configuration has correct URIs for daemon mgmt endpoint - Error
        docker: Error response from daemon: mkdir /var/lib/docker/overlay2/90aaeea51acd3c6e7d8281710a36b5b9ceddff484170687e9364688d06956d6a-init: no space left on device.
        See 'docker run --help'.
    

    After you've cleaned up some space, be sure to check out the production checklist for Azure IoT Edge. It covers configuring image garbage collection and setting up default logging that includes log rotation.
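    The log-rotation part of that checklist comes down to a small container-engine config change. As a sketch (the size and file-count values below are examples, not requirements), /etc/docker/daemon.json on the device could cap each container at three 10 MB log files:

    ```json
    {
        "log-driver": "json-file",
        "log-opts": {
            "max-size": "10m",
            "max-file": "3"
        }
    }
    ```

    Restart the container engine afterwards (e.g. sudo systemctl restart docker). Note that these limits only apply to containers created after the change, so existing modules need to be recreated (for example by redeploying) before their logs start rotating.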