I'm trying to create a small home monitoring system. I have a series of wireless transmitters that transmit measurement data to a base station. I can query that base station using Modbus RTU to find out the latest measurement values from each transmitter.
To store and visualize the measurements, I'm using InfluxDB and Grafana. I have everything running on a Raspberry Pi Model 3B+, including the RS-485 communication to the base station.
I have chosen to use Python to read the data over Modbus RTU and forward it to InfluxDB for storage, because Python has ready-made libraries for both. However, I'm struggling to make the Python script stable. Inevitably I get CRC errors in the Modbus transmission every now and then, and the script seems to get stuck when the minimalmodbus library raises one of these exceptions.
I'm not sure how I should tackle this problem.
At the moment I'm using a try-except-else structure, but because I'm a complete newbie in Python I can't get it to work the way I want. It's okay if I lose a single measurement point: if I get a CRC error, I can just discard that measurement and carry on as if nothing had happened.
The (minimized) code that I'm using at the moment looks like this:
#!/usr/bin/env python
import minimalmodbus
import time
from influxdb import InfluxDBClient

# influxdb stuff
influx = InfluxDBClient(host='localhost', port=8086)
influx.switch_database('dbname')

# minimalmodbus stuff
minimalmodbus.BAUDRATE = 9600
instrument = minimalmodbus.Instrument('/dev/ttyUSB0', 1)

errorcounter = 0
cyclecounter = 0

while True:
    try:
        # read register 247 with 1 decimal using function code 4
        sid1te = instrument.read_register(247, 1, 4)
        print "SID 1 TE:", sid1te
        influxquery = [
            {"measurement": "sid1", "fields": {"te": sid1te}},
            {"measurement": "system", "fields": {"errorcounter": errorcounter}},
            {"measurement": "system", "fields": {"cyclecounter": cyclecounter}}
        ]
        print "InfluxDB query result:", influx.write_points(influxquery)
    except Exception as error:
        print "[!] Exception occurred: ", error
        errorcounter = errorcounter + 1
    else:
        print "[i] One cycle completed."
        cyclecounter = cyclecounter + 1
    time.sleep(30)
What ends up happening is that the script can run for hours like a dream, and then, when a single CRC error occurs in the transmission, it enters a never-ending loop of exceptions like this:
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
[!] Exception occurred: Checksum error in rtu mode: '\xeb\xf9' instead of 'p\x97' . The response is: '\x7f\x01\x04\x02\x00\xeb\xf9' (plain response: '\x7f\x01\x04\x02\x00\xeb\xf9')
When I break out of this using CTRL-C, the script actually appears to be in the sleep command:
^CTraceback (most recent call last):
  File "temp.py", line 92, in <module>
    time.sleep(30)
KeyboardInterrupt
So I'm puzzled as to why it's not outputting normal print commands to the console if it's actually in the program loop.
In the actual script I have three dozen instrument.read_register calls, so I'm not sure whether I should write a separate function that handles exceptions for each read_register call, or do something else. I've tried half a dozen variations of this code over the past week, but the data I get in Grafana is just abysmal because the script keeps getting stuck in exception loops.
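A per-call handler of the kind described above could look like the following minimal sketch. The names read_or_none, collect_fields and FakeInstrument, the register 248 and the field name "rh" are all hypothetical and only serve the illustration. FakeInstrument stands in for minimalmodbus.Instrument so the sketch runs without hardware. Catching IOError and ValueError is an assumption about which exception types the installed minimalmodbus version raises for timeouts and checksum errors.

```python
def read_or_none(instrument, register, decimals=1, functioncode=4):
    """Return one register value, or None if this single read fails."""
    try:
        return instrument.read_register(register, decimals, functioncode)
    except (IOError, ValueError) as error:  # assumed exception types
        print("[!] Read of register %d failed: %s" % (register, error))
        return None


def collect_fields(instrument, registers):
    """Read every field-name -> register pair and skip the reads that failed."""
    fields = {}
    failures = 0
    for name, register in sorted(registers.items()):
        value = read_or_none(instrument, register)
        if value is None:
            failures += 1  # lose this one point, keep the rest of the cycle
        else:
            fields[name] = value
    return fields, failures


class FakeInstrument(object):
    """Hypothetical stand-in for minimalmodbus.Instrument (no hardware needed)."""

    def __init__(self, bad_registers):
        self.bad_registers = set(bad_registers)

    def read_register(self, register, decimals, functioncode):
        if register in self.bad_registers:
            raise ValueError("Checksum error in rtu mode (simulated)")
        return 21.5


# Register 248 is made to fail, so only "te" ends up in the result.
fields, failures = collect_fields(FakeInstrument([248]), {"te": 247, "rh": 248})
print(fields)
print(failures)
```

With this layout a failed read costs only that one value: the remaining registers are still read in the same cycle, and the failure count can be written to InfluxDB alongside them.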
Any suggestions?
I tried this with a different USB/RS-485 converter as well as different Modbus RTU gear, and the same problem persists. I'm 99% confident the problem is with Linux serial port handling. I'm not sure how or why, but it sometimes hands pyserial/minimalmodbus incomplete byte sequences, causing minimalmodbus to evaluate erroneous sequences as responses. For example, minimalmodbus complains that 0x11 is too short a response, and then immediately afterwards it complains that 0x03 0x06 0xAE 0x41 0x56 0x52 0x43 0x40 0x49 0xAD has a CRC error. In reality, if these two messages had been received as one, they would form a perfectly valid response.
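The claim that the two fragments form one valid frame can be checked with a small, library-independent sketch of the Modbus RTU CRC-16 (initial value 0xFFFF, reflected polynomial 0xA001, CRC appended low byte first). The helper names crc16_modbus and frame_is_valid are hypothetical and not part of minimalmodbus.

```python
def crc16_modbus(data):
    """Compute the Modbus RTU CRC-16 over a sequence of bytes."""
    crc = 0xFFFF
    for byte in bytearray(data):
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def frame_is_valid(frame):
    """Check a full RTU frame whose last two bytes are the CRC, low byte first."""
    frame = bytearray(frame)
    if len(frame) < 4:  # address + function code + 2 CRC bytes at minimum
        return False
    received = frame[-2] | (frame[-1] << 8)
    return crc16_modbus(frame[:-2]) == received


# The two fragments exactly as they were reported in the error messages.
first = bytearray([0x11])
second = bytearray([0x03, 0x06, 0xAE, 0x41, 0x56, 0x52, 0x43, 0x40, 0x49, 0xAD])

first_ok = frame_is_valid(first)
second_ok = frame_is_valid(second)
joined_ok = frame_is_valid(first + second)
print(first_ok)
print(second_ok)
print(joined_ok)
```

Each fragment fails the check on its own, while the joined frame passes: the CRC over 0x11 0x03 0x06 0xAE 0x41 0x56 0x52 0x43 0x40 is 0xAD49, which is transmitted as 0x49 0xAD.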
I do not know how to navigate this problem, but I'm pretty sure the problem is deeper than Python level.
EDIT: It is not a hardware/Linux problem after all, but a problem with Python/pyserial/minimalmodbus. I hacked together a Python script that executes an external C-language Modbus RTU query program and parses its output. It has worked like a charm 100% of the time for more than a week now.
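The general shape of such a wrapper can be sketched as follows. The post names neither the external program nor its output format, so everything specific here is an assumption: the function name query_external is hypothetical, the output is assumed to be one "register value" pair per line, and a short Python one-liner stands in for the C program so the sketch runs anywhere.

```python
import subprocess
import sys


def query_external(command):
    """Run an external query program and parse 'register value' lines."""
    output = subprocess.check_output(command, universal_newlines=True)
    values = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore anything that is not a register/value pair
        try:
            values[int(parts[0])] = float(parts[1])
        except ValueError:
            continue  # ignore lines that do not parse as numbers
    return values


# Stand-in for the real C program (hypothetical output format).
FAKE_QUERY = [sys.executable, "-c", "print('247 21.5'); print('garbage')"]

readings = query_external(FAKE_QUERY)
print(readings)
```

In a real setup the command list would name the C program and its arguments instead of FAKE_QUERY, and a non-zero exit status would surface as subprocess.CalledProcessError, which can be caught per cycle.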