I've got some Python 3.7 BLE scanning code which typically runs fantastic on a RPi3 device in production. However, recently I've seen devices* introduced into the environment that will crash the BLE Scanner in ways that I don't know how to prevent/detect.
*Note: A Windows 10 Lenovo laptop with Bluetooth chip Qualcomm Atheros QCA61x4 Bluetooth 4.1
can bring this code to its knees. I've also heard that folks running the above next to a BLE Beacon site also crash frequently.
The crash occurs during the pkt = my_sock.recv(255)
command and the exception is a _bluetooth.error (32, 'Broken Pipe')
Here is a minimal code example below that demonstrates the problem:
import logging
import select
import struct
import sys
import time
from subprocess import check_output, STDOUT, CalledProcessError
import bluetooth._bluetooth as bluez
ROOT_LOGGER = logging.getLogger()
ROOT_LOGGER.setLevel(logging.DEBUG)
ROOT_LOGGER.addHandler(logging.NullHandler())
SYS_CTL_LOG_FMT = '%(filename)-28s [%(lineno)-4d] %(levelname)-7s %(message)s'
SYSCTL_HANDLER = logging.StreamHandler(sys.stdout)
SYSCTL_HANDLER.setLevel(logging.DEBUG)
SYSCTL_HANDLER.setFormatter(logging.Formatter(SYS_CTL_LOG_FMT))
ROOT_LOGGER.addHandler(SYSCTL_HANDLER)
logger = logging.getLogger(__name__)
DEV_ID = 0
OGF_LE_CTL = 0x08
OCF_LE_SET_SCAN_PARAMETERS = 0x000B
OCF_LE_SET_SCAN_ENABLE = 0x000C
SCAN_RANDOM = 0x01
OWN_TYPE = SCAN_RANDOM
SCAN_TYPE = 0x01
INTERVAL = 0x10
WINDOW = 0x10
FILTER = 0x00 # all advertisements, not just whitelisted devices
ENABLE = 0x01
if __name__ == '__main__':
while True:
my_sock = bluez.hci_open_dev(DEV_ID)
my_sock.settimeout(30.0)
cmd_pkt = struct.pack("<BBBBBBB", SCAN_TYPE, 0x0, INTERVAL, 0x0, WINDOW, OWN_TYPE, FILTER)
bluez.hci_send_cmd(my_sock, OGF_LE_CTL, OCF_LE_SET_SCAN_PARAMETERS, cmd_pkt)
cmd_pkt = struct.pack("<BB", ENABLE, 0x00)
bluez.hci_send_cmd(my_sock, OGF_LE_CTL, OCF_LE_SET_SCAN_ENABLE, cmd_pkt)
flt = bluez.hci_filter_new()
bluez.hci_filter_all_events(flt)
bluez.hci_filter_set_ptype(flt, bluez.HCI_EVENT_PKT)
my_sock.setsockopt(bluez.SOL_HCI, bluez.HCI_FILTER, flt)
try:
packets_received = 0
while True:
ready_to_read, ready_to_write, in_error = select.select([my_sock, ], [my_sock, ], [my_sock, ], 5)
if len(ready_to_read) == 0:
time.sleep(0.001)
continue
try:
pkt = my_sock.recv(255)
except bluez.error as exc_data:
logger.error(f'Received a _bluetooth.error while trying to read. Aborting: {exc_data}')
raise
packets_received += 1
ptype, event, plen = struct.unpack("BBB", pkt[:3])
logger.info(f'{packets_received} {ptype}, {event}, {plen}')
except bluez.error:
my_sock.close()
while True:
# this loops until hciconfig is able to successfully restart
try:
check_output('sudo hciconfig hci0 up', shell=True, stderr=STDOUT)
except CalledProcessError as exc_data:
logger.warning(f'{type(exc_data)}: {exc_data}')
continue
time.sleep(1)
break
Output looks like this:
pi@raspberrypi:~/my_test $ sudo python3 distilled_test.py
distilled_test.py [63 ] INFO 1 4, 14, 4
distilled_test.py [63 ] INFO 2 4, 14, 4
distilled_test.py [63 ] INFO 3 4, 62, 27
distilled_test.py [63 ] INFO 4 4, 62, 26
distilled_test.py [63 ] INFO 5 4, 62, 12
distilled_test.py [63 ] INFO 6 4, 62, 31
distilled_test.py [63 ] INFO 7 4, 62, 31
distilled_test.py [63 ] INFO 8 4, 62, 31
distilled_test.py [63 ] INFO 9 4, 62, 31
distilled_test.py [63 ] INFO 10 4, 62, 31
distilled_test.py [63 ] INFO 11 4, 62, 31
distilled_test.py [63 ] INFO 12 4, 62, 31
distilled_test.py [63 ] INFO 13 4, 62, 31
distilled_test.py [63 ] INFO 14 4, 62, 31
distilled_test.py [63 ] INFO 15 4, 62, 31
distilled_test.py [63 ] INFO 16 4, 62, 31
distilled_test.py [63 ] INFO 17 4, 62, 31
distilled_test.py [63 ] INFO 18 4, 62, 31
distilled_test.py [63 ] INFO 19 4, 62, 31
distilled_test.py [63 ] INFO 20 4, 62, 31
distilled_test.py [63 ] INFO 21 4, 62, 31
distilled_test.py [63 ] INFO 22 4, 62, 31
distilled_test.py [59 ] ERROR Received a _bluetooth.error while trying to read. Aborting: (32, 'Broken pipe')
distilled_test.py [72 ] WARNING <class 'subprocess.CalledProcessError'>: Command 'sudo hciconfig hci0 up' returned non-zero exit status 1.
distilled_test.py [72 ] WARNING <class 'subprocess.CalledProcessError'>: Command 'sudo hciconfig hci0 up' returned non-zero exit status 1.
distilled_test.py [72 ] WARNING <class 'subprocess.CalledProcessError'>: Command 'sudo hciconfig hci0 up' returned non-zero exit status 1.
distilled_test.py [72 ] WARNING <class 'subprocess.CalledProcessError'>: Command 'sudo hciconfig hci0 up' returned non-zero exit status 1.
distilled_test.py [72 ] WARNING <class 'subprocess.CalledProcessError'>: Command 'sudo hciconfig hci0 up' returned non-zero exit status 1.
distilled_test.py [63 ] INFO 1 4, 14, 4
distilled_test.py [63 ] INFO 2 4, 14, 4
distilled_test.py [63 ] INFO 3 4, 62, 27
distilled_test.py [63 ] INFO 4 4, 62, 26
distilled_test.py [63 ] INFO 5 4, 62, 40
distilled_test.py [63 ] INFO 6 4, 62, 39
My theory is that the new BLE Broadcasting device is flooding the RPi with Bluetooth traffic to the point where I'm not ingesting it fast enough and the Bluetooth service closes the socket. Advice?
Raspbian Buster Lite bluez-5.52.tar.xz gattlib-0.20150805 pybluez-0.23 Python 3.7.3
I should also note that this Lenovo/Qualcomm laptop Bluetooth advertising is enough to cause my go-to Android app nRF Connect
to repeatedly cycle Bluetooth. Though I realize I can't prevent the Lenovo/Qualcomm from being naughty, I still feel like I need to protect my app from crashing due to the Bluetooth noise.
So, as it turns out, the broken pipe
really does in-fact mean broken pipe
... imagine that.
I wired up the project to a RPi4 and was able to see the code processing Bluetooth messages fast enough to keep up. As I had supposed in the original question, the RPi3 code was not keeping up with the rate that the Bluetooth chip was receiving messages and at some point, some sort of buffer/pipe/queue filled up and the Bluetooth (Bluez, probably) broke the pipe.