I have a device on a CAN bus that, in certain scenarios, either sends no CRC or sends a corrupted CRC. The device is third-party, so it cannot be altered. I don't mind if Linux ignores the malformed frame, but it must not generate an error frame. Does anyone know if this is possible? Does this happen in the driver or in the CAN controller chip?
I'm also not sure whether this is something I can control through the SocketCAN API or with a command-line call to ip link.
It turns out that the problem was that the timing of the other device was slightly off, and setting the SJW (synchronization jump width) to 4 (instead of 1) fixed it. The SJW is the maximum number of time quanta by which the controller may lengthen or shorten a bit to resynchronize with the sender. Apparently, the clocks on some CAN controllers can be problematic or, as in this particular case, become problematic when they heat up. At slower bit rates (I'm running my bus at 250 kbit/s), a larger SJW gives the CAN chips more tolerance and helps prevent clock drift from causing problems.
The actual commands I issued to fix it were (using canE as the device; yours is probably can0):
ip link set down canE
ip link set canE type can tq 50 prop-seg 37 phase-seg1 32 phase-seg2 10 sjw 4
ip link set up canE
The last part ("sjw 4") is the change; I took the other values in this command from the details reported for the CAN port:
ip -statistics -details link show canE
I simply repeated those values and set the SJW to 4. I don't know much about these timings, but the kernel CAN README has some information, as well as links to other documents.
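As a rough sanity check on the values in the ip link command above, the arithmetic can be sketched in shell. This assumes tq is given in nanoseconds (as ip link expects) and that each bit consists of one sync quantum plus prop-seg, phase-seg1 and phase-seg2; the variable names are my own and not part of any tool.

```shell
#!/bin/sh
# Bit-timing values copied from the ip link command above.
tq=50          # time quantum, assumed to be in nanoseconds
prop_seg=37
phase_seg1=32
phase_seg2=10
sjw=4

# Every CAN bit starts with a fixed sync segment of one time quantum.
sync_seg=1

# Total time quanta per bit and the resulting bit time.
quanta=$((sync_seg + prop_seg + phase_seg1 + phase_seg2))
bit_time_ns=$((quanta * tq))

# Bit rate in bit/s (1 s = 1000000000 ns).
bitrate=$((1000000000 / bit_time_ns))

# The sample point sits at the end of phase-seg1, expressed here per mille.
sample_point_permille=$((1000 * (sync_seg + prop_seg + phase_seg1) / quanta))

# Largest resynchronization step as a share of one bit, per mille.
sjw_permille=$((1000 * sjw / quanta))

echo "quanta per bit: $quanta"
echo "bit time:       $bit_time_ns ns"
echo "bit rate:       $bitrate bit/s"
echo "sample point:   $sample_point_permille per mille"
echo "max resync:     $sjw_permille per mille of a bit"
```

With these numbers the bit is 80 quanta long, so an SJW of 4 lets the controller correct by up to 5% of a bit per resynchronization, compared with 1.25% for an SJW of 1.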