
pci_driver.probe not being called


I'm getting started in Linux Device Driver development for a PCI device connected via a laptop's PCIe expansion slot.

On boot, everything works beautifully. However, I'm trying to get basic Hotplug support online. When I eject the card, I can see (in dmesg) that the proper remove stuff is called. However, when the card is re-inserted, nothing happens. If I manually remove the module, and then insert the card (or insert the card after boot), then I can see the module's init is called, but not probe. Also, the device doesn't appear in lspci output.

However, if I echo 1 > /sys/bus/pci/rescan then it appears in lspci output, but the module fails to load with errors (pci_enable_device failed with code -22).

Any ideas where to even start diagnosing this? The failure to exec .probe is what's really puzzling me.

I should mention that this is an FPGA board connected here, so it's possible there's something wrong in the device itself, but I would still expect probe to run and then fail with a weird error later.


Solution

  • If the device doesn't show up in lspci, there is no chance that the .probe function of your driver will be called, because the device is not in the kernel's list of PCI devices, so there is nothing for the driver to be matched against.

    When you do a PCI bus rescan and the device is then seen by lspci, that doesn't mean the device is accessible. Try lspci -vv -s BB:DD (where BB:DD is the bus number and device number as reported by lspci). I expect you will see 0xFF in many registers (in particular the BARs). I guess this would be the reason why pci_enable_device fails.

    I have a similar problem with an FPGA device when I reload the bitfile while the system is running. One possible cause of your problem is that the configuration space registers are reset. You could try to save the configuration space before you remove the board (as root):

    cp /sys/bus/pci/devices/0000\:BB\:DD.0/config ~/config.save
    

    then to restore it:

    cp ~/config.save /sys/bus/pci/devices/0000\:BB\:DD.0/config
    

    I've had this method work on some hardware but not on other, newer hardware.
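
The check and the save/restore steps above can be wrapped in a few shell functions. This is only a rough sketch: the function names and the SYSFS_PCI variable are made up for illustration, and checking only the first 64 bytes (the standard configuration header) is my own assumption about what counts as "the device is not answering". Run it as root, and replace 0000:BB:DD.0 with your device's address.

```shell
#!/bin/sh
# Sketch only: SYSFS_PCI and the function names below are illustrative,
# not part of any standard tool.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# Return 0 if the first 64 bytes of the given config file all read 0xFF,
# which usually means the device does not respond to configuration reads.
config_reads_all_ff() {
    cfg="$1"
    # Dump 64 bytes as hex and strip whitespace.
    hex=$(od -An -v -tx1 -N64 "$cfg" | tr -d ' \n')
    # All-0xFF means nothing is left once every 'f' is deleted.
    [ -n "$hex" ] && [ -z "$(printf '%s' "$hex" | tr -d 'f')" ]
}

# Save the configuration space of a device (e.g. 0000:BB:DD.0) to a file.
save_config() {
    cp "$SYSFS_PCI/$1/config" "$2"
}

# Write a previously saved configuration space back to the device.
restore_config() {
    cp "$2" "$SYSFS_PCI/$1/config"
}
```

A typical sequence would be: save_config 0000:BB:DD.0 ~/config.save before ejecting the card; after re-inserting it and doing echo 1 > /sys/bus/pci/rescan, call config_reads_all_ff /sys/bus/pci/devices/0000:BB:DD.0/config to see whether the device answers at all; and only if the registers look sane but reset, restore_config 0000:BB:DD.0 ~/config.save before loading the module.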