These days I'm studying the kernel's internal network code, especially the RPS code. There are a lot of functions involved, but I am focusing on the ones that do SMP queue processing, such as `enqueue_to_backlog` and `process_backlog`.

I'm wondering about the synchronization between two cores (or on a single core) through these two functions, `enqueue_to_backlog` and `process_backlog`.
In those functions, one core (A) holds the `spin_lock` of another core (B) while queueing packets into B's `input_pkt_queue` and scheduling B's NAPI. Core B holds the same `spin_lock` while splicing its `input_pkt_queue` onto its `process_queue` and clearing its own NAPI schedule. I know that the `spin_lock` must be held to prevent the two cores from accessing the same queue at the same time while it is being processed.

But I can't understand why the `spin_lock` is taken with `local_irq_disable` (or `local_irq_save`). I think that nothing in interrupt context (the top half) touches B's queues or B's `rps_lock` when an interrupt (TH) preempts the current context (softirq, BH). Of course, the `napi` struct can be accessed by the TH to schedule NAPI, but it keeps IRQs disabled until the packet has been queued. So I wonder why the `spin_lock` is taken with IRQs disabled.
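For reference, the pattern I'm asking about looks roughly like this (my simplified sketch of `enqueue_to_backlog` from net/core/dev.c; the exact code varies by kernel version, and the qtail and backlog-limit handling is omitted):

```c
static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
{
	/* per-CPU queue state of the *target* core (B) */
	struct softnet_data *sd = &per_cpu(softnet_data, cpu);
	unsigned long flags;

	local_irq_save(flags);   /* disable IRQs on the *current* core (A) */
	rps_lock(sd);            /* spin_lock(&sd->input_pkt_queue.lock)   */

	__skb_queue_tail(&sd->input_pkt_queue, skb);

	/* if B's backlog NAPI isn't already scheduled, schedule it */
	if (!__test_and_set_bit(NAPI_STATE_SCHED, &sd->backlog.state))
		____napi_schedule(sd, &sd->backlog);

	rps_unlock(sd);
	local_irq_restore(flags);
	return NET_RX_SUCCESS;
}
```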
I think it is impossible for another BH, such as a tasklet, to preempt the current context (NAPI, softirq). Is that true? I also want to know whether `local_irq_disable` disables IRQs on all cores or literally just on the current core. I've read a book on kernel development, but I don't think I understand preemption well enough.
Would you explain why the RPS code uses `spin_lock` together with `local_irq_disable`?
Disabling interrupts affects the current core (only). When disabled, therefore, no other code on the same core will be able to interfere with an update to a data structure. The point of spinlocks is to extend the "lock-out" to other cores (although it's cooperative, not hardware-enforced).
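As a minimal sketch of the combined pattern (hypothetical names, not actual kernel code), the two halves of `spin_lock_irqsave` divide the work exactly that way:

```c
#include <linux/list.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(queue_lock);   /* cooperative cross-core exclusion */
static LIST_HEAD(queue);

static void queue_item(struct list_head *item)
{
	unsigned long flags;

	/* IRQ-disable half: no interrupt handler on *this* core can run
	 * and touch the queue while we hold the lock.
	 * Spinlock half: any *other* core trying to take queue_lock
	 * busy-waits until we release it. */
	spin_lock_irqsave(&queue_lock, flags);
	list_add_tail(item, &queue);
	spin_unlock_irqrestore(&queue_lock, flags);
}
```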
It's dangerous/irresponsible to take a spin lock in the kernel without disabling interrupts because, when an interrupt then occurs, the current code is suspended, and now you are preventing other cores from making progress while some unrelated interrupt handler runs (even though no other user process or tasklet on the original core can preempt the lock holder). Other cores might be in an interrupt or BH context themselves, and now you're delaying the entire system. Spin locks are supposed to be held for very brief periods to make critical updates to shared data structures.
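Continuing the sketch above, the risky shape looks like this:

```c
static unsigned long shared_count;

static void bad_update(void)
{
	spin_lock(&queue_lock);   /* bug-prone: IRQs left enabled */
	/* An interrupt can land right here.  While its (possibly long,
	 * unrelated) handler runs, this core still holds queue_lock, so
	 * every other core that tries to take it busy-waits the whole
	 * time. */
	shared_count++;
	spin_unlock(&queue_lock);
}
```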
It's also a good way to generate deadlocks. Consider if the scenario above were replicated in another subsystem (or possibly another device in the same subsystem, but I'll describe the former).
Here, core A takes a spinlock in subsystem 1 without disabling interrupts. At the same time, core B takes a spinlock in subsystem 2, also without disabling interrupts. Now suppose an interrupt related to subsystem 2 arrives on core A, and while executing the subsystem 2 interrupt handler, core A needs to update a structure protected by the spinlock held by core B. At about the same time, a subsystem 1 interrupt arrives on core B, and its handler needs to update a data structure in that subsystem, protected by the spinlock held by core A. Now both cores are busy-waiting for a spinlock held by the other core, and the entire system is frozen until you do a hard reset.
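In code, that scenario looks something like this (hypothetical subsystems and handlers, following the story above):

```c
#include <linux/interrupt.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(subsys1_lock);
static DEFINE_SPINLOCK(subsys2_lock);

/* Core A, process/softirq context, with IRQs enabled (the mistake). */
static void subsys1_work(void)
{
	spin_lock(&subsys1_lock);
	/* ... subsystem 2's interrupt fires on core A right here ... */
	spin_unlock(&subsys1_lock);
}

/* Subsystem 2's handler, now running on core A. */
static irqreturn_t subsys2_irq(int irq, void *dev_id)
{
	spin_lock(&subsys2_lock);   /* held by core B: core A spins */
	/* ... */
	spin_unlock(&subsys2_lock);
	return IRQ_HANDLED;
}

/* Meanwhile core B holds subsys2_lock (also with IRQs enabled) and
 * takes subsystem 1's interrupt; its handler spins on subsys1_lock,
 * still held by core A.  Neither handler can return and neither lock
 * is ever released: a classic AB-BA deadlock.  Taking both locks with
 * spin_lock_irqsave() breaks it, because neither core can be
 * interrupted while holding its lock. */
static irqreturn_t subsys1_irq(int irq, void *dev_id)
{
	spin_lock(&subsys1_lock);   /* held by core A: core B spins */
	/* ... */
	spin_unlock(&subsys1_lock);
	return IRQ_HANDLED;
}
```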