I have been trying to get SMP support working again on a port of the Linux/MIPS kernel to the SGI Octane (IP30) for the last few weeks. Uniprocessor support works fine, but I am running into a lot of problems with the second CPU. I can boot the machine to the init process, but that dies with either a SIGSEGV or SIGBUS most of the time. I have most of the support code in place from patches written 5+ years ago, but I suspect I am either not locking things properly or I am re-enabling IRQs unexpectedly.
Some background on the hardware:
The MIPS R10000-series CPU implements 8 interrupts, IP0 to IP7:
  IP0 and IP1: Software interrupts only and are currently not used for much.
  IP2 to IP6: Generally routed to some other hardware function for handling.
  IP7: The R10K timer/counter/compare interrupt.
R10K supports the MIPS-IV ISA, and has both an I-cache and D-cache.
Octane has an ASIC called HEART as both its memory controller and interrupt controller. HEART was designed to support up to 4 processors and has 64 interrupts (IRQs) available. These 64 IRQs are divided into several priority levels and are mapped to the R10K CPU IPx IRQs above:
  Level 0: 0 to 15 -> CPU IP2
  Level 1: 16 to 31 -> CPU IP3
  Level 2: 32 to 49 -> CPU IP4
  Level 3: 50 -> CPU IP5
  Level 4: 51 to 63 -> CPU IP6
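The mapping above can be written as a small lookup. This is only a sketch: `heart_irq_to_cpu_ip()` is a hypothetical helper name, not something from the existing IP30 patches, and it assumes the Level 2 range starts at IRQ 32.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the CPU interrupt line (2..6, meaning
 * IP2..IP6) that a given HEART IRQ number (0..63) is routed to.
 */
static int heart_irq_to_cpu_ip(unsigned int irq)
{
	assert(irq < 64);

	if (irq <= 15)
		return 2;	/* Level 0: device IRQs */
	if (irq <= 31)
		return 3;	/* Level 1: device IRQs */
	if (irq <= 49)
		return 4;	/* Level 2: devices, power button, debugger, IPI */
	if (irq == 50)
		return 5;	/* Level 3: HEART counter/compare timer */
	return 6;		/* Level 4: error IRQs */
}
```

For example, `heart_irq_to_cpu_ip(46)` returns 4, since the IPI for CPU 0 arrives on IP4.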
There are some notes about these priority levels:
Level 0 and Level 1 IRQs are primarily assigned to devices in the system (SCSI, ethernet, etc).
Level 2 has several uses:
  32 to 40 are also available for use by devices in the system (especially those that need a higher priority).
  41 is hardwired for power button presses.
  42 to 45 are for debugger signals to the 4 possible CPUs.
  46 to 49 are SMP interprocessor interrupts (IPI) for the 4 possible CPUs.
Level 3, IRQ 50, is specifically for the counter/compare timer on the HEART itself. It runs at 12.5MHz (80ns, I think). It has a single count register and compare register. From a Linux clockevent standpoint, I think this is a better resolution timer for use as the system timer (52-bit counter, 24-bit compare).
Level 4 is for error IRQs:
  51 to 58 are error IRQs for each of the 8 available Xtalk widgets on the XIO Bus (a high-speed bus arranged in a star topology, serviced by the XBOW ASIC).
  59 to 62 are bus error IRQs for the 4 possible CPUs.
  63 is the exception error IRQ for HEART itself.
HEART presents several registers for working with interrupts. Each register is 64 bits wide, one bit per interrupt:
  HEART_ISR - Read-only register to get the list of pending interrupts.
  HEART_SET_ISR - Write-only register to set a specific interrupt bit.
  HEART_CLR_ISR - Write-only register to clear a specific interrupt bit.
  HEART_IMR(x) - Read/write register to set or clear the interrupt mask for a specific interrupt on a specific CPU, represented by x.
I use the following code for the basic IRQ ack/mask/unmasking operations:
    u64 *imr;                        /* Address of the mask register to work on */
    static int heart_irq_owner[64];  /* Which CPU owns which IRQ? (global) */

    Ack:    writeq((1UL << irq), HEART_CLR_ISR);

    Mask:   imr = HEART_IMR(heart_irq_owner[irq]);
            writeq(readq(imr) & (~(1UL << irq)), imr);

    Unmask: imr = HEART_IMR(heart_irq_owner[irq]);
            writeq(readq(imr) | (1UL << irq), imr);
These basic operations are implemented using the struct irq_chip accessors within the 3.1x-series Linux kernel, and I protect access to the HEART registers using spin_lock_irqsave and spin_unlock_irqrestore. I am not 100% certain if I should be using those locking functions in these accessors.
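To reason about these three operations off-target, here is a minimal user-space model of them. Everything in it is an assumption made for illustration: `sim_isr` and `sim_imr[]` are plain variables standing in for the HEART registers, and the `sim_heart_*` names are hypothetical. It shows that mask and unmask are read-modify-write sequences on a per-CPU mask register (the part that can race if two CPUs touch the same register), while ack is a single write.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_MAX_CPUS 4

/* Simulated registers; on real hardware these are MMIO locations. */
static uint64_t sim_isr;                 /* stands in for HEART_ISR */
static uint64_t sim_imr[SIM_MAX_CPUS];   /* stands in for HEART_IMR(cpu) */

/* Ack: on hardware this is one write of the bit to HEART_CLR_ISR. */
static void sim_heart_ack(unsigned int irq)
{
	assert(irq < 64);
	sim_isr &= ~(UINT64_C(1) << irq);   /* models the effect of that write */
}

/* Mask: read the owner's IMR, clear the bit, write it back. */
static void sim_heart_mask(unsigned int cpu, unsigned int irq)
{
	assert(cpu < SIM_MAX_CPUS && irq < 64);
	sim_imr[cpu] &= ~(UINT64_C(1) << irq);
}

/* Unmask: read the owner's IMR, set the bit, write it back. */
static void sim_heart_unmask(unsigned int cpu, unsigned int irq)
{
	assert(cpu < SIM_MAX_CPUS && irq < 64);
	sim_imr[cpu] |= UINT64_C(1) << irq;
}

/* IRQs that are both pending and enabled for the given CPU. */
static uint64_t sim_heart_pending(unsigned int cpu)
{
	assert(cpu < SIM_MAX_CPUS);
	return sim_isr & sim_imr[cpu];
}
```

A pending but masked IRQ does not show up in `sim_heart_pending()` until it is unmasked for that CPU, which mirrors how the ISR and IMR values are combined in the dispatch code.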
For processing all interrupts, the standard Linux/MIPS platform dispatch function takes the following actions:
  IP7 -> Calls do_IRQ() to handle the CPU timer IRQ.
  IP6 -> Calls ip30_do_error_irq() to report any HEART errors to syslog.
  IP5 -> Calls do_IRQ() to handle the clockevent IRQ assigned to the HEART timer.
  IP4, IP3, and IP2 -> Calls ip30_do_heart_irq() to handle all HEART IRQs from 0 to 49.
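The dispatch decision can be sketched as a pure function over the pending interrupt lines. This is an illustration rather than the actual dispatch code: the enum and function names are hypothetical, and checking from IP7 downward is an assumption about priority order. The only hardware fact it relies on is that the MIPS Cause and Status registers keep the IP/IM bits in bits 8 to 15.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the handler each interrupt line leads to. */
enum ip30_handler {
	IP30_NONE,
	IP30_CPU_TIMER,     /* IP7 -> do_IRQ() for the CPU timer */
	IP30_HEART_ERROR,   /* IP6 -> ip30_do_error_irq() */
	IP30_HEART_TIMER,   /* IP5 -> do_IRQ() for the HEART timer */
	IP30_HEART_IRQ,     /* IP4/IP3/IP2 -> ip30_do_heart_irq() */
};

/* Extract the pending-and-enabled IP0..IP7 bits (bit n = IPn). */
static unsigned int ip30_pending_ip(uint32_t cause, uint32_t status)
{
	return ((cause & status) >> 8) & 0xff;
}

/* Pick one handler per pass, highest interrupt line first. */
static enum ip30_handler ip30_pick_handler(unsigned int ip)
{
	assert(ip <= 0xff);

	if (ip & (1u << 7))
		return IP30_CPU_TIMER;
	if (ip & (1u << 6))
		return IP30_HEART_ERROR;
	if (ip & (1u << 5))
		return IP30_HEART_TIMER;
	if (ip & ((1u << 4) | (1u << 3) | (1u << 2)))
		return IP30_HEART_IRQ;
	return IP30_NONE;   /* only software interrupts, or nothing pending */
}
```

An interrupt line that is pending in Cause but masked in Status is ignored, so a masked IP7 with a pending IP2 resolves to `IP30_HEART_IRQ`.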
This is the code currently used for ip30_do_heart_irq():
    static noinline void ip30_do_heart_irq(void)
    {
    	int irqnum = 49;
    	int cpu = smp_processor_id();
    	u64 heart_isr = readq(HEART_ISR);
    	u64 heart_imr = readq(HEART_IMR(cpu));
    	u64 irqs = (heart_isr & 0x0003ffffffffffffULL &
    	            heart_imr);

    	/* Poll all IRQs in decreasing priority order */
    	do {
    		if (irqs & (1UL << irqnum))
    			do_IRQ(irqnum);
    		irqnum--;
    	} while (likely(irqnum >= 0));
    }
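As a point of comparison, the same decreasing-priority walk can visit only the bits that are actually set instead of testing all 50 positions. The sketch below is a user-space model, not a drop-in replacement: `heart_pending_irqs()` is a hypothetical name, it records IRQ numbers into an array where the kernel code would call do_IRQ(), and it uses the GCC/Clang builtin `__builtin_clzll` to find the highest set bit (inside the kernel, fls64() plays a similar role).

```c
#include <assert.h>
#include <stdint.h>

/* Bits 0..49: the HEART IRQs handled through IP2, IP3 and IP4. */
#define HEART_L0_L2_MASK UINT64_C(0x0003ffffffffffff)

/*
 * Store the pending, unmasked IRQ numbers in out[] from highest to
 * lowest and return how many were found (at most 50).
 */
static int heart_pending_irqs(uint64_t isr, uint64_t imr, int out[50])
{
	uint64_t irqs = isr & imr & HEART_L0_L2_MASK;
	int n = 0;

	while (irqs) {
		/* Index of the highest set bit; irqs is non-zero here. */
		int irq = 63 - __builtin_clzll(irqs);

		assert(n < 50);
		out[n++] = irq;                    /* kernel: do_IRQ(irq) */
		irqs &= ~(UINT64_C(1) << irq);     /* done with this one */
	}
	return n;
}
```

With IRQs 3, 20 and 46 pending and unmasked, the function reports them in the order 46, 20, 3, and a pending IRQ 50 is ignored because it lies outside the mask.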
When it comes to SMP support, unlike other Linux/MIPS platforms, I do not have something akin to a mailbox register in the hardware to store what kind of IPI action should be taken. The original code uses a global int array (ip30_ipi_mailbox), indexed by the CPUID, for specifying what IPI action to pass on to the other processor.
Additionally, even though HEART was designed to support up to 4 processors, SGI only ever produced a dual CPU module. Therefore, IRQs 44-45, 48-49, and 61-62 are never actually used for anything.
Given these global variables:
#define IPI_CPU(x) (46 + (x))
static DEFINE_SPINLOCK(ip30_ipi_lock);
static u32 ip30_ipi_mailbox[4];
This is the code currently used to send an IPI to the other CPUs:
    static void ip30_send_ipi_single(int cpu, u32 action)
    {
    	unsigned long flags;

    	spin_lock_irqsave(&ip30_ipi_lock, flags);
    	ip30_ipi_mailbox[cpu] |= action;
    	spin_unlock_irqrestore(&ip30_ipi_lock, flags);

    	writeq(1UL << IPI_CPU(cpu), HEART_SET_ISR);
    }
To respond to an IPI, each CPU calls request_irq in its initialization code and registers an interrupt handler. This is the code currently used in the handler to service the IPI interrupt:
    static irqreturn_t ip30_ipi_irq(int irq, void *dev_id)
    {
    	u32 action;
    	int cpu = smp_processor_id();
    	unsigned long flags;

    	spin_lock_irqsave(&ip30_ipi_lock, flags);
    	action = ip30_ipi_mailbox[cpu];
    	ip30_ipi_mailbox[cpu] = 0;
    	spin_unlock_irqrestore(&ip30_ipi_lock, flags);

    	if (action & SMP_RESCHEDULE_YOURSELF)
    		scheduler_ipi();
    	if (action & SMP_CALL_FUNCTION)
    		smp_call_function_interrupt();

    	return IRQ_HANDLED;
    }
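The mailbox protocol itself (sender ORs in action bits, receiver takes and clears them) can be modelled without a lock by using atomic operations. The sketch below uses C11 atomics in user space purely to illustrate the idea; it is not the code from the patches, the `ipi_post()`/`ipi_take()` names are hypothetical, and the two action bit values are defined locally for the example. Whether this is preferable to the spinlock on this hardware is not something the sketch settles.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 4

/* Action bits, defined here only so the example is self-contained. */
#define MODEL_RESCHEDULE_YOURSELF 0x1u
#define MODEL_CALL_FUNCTION       0x2u

/* One mailbox word per CPU; static storage starts out as zero. */
static _Atomic uint32_t ipi_mailbox[MODEL_MAX_CPUS];

/*
 * Sender side: atomically OR the action into the target's mailbox.
 * On hardware, the write to HEART_SET_ISR would follow this.
 */
static void ipi_post(unsigned int cpu, uint32_t action)
{
	assert(cpu < MODEL_MAX_CPUS);
	atomic_fetch_or(&ipi_mailbox[cpu], action);
}

/*
 * Receiver side: fetch all posted actions and clear the mailbox in a
 * single atomic exchange, so no action posted in between is lost.
 */
static uint32_t ipi_take(unsigned int cpu)
{
	assert(cpu < MODEL_MAX_CPUS);
	return atomic_exchange(&ipi_mailbox[cpu], 0);
}
```

Two posts to the same CPU before it runs its handler are merged into one mailbox value, so the receiver sees both action bits in a single `ipi_take()` call and an empty mailbox on the next one.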
And that's the background info.
My current kernel configuration has everything except the framebuffer and the "Impact" video driver stripped out. No PCI, no block layer, no networking, no serial, no keyboard/mouse. I have a ~7 year old initramfs I am loading up that, if everything works, should drop to a bash prompt. However, because it loads into RAM, it's capable of exposing memory corruption rather quickly, and I either get the aforementioned SIGSEGV or SIGBUS errors as a result.
Using remote GDB or the built-in KGDB is not an option at present because of the IOC3 PCI device. IOC3 is a multifunction PCI device that claims to be a single function device and behind it lie the hardware bits for the keyboard/mouse, serial ports, real-time clock, and the ethernet. Code does not exist yet to get around the IOC3 and access the serial ports directly for remote GDB, and KGDB doesn't know how to talk to the standard i8042 keyboard controller on the IOC3, either.
I have a standard PCI serial card added (Moschip-based), but that driver is apparently not endian safe, thus probing for serial ports panics the kernel.
Getting the following questions answered will, I hope, put me on the right path to getting SMP working by allowing me to better identify the faulty code and focus on making it work right:
smp_rmb(), smp_wmb(), etc)?
Any information that can put me on the right path to figuring this out would be appreciated. My hope is to get SMP into a usable state (efficiency is irrelevant, I just need it to work), so I can start working on breaking things up into patches and see about getting it included in the mainline kernel at some point. If I can't get SMP to work, I'll just drop its support and focus on getting the uniprocessor code sent upstream instead.
The bug was ultimately traced to the IRQs not being assigned to their correct flow handlers. I was initially assigning ALL 64 IRQs to use handle_level_irq, which is incorrect for SMP interprocessor interrupts (IPIs). The fix was to assign the 8 CPU-specific interrupts, 42-45 and 46-49, to handle_percpu_irq instead.
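The selection rule behind that fix fits in a few lines. This is a sketch only: `ip30_irq_is_percpu()` is a hypothetical helper, and the comment shows one way it might be used during IRQ setup with the kernel's irq_set_chip_and_handler(), where the chip variable name is also made up.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true for the 8 CPU-specific HEART IRQs
 * (debugger signals 42-45 and IPIs 46-49), false for the rest.
 *
 * In the IRQ setup loop it could drive the handler choice, e.g.:
 *
 *   irq_set_chip_and_handler(irq, &some_heart_irq_chip,
 *                            ip30_irq_is_percpu(irq) ?
 *                            handle_percpu_irq : handle_level_irq);
 *
 * (some_heart_irq_chip is a placeholder name.)
 */
static bool ip30_irq_is_percpu(unsigned int irq)
{
	assert(irq < 64);
	return irq >= 42 && irq <= 49;
}
```

Under this rule the power button (41) and the HEART timer (50) stay on the level handler, while every debugger and IPI interrupt goes through the per-CPU handler.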