I'm writing my own kernel for a Raspberry Pi 3 (no Bluetooth). The kernel is written in 32-bit ARM assembly and C, and U-Boot boots it.
I found an interrupt vector table and applied it to my code like this:
.globl _ram_entry
_ram_entry:
bl kernel_init
b _ram_entry //
ldr pc,=print_mem1
b print_mem1
b print_mem1
b print_mem2
b print_mem3
b print_mem4
b print_mem1
b print_mem2
b print_mem3
b print_mem4
#define svc_stack 0xa0300000
#define irq_stack 0xa0380000
#define sys_stack 0xa0400000
.global kernel_init
kernel_init:
ldr r0,=0x00080008
mov r1,#0x0000
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
bl main
b _ram_entry
.global print_mem1
print_mem1:
bl print_c_mem1
.global print_mem2
print_mem2:
bl print_c_mem2
.global print_mem3
print_mem3:
bl print_c_mem3
.global print_mem4
print_mem4:
bl print_c_mem4
_ram_entry starts at 0x00080008, which is my interrupt vector table. When I print my memory, 0x00 has the bl kernel_init. All the interrupt handlers just print a simple number.
But if I use swi as in the main code below, the reset handler is called.
int main()
{
R_GPIO_REGS * gp_regs= (R_GPIO_REGS*)GPIO_BASE_ADDRESS;
gp_regs->GPFSEL[1] =0x1000000;
uart_init();
printf("hellow world\n");
vector_memory_dump();
unsigned int destrst=0xea020000;
unsigned int destirq=0xea020000;
unsigned int destswi=0xea020000;
PUT32(MEMZERO,destrst);
PUT32(MEMY,destirq);
PUT32(MEMSWI,destswi);
vector_memory_dump();
//asm("b 0x04");
asm("swi 0"); //which call swi handler on 0x08. I thought.
while(1)
{
gp_regs->GPSET[0]=0x40000;
}
return 0;
}
What's the problem?
So from the tags and such I assume this is a Raspberry Pi 3, in aarch32 mode, probably HYP mode. Note I do appreciate you reading/borrowing some of my code directly or indirectly.
With your code, let's start here:
ldr r0,=0x00080008
mov r1,#0x0000
This isn't technically a bug, but it kind of misses the point of what that copy does.
b print_mem1
b print_mem1
b print_mem2
b print_mem3
b print_mem4
b print_mem1
b print_mem2
b print_mem3
b print_mem4
Combined with these, though, yes, it is a problem: they are position dependent, and the whole idea of having the toolchain create the table for you and then copying it is lost.
Disassembly of section .text:
00080000 <_ram_entry>:
80000: eb00000a bl 80030 <kernel_init>
80004: eafffffd b 80000 <_ram_entry>
80008: e59ff074 ldr pc, [pc, #116] ; 80084 <print_c_mem4+0x4>
8000c: ea000013 b 80060 <print_mem1>
80010: ea000012 b 80060 <print_mem1>
80014: ea000012 b 80064 <print_mem2>
80018: ea000012 b 80068 <print_mem3>
8001c: ea000012 b 8006c <print_mem4>
80020: ea00000e b 80060 <print_mem1>
80024: ea00000e b 80064 <print_mem2>
80028: ea00000e b 80068 <print_mem3>
8002c: ea00000e b 8006c <print_mem4>
When I assemble and then disassemble your code, the ldr pc (which is the right way to do this, but here it lands in the wrong place) reads its address from 0x80084. That is 0x80084 - 0x80008 = 0x7C bytes past the start of the table, which is 0x1F words, so the copy would have to move 32 words to get that far. Your copy is:
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
ldmia r0!,{r2,r3,r4,r5}
stmia r1!,{r2,r3,r4,r5}
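That is four ldmia/stmia pairs of 4 registers each, so 16 words, 0x40 bytes copied. That moves the instructions in the table, but it does not reach the address word at 0x80084 that the ldr pc depends on, and the branches it does move are pc-relative, so the swi vector certainly does not end up pointing at your handler.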
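The arithmetic above can be checked with a small host-side sketch. The function names here are mine, purely for illustration; the only facts used are that in ARM state the pc reads as the instruction address plus 8, and that each register in an ldmia/stmia pair moves one 4-byte word.

```c
#include <stdint.h>
#include <stdbool.h>

/* Address that "ldr pc, [pc, #imm]" at insn_addr reads from.
 * In ARM state the pc reads as insn_addr + 8. */
uint32_t ldr_pc_literal_addr(uint32_t insn_addr, uint32_t imm)
{
    return insn_addr + 8u + imm;
}

/* True if the 4-byte word at addr lies inside a copy of nwords
 * words that starts at src_base. */
bool word_is_copied(uint32_t addr, uint32_t src_base, uint32_t nwords)
{
    return addr >= src_base && addr + 4u <= src_base + 4u * nwords;
}
```

With the numbers from the disassembly, ldr_pc_literal_addr(0x80008, 116) gives 0x80084, and word_is_copied(0x80084, 0x80008, 16) is false: a 16-word copy from 0x80008 stops well short of the word the ldr pc needs.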
When you look at the ARM documentation (again) (armv7-ar, since this is aarch32, or armv7-a compatibility mode), 0x00000008 is the entry point for a supervisor/svc/swi call.
So you need an instruction that gets from 0x00000008 to the desired address/label.
So go back to this example, or whichever example you learned from:
.globl _start
_start:
ldr pc,reset_handler
ldr pc,undefined_handler
ldr pc,swi_handler
ldr pc,prefetch_handler
ldr pc,data_handler
ldr pc,unused_handler
ldr pc,irq_handler
ldr pc,fiq_handler
reset_handler: .word reset
undefined_handler: .word hang
swi_handler: .word hang
prefetch_handler: .word hang
data_handler: .word hang
unused_handler: .word hang
irq_handler: .word irq
fiq_handler: .word hang
reset:
mov r0,#0x80000
mov r1,#0x0000
ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9}
stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9}
ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9}
stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9}
Disassembly of section .text:
00080000 <_stack>:
80000: e59ff018 ldr pc, [pc, #24] ; 80020 <reset_handler>
80004: e59ff018 ldr pc, [pc, #24] ; 80024 <undefined_handler>
80008: e59ff018 ldr pc, [pc, #24] ; 80028 <swi_handler>
8000c: e59ff018 ldr pc, [pc, #24] ; 8002c <prefetch_handler>
80010: e59ff018 ldr pc, [pc, #24] ; 80030 <data_handler>
80014: e59ff018 ldr pc, [pc, #24] ; 80034 <unused_handler>
80018: e59ff018 ldr pc, [pc, #24] ; 80038 <irq_handler>
8001c: e59ff018 ldr pc, [pc, #24] ; 8003c <fiq_handler>
00080020 <reset_handler>:
80020: 00080040 andeq r0, r8, r0, asr #32
00080024 <undefined_handler>:
80024: 00080058 andeq r0, r8, r8, asr r0
00080028 <swi_handler>:
80028: 00080058 andeq r0, r8, r8, asr r0
0008002c <prefetch_handler>:
8002c: 00080058 andeq r0, r8, r8, asr r0
00080030 <data_handler>:
80030: 00080058 andeq r0, r8, r8, asr r0
00080034 <unused_handler>:
80034: 00080058 andeq r0, r8, r8, asr r0
00080038 <irq_handler>:
80038: 0008005c andeq r0, r8, ip, asr r0
0008003c <fiq_handler>:
8003c: 00080058 andeq r0, r8, r8, asr r0
00080040 <reset>:
80040: e3a00702 mov r0, #524288 ; 0x80000
80044: e3a01000 mov r1, #0
80048: e8b003fc ldm r0!, {r2, r3, r4, r5, r6, r7, r8, r9}
8004c: e8a103fc stmia r1!, {r2, r3, r4, r5, r6, r7, r8, r9}
80050: e8b003fc ldm r0!, {r2, r3, r4, r5, r6, r7, r8, r9}
80054: e8a103fc stmia r1!, {r2, r3, r4, r5, r6, r7, r8, r9}
00080058 <hang>:
80058: eafffffe b 80058 <hang>
0008005c <irq>:
8005c: eafffffe b 8005c <irq>
It both forces the 8 words of entry points to launch out of the exception handler table, and puts the handler addresses right after, in the next 8 words, for pc-relative access. So you need to copy 16 words to let the assembler do the work for you and not have to compute these things yourself. Two ldmia/stmia pairs with 8 registers each, that's 16 words. Or, if you prefer, four pairs with 4 registers each; that works too.
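For comparison, here is roughly what that 16-word copy looks like as a C loop. This is only a sketch: copy_vector_table and VECTOR_WORDS are names I made up, and on the real target the destination would be address 0x00000000, which C treats as a null pointer, one reason the examples here do the copy in assembly.

```c
#include <stdint.h>
#include <stddef.h>

/* 8 "ldr pc" instructions followed by their 8 address words. */
#define VECTOR_WORDS 16u

/* Copy the table the toolchain built at src to dst, word by word. */
void copy_vector_table(volatile uint32_t *dst, const volatile uint32_t *src)
{
    for (size_t i = 0; i < VECTOR_WORDS; i++) {
        dst[i] = src[i];
    }
}
```

The point is the count: both halves of the table have to travel together, or the ldr pc instructions end up reading words that were never copied.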
This is what you are after with this whole approach
80008: e59ff018 ldr pc, [pc, #24] ; 80028 <swi_handler>
00080028 <swi_handler>:
80028: 00080058
making the tool do the work for you
What if I do this:
.globl _start
_start:
ldr pc,reset_handler
ldr pc,undefined_handler
ldr pc,swi_handler
ldr pc,prefetch_handler
ldr pc,data_handler
ldr pc,unused_handler
b irq
ldr pc,fiq_handler
reset_handler: .word reset
undefined_handler: .word hang
swi_handler: .word hang
prefetch_handler: .word hang
data_handler: .word hang
unused_handler: .word hang
irq_handler: .word irq
fiq_handler: .word hang
reset:
mov r0,#0x80000
mov r1,#0x0000
ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9}
stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9}
ldmia r0!,{r2,r3,r4,r5,r6,r7,r8,r9}
stmia r1!,{r2,r3,r4,r5,r6,r7,r8,r9}
hang:
b hang
irq:
b irq
I get this
80018: ea00000f b 8005c <irq>
instead of this
80018: e59ff018 ldr pc, [pc, #24] ; 80038 <irq_handler>
The latter is saying read from pc+24. The pc in this case is 8 ahead of the instruction, so that is instruction address + 32, which is instruction address + 0x20.
And this
80018: ea00000f b 8005c <irq>
is saying to branch to the address 0x44 ahead of the instruction address
Now let's disassemble from a different base address, the object for example (rather than the linked elf binary) is an excellent choice
00000000 <_start>:
0: e59ff018 ldr pc, [pc, #24] ; 20 <reset_handler>
4: e59ff018 ldr pc, [pc, #24] ; 24 <undefined_handler>
8: e59ff018 ldr pc, [pc, #24] ; 28 <swi_handler>
c: e59ff018 ldr pc, [pc, #24] ; 2c <prefetch_handler>
10: e59ff018 ldr pc, [pc, #24] ; 30 <data_handler>
14: e59ff018 ldr pc, [pc, #24] ; 34 <unused_handler>
18: ea00000f b 5c <irq>
1c: e59ff018 ldr pc, [pc, #24] ; 3c <fiq_handler>
Notice that the machine code for all the others says: load the word 0x20 bytes ahead of this instruction into the pc.
The branch instead says: branch to 0x44 bytes ahead of this instruction.
We used the toolchain to make that table
00080020 <reset_handler>:
80020: 00080040 andeq r0, r8, r0, asr #32
00080024 <undefined_handler>:
80024: 00080058 andeq r0, r8, r8, asr r0
00080028 <swi_handler>:
80028: 00080058 andeq r0, r8, r8, asr r0
0008002c <prefetch_handler>:
8002c: 00080058 andeq r0, r8, r8, asr r0
00080030 <data_handler>:
80030: 00080058 andeq r0, r8, r8, asr r0
00080034 <unused_handler>:
80034: 00080058 andeq r0, r8, r8, asr r0
00080038 <irq_handler>:
80038: 0008005c andeq r0, r8, ip, asr r0
0008003c <fiq_handler>:
8003c: 00080058 andeq r0, r8, r8, asr r0
If we copy 0x40 bytes from 0x80000 to 0x00000, then when the processor hits the machine code at 0x18, which says read from 0x38 and put that in the program counter, it will get 0008005c, which is the right place.
But if instead it finds
18: ea00000f b 5c <irq>
That means branch to 0x5c where we have no handler.
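To see the position dependence in numbers, here is a small host-side decoder for the ARM b encoding. The function name is hypothetical; the encoding facts it relies on are the signed 24-bit word offset in the low bits of the instruction and the pc reading as instruction address plus 8.

```c
#include <stdint.h>

/* Target of an ARM-state "b" instruction word located at insn_addr. */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t off = (insn & 0x00FFFFFFu) << 2;  /* word offset to bytes */
    if (off & 0x02000000u) {
        off |= 0xFC000000u;                    /* sign-extend 26 bits */
    }
    return insn_addr + 8u + off;               /* wraps modulo 2^32 */
}
```

branch_target(0x80018, 0xea00000f) is 0x8005c, the irq handler, but the same word at 0x18 gives branch_target(0x18, 0xea00000f) = 0x5c. The machine code did not change; only its address did.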
So, setting aside that you never set a stack pointer, and the question of how your code even made it to the swi: if you built this
80008: e59ff074 ldr pc, [pc, #116] ; 80084 <print_c_mem4+0x4>
8000c: ea000013 b 80060 <print_mem1>
80010: ea000012 b 80060 <print_mem1>
80014: ea000012 b 80064 <print_mem2>
80018: ea000012 b 80068 <print_mem3>
8001c: ea000012 b 8006c <print_mem4>
80020: ea00000e b 80060 <print_mem1>
80024: ea00000e b 80064 <print_mem2>
80028: ea00000e b 80068 <print_mem3>
8002c: ea00000e b 8006c <print_mem4>
or something like it (your print_mems are real code, not just the placeholders I used to get this example to build for this answer, but they are still reached with pc-relative branches),
and you copied from 0x80008 for a while to 0x00000, then the instruction that ends up at address 0x00000008, which is the svc/swi vector, is
80010: ea000012 b 80060 <print_mem1>
A branch to print_mem1, but it isn't going to go anywhere near print_mem1. The branch is pc-relative, so from 0x00000008 it lands at 0x00000058, which is 0x80008 bytes away from the address you really wanted it to land on.
Now, having said ALL of that: if you search for HVBAR in the ARM documentation you will find that you don't have to do any of that copying. You can set up an exception table in memory and change the base address the processor goes to when an exception (other than reset) occurs. But notice that the lower 5 bits have to be zero, so 0x80008 will not work. So use .balign in your code, build the table there, use a label to get its address, and stick that in HVBAR. You can then use branch instead of ldr pc.
For armv6 and older, the copy (or building the table(s) in place) has to be done because, other than a processor strap for a high address, the vectors have to be at 0x00000000. For armv7 and a number of the cortex-ms you can instead point the processor at a table at some other address (until reset).
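The alignment rule is easy to check on the host. The helper names below are made up for this sketch; the only rule they encode is that the low 5 bits of the vector base must be zero, i.e. 32-byte alignment.

```c
#include <stdint.h>
#include <stdbool.h>

#define VECTOR_BASE_ALIGN 32u  /* low 5 bits of the base must be zero */

/* True if base is usable as a vector base address. */
bool vector_base_is_valid(uint32_t base)
{
    return (base & (VECTOR_BASE_ALIGN - 1u)) == 0u;
}

/* Next address at or above addr that satisfies the alignment. */
uint32_t vector_base_align_up(uint32_t addr)
{
    return (addr + VECTOR_BASE_ALIGN - 1u) & ~(VECTOR_BASE_ALIGN - 1u);
}

/* On the target, the write itself is a coprocessor access. My reading
 * of the ARMv7-A manual is "mcr p15, 4, r0, c12, c0, 0" for HVBAR;
 * treat that as an assumption and verify it against the manual. */
```

So 0x80008 fails the check, 0x80000 passes, and the next usable base above 0x80008 is 0x80020. In the assembly source, a .balign 32 in front of the table gets you the same thing.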
It's good to understand that copy trick I demonstrated, but you have to use it correctly for it to work. It's not an uncommon solution. Note that another way to have done that, and you can do it here too, is:
.globl _start
_start:
b 0x80000
b 0x80004
b 0x80008
b 0x8000C
when linked at 0x0000
00000000 <_start>:
0: ea01fffe b 80000 <_stack>
4: ea01fffe b 80004 <*ABS*0x80004>
8: ea01fffe b 80008 <*ABS*0x80008>
c: ea01fffe b 8000c <*ABS*0x8000c>
So the machine code ea01fffe means branch 0x80000 bytes ahead of the address of that instruction. Instead of the copy, you could just write that value to the first 8 words starting at 0x00000000, and the processor will branch to your table at 0x80000. If you want to build the table at 0x80008, then let the tools do the work for you:
.globl _start
_start:
b 0x80008
b 0x8000c
b 0x80010
b 0x80014
As expected, the immediate is a number of words: 0x8 bytes is two words, and adding 2 to 0x1fffe gives 0x20000.
00000000 <_start>:
0: ea020000 b 80008 <*ABS*0x80008>
4: ea020000 b 8000c <*ABS*0x8000c>
8: ea020000 b 80010 <*ABS*0x80010>
c: ea020000 b 80014 <*ABS*0x80014>
We also know that the pc is two instructions ahead, so when executing at address 0 the pc, when used this way, is 8. We want to go to 0x80008, which is 0x80000 ahead of the pc, and the immediate in the instruction is in units of words, so 0x20000 words ahead.
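If you would rather compute these words in C than read them off a disassembly, the calculation is short. encode_branch is a name I picked for this sketch, and it does no range checking: it assumes the target is within the reach of a 24-bit word offset.

```c
#include <stdint.h>

/* Encode an unconditional ARM-state "b" placed at address from that
 * jumps to address to. Assumes the target is in range; unchecked. */
uint32_t encode_branch(uint32_t from, uint32_t to)
{
    /* The offset is relative to pc = from + 8, counted in words. */
    uint32_t offset_words = (to - (from + 8u)) >> 2;
    return 0xEA000000u | (offset_words & 0x00FFFFFFu);
}
```

encode_branch(0, 0x80008) gives 0xEA020000 and encode_branch(0, 0x80000) gives 0xEA01FFFE, matching the disassembly above. Because each vector and its target move together in 4-byte steps, the same word works for every slot.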
So instead of the copy
ldr r0,=0xEA020000
ldr r1,=0x00000000
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
str r0,[r1],#4
or some other solution that will fill those 8 locations with a branch to the right place.
EDIT
Yet another approach starts from what the tools already generated for us:
Disassembly of section .text:
00080000 <_stack>:
80000: e59ff018 ldr pc, [pc, #24] ; 80020 <reset_handler>
80004: e59ff018 ldr pc, [pc, #24] ; 80024 <undefined_handler>
80008: e59ff018 ldr pc, [pc, #24] ; 80028 <swi_handler>
8000c: e59ff018 ldr pc, [pc, #24] ; 8002c <prefetch_handler>
80010: e59ff018 ldr pc, [pc, #24] ; 80030 <data_handler>
80014: e59ff018 ldr pc, [pc, #24] ; 80034 <unused_handler>
80018: e59ff018 ldr pc, [pc, #24] ; 80038 <irq_handler>
8001c: e59ff018 ldr pc, [pc, #24] ; 8003c <fiq_handler>
We can fill the first 8 words of memory with e59ff018 and fill in the addresses later, at some point before we need them. Before enabling interrupts, fill in 0x00000038 with the address of the handler; you can use C or ASM or whatever. You can also change the handler each time: put 0xe59ff018 in memory at 0x00000008 and the address of your swi handler at 0x00000028 before executing an svc/swi instruction, then change the handler at 0x00000028 and try again.
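In C, that run-time patching could look something like the sketch below. install_handler and the VEC_ names are hypothetical; base stands for wherever the vectors live (0x00000000 in this discussion), and I'm assuming caches are off. With caches enabled you would likely need cache maintenance before the new words are seen as instructions.

```c
#include <stdint.h>

#define LDR_PC_PC_24 0xE59FF018u  /* ldr pc, [pc, #24] */

/* Vector slots in table order; each slot is one word. */
enum { VEC_RESET, VEC_UNDEF, VEC_SWI, VEC_PREFETCH,
       VEC_DATA, VEC_UNUSED, VEC_IRQ, VEC_FIQ };

/* Point one vector at handler_addr. The address word sits 8 words
 * (0x20 bytes) after the instruction that loads it. */
void install_handler(volatile uint32_t *base, unsigned vec, uint32_t handler_addr)
{
    base[vec + 8u] = handler_addr;  /* address first, then the load */
    base[vec] = LDR_PC_PC_24;
}
```

For the swi vector this writes the address to word 10 (byte offset 0x28) and the instruction to word 2 (byte offset 0x08), the same two locations as above.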