The manual says
Trap Instruction
When a program issues the trap instruction, the processor generates a software trap exception. A program typically issues a software trap when the program requires servicing by the operating system. The general exception handler for the operating system determines the reason for the trap and responds appropriately.
However, when I previously asked about it, the answer said it is a software interrupt:
What does the assembly instruction trap do?
It also seems that the distinction between exceptions and interrupts differs slightly between architectures, so that there may be four combinations (?): hardware exception, software exception, hardware interrupt and software interrupt.
Now I'm studying the assembly below for a small system. I think I can learn the individual instructions myself, but I'm looking for help understanding the bigger picture: why is an event exactly a software exception and not a hardware exception, a hardware interrupt or a software interrupt?
# The label alt_main is defined in this file.
# There is a call to this label in the Altera-supplied startup code for Nios II.
# At label alt_main, interrupts and handlers are initialized; thereafter,
# the label main is called, starting the main program.
################################################################
#
# Definitions for important devices and addresses in this system.
#
# Uart_0 at 0x860
.equ de2_uart_0_base,0x860
# Timer_1 at 0x920, interrupt index 10 (mask 2^10 = 0x400)
.equ de2_timer_1_base,0x920
.equ de2_timer_1_intmask,0x400
# Timeout value for 0.1 ms tick-count interval (CHANGED in every version)
.equ de2_timer_1_timeout_value,4999
# Required tick count per time-slice, meaning
# the number of timer-interrupts before a thread-switch is performed
.equ oslab_ticks_per_timeslice,100
# Interrupt address at 0x800020
.equ de2_nios2_interrupt_address,0x800020
#
# End of device-address definitions
#
################################################################
################################################################
#
# Definition of variables for keeping system time etcetera.
#
.data
.align 2
.global oslab_internal_globaltime
oslab_internal_globaltime: .word 0
# Definition of variable for remembering the number of
# timer-interrupts since the last thread-switch
.data
.align 2
.global oslab_internal_tickcount
oslab_internal_tickcount: .word 0
# Definition of system (interrupt) stack, sp, and gp
.data
.align 2
oslab_internal_gp: .word 0
oslab_internal_sp: .word 0
oslab_system_stack: .fill 256,1,0
oslab_system_stacktop:
# Definition of the end-of-timeslice message.
oslab_internal_yield_message:
.asciz "\n#### Thread yielded after using %d tick%c."
#
# End of system-time variable definitions.
#
################################################################
################################################################
#
# Interrupt handling code.
#
# Stub for interrupt handler
.text
oslab_internal_stub:
movia et,oslab_exception_handler
jmp et
# The interrupt handler
oslab_exception_handler:
# Check source of exception, following the procedure
# described in the Nios II Processor Reference Handbook.
rdctl et,estatus # Check ESTATUS
andi et,et,1 # Test EPIE
beq et,r0,oslab_exception_was_not_an_interrupt
rdctl et,ipending # Check IPENDING
beq et,r0,oslab_exception_was_not_an_interrupt
# If control comes here, we have established that the
# exception was caused by an interrupt.
# Subtract 4 from ea, so that the interrupted instruction
# will be re-run when we return.
subi ea,ea,4
# Check the source of the interrupt.
# Possible source No. 1: Timer_1 (currently the only source).
rdctl et,ipending
andi et,et,de2_timer_1_intmask
bne et,r0,oslab_timer_1_interrupt
# If control comes here, we have an interrupt from an unknown source.
# This condition is IGNORED in this version of OSLAB.
eret
oslab_exception_was_not_an_interrupt:
# Test if the interrupted instruction was a TRAP
subi sp,sp,4 # PUSH r8 (instruction 1)
stw r8,0(sp) # PUSH r8 (instruction 2)
movia r8,0x003b683a # binary code for TRAP
ldw et,-4(ea) # Load interrupted instruction
cmpeq et,et,r8 # Compare to binary code for TRAP
# Result from comparison is now in et.
ldw r8,0(sp) # POP r8 (instruction 1)
addi sp,sp,4 # POP r8 (instruction 2)
# Use the comparison result in et as branch condition.
# The value in et will also be used later, to tell if the
# exception was a trap or an interrupt.
bne et,r0,oslab_trap_handler
# If control comes here, we have an exception which was not a TRAP.
# This should not normally happen.
# However, someone writing programs for the OSLAB micro-operating system
# could perhaps use unimplemented instructions. To catch unimplemented
# instructions, we insert a BREAK instruction here. This will stop execution
# unless the program is run through the debugger.
break 0
eret
oslab_timer_1_interrupt:
# Acknowledge the timer_1 interrupt.
movia et,de2_timer_1_base
stw r0,0(et)
# Save contents of R8, to get a free register for
# temporary values.
subi sp,sp,4
stw r8,0(sp) # PUSH r8
# Increase system clock.
movia r8,oslab_internal_globaltime
ldw et,0(r8)
addi et,et,1
stw et,0(r8)
# Increase tick counter.
movia r8,oslab_internal_tickcount
ldw et,0(r8)
addi et,et,1
stw et,0(r8)
# Restore original contents of R8.
ldw r8,0(sp) # POP r8
addi sp,sp,4
# Check value of tick counter,
# against the required number of ticks per time-slice.
# Note: oslab_ticks_per_timeslice is an assembler constant,
# and not a variable. Hence, no load/store-instructions here.
subi et,et,oslab_ticks_per_timeslice
# If the result from the subtraction is zero (or perhaps positive),
# then it is time to switch threads.
bge et,r0,oslab_time_to_switch
# If we fall-through here, then we have had one of those many
# timer interrupts on which we should not switch threads.
# Return to caller.
eret
oslab_time_to_switch:
# This code will now fall-through into the TRAP handler
# which performs a context switch.
#
# We will print out a message for each timer interrupt.
# To be able to tell that we had a timer interrupt, and not
# a TRAP, we set et to zero.
movi et,0
oslab_trap_handler:
# Save registers r1 through r23, plus fp, gp, ra and ea
.set noat # R1 is used here.
subi sp,sp,108 # Make room for all registers.
stw r1, 4(sp) # R1 is saved in slot 1, not slot 0.
stw r2, 8(sp)
stw r3,12(sp)
stw r4,16(sp)
stw r5,20(sp)
stw r6,24(sp)
stw r7,28(sp)
stw r8,32(sp)
stw r9,36(sp)
stw r10,40(sp)
stw r11,44(sp)
stw r12,48(sp)
stw r13,52(sp)
stw r14,56(sp)
stw r15,60(sp)
stw r16,64(sp)
stw r17,68(sp)
stw r18,72(sp)
stw r19,76(sp)
stw r20,80(sp)
stw r21,84(sp)
stw r22,88(sp)
stw r23,92(sp)
stw r26,96(sp)
stw r28,100(sp)
stw r31,104(sp)
stw ea,0(sp) # Special case, saved in slot 0.
mov r4,sp # Copy stack pointer to param1 register
movia sp,oslab_system_stacktop # Use system stack instead
# Test et to see if this was a timeout event or a TRAP.
beq et,r0,oslab_not_a_trap
# If this was a trap event, we fall through here.
# Our simplified printf is used to print a message,
# saying that the previous thread yielded parts of its time-slice.
################################################################
#
# The following code prints a nice message. Nothing more.
# This code saves and restores all registers it uses.
# You can safely ignore the following code, up to
# (but NOT including) the label oslab_not_a_trap.
#
subi sp,sp,4 # Contents of r4 must be preserved.
stw r4,0(sp) # PUSH r4.
movia r4,oslab_internal_yield_message
movia r5,oslab_internal_tickcount
ldw r5,0(r5)
movi r6,0 # Gold-plating: check if 1 tick or several ticks.
subi et,r5,1 # Do not print the s if only 1 tick.
beq et,r0,oslab_no_plural_ticks
movi r6,'s' # If 0 ticks, or 2 or more ticks, print the s.
oslab_no_plural_ticks:
call printf
ldw r4,0(sp) # POP r4
addi sp,sp,4
#
# This comment marks the end of the code for printing a nice message.
# Now comes other code, which is potentially much more interesting.
#
################################################################
# Move on to thread-switch code.
oslab_not_a_trap:
# Clear tick counter, since we are going to switch threads.
movia et,oslab_internal_tickcount
stw r0,0(et)
# Now it is time to execute the thread-switch code.
# We use the more general callr, rather than call.
movia et,oslab_internal_threadswitch
callr et # Call thread switch routine written in C
mov sp,r2 # Copy return value to stack pointer
# Yes, the system stack pointer is lost,
# but who cares? We will not need it any more.
# restore registers
ldw r1, 4(sp)
ldw r2, 8(sp)
ldw r3,12(sp)
ldw r4,16(sp)
ldw r5,20(sp)
ldw r6,24(sp)
ldw r7,28(sp)
ldw r8,32(sp)
ldw r9,36(sp)
ldw r10,40(sp)
ldw r11,44(sp)
ldw r12,48(sp)
ldw r13,52(sp)
ldw r14,56(sp)
ldw r15,60(sp)
ldw r16,64(sp)
ldw r17,68(sp)
ldw r18,72(sp)
ldw r19,76(sp)
ldw r20,80(sp)
ldw r21,84(sp)
ldw r22,88(sp)
ldw r23,92(sp)
ldw r26,96(sp)
ldw r28,100(sp)
ldw r31,104(sp)
ldw ea,0(sp) # Special case
addi sp,sp,108
eret # Return from exception
#
# End of exception handling code.
#
################################################################
################################################################
#
# Startup code.
#
# When the system is started, Altera-supplied code initializes the
# Nios II CPU and cache memories, and then calls alt_main.
#
.global alt_main
alt_main:
wrctl status,r0 # Disable interrupts. status is register 0
wrctl ienable,r0 # Clear all bits in IENABLE. ienable is internal interrupt-enable bits
# Now copy the stub.
movia r8,oslab_internal_stub
movia r9,de2_nios2_interrupt_address
ldw r10,0(r8)
stw r10,0(r9)
ldw r10,4(r8)
stw r10,4(r9)
ldw r10,8(r8)
stw r10,8(r9)
# Initialize timer_1.
movia r8,de2_timer_1_base
movia r9,de2_timer_1_timeout_value
srli r10,r9,16
stw r10,12(r8) # Write periodh
andi r10,r9,0xffff
stw r10,8(r8) # Write periodl
movi r10,7 # Continuous, interrupt on timeout, and start
stw r10,4(r8)
# Initialize CPU for interrupts from timer_1.
movi r10,de2_timer_1_intmask
wrctl ienable,r10
movi r10,1
wrctl status,r10
# Call to main. Do not jump, main is a subroutine,
# and may execute a ret instruction.
subi sp,sp,4
stw ra,0(sp) # PUSH r31
movia r8,main
callr r8
ldw ra,0(sp) # POP r31
addi sp,sp,4
# If main returns, we will return directly to the routine
# that called us (that called alt_main).
ret
#
# End of startup code.
#
################################################################
################################################################
#
# Helper functions for initialization and thread handling.
#
.text
.align 2
.global oslab_internal_get_gp
oslab_internal_get_gp:
mov r2,gp
ret
.global oslab_begin_critical_region
oslab_begin_critical_region:
wrctl status,r0
ret
.global oslab_end_critical_region
oslab_end_critical_region:
movi r8,1
wrctl status,r8
ret
.global oslab_get_internal_globaltime
oslab_get_internal_globaltime:
movia r2,oslab_internal_globaltime
ldw r2,0(r2)
ret
.global oslab_get_internal_tickcount
oslab_get_internal_tickcount:
movia r2,oslab_internal_tickcount
ldw r2,0(r2)
ret
.global oslab_yield
oslab_yield:
trap
ret
#
# End of helper functions.
#
################################################################
#
# ********************************************************
# *** You don't have to study the code below this line ***
# ********************************************************
#
################################################################
#
# A simplified printf() replacement.
# Implements the following conversions: %c, %d, %s and %x.
# No format-width specifications are allowed,
# for example "%08x" is not implemented.
# Up to four arguments are accepted, i.e. the format string
# and three more. Any extra arguments are silently ignored.
#
# The printf() replacement relies on routines
# out_char_uart_0, out_hex_uart_0,
# out_number_uart_0 and out_string_uart_0
# in file oslab_lowlevel_c.c
#
# We need the macros PUSH and POP - definitions follow.
# PUSH reg - push a single register on the stack
.macro PUSH reg
subi sp,sp,4 # reserve space on stack
stw \reg,0(sp) # store register
.endm
# POP reg - pop a single register from the stack
.macro POP reg
ldw \reg,0(sp) # fetch top of stack contents
addi sp,sp,4 # return previously reserved space
.endm
.text
.global printf
printf:
PUSH ra # PUSH return address register r31.
PUSH r16 # R16 will point into format string.
PUSH r17 # R17 will contain the argument number.
PUSH r18 # R18 will contain a copy of r5.
PUSH r19 # R19 will contain a copy of r6.
PUSH r20 # R20 will contain a copy of r7.
mov r16,r4 # Get format string argument
movi r17,0 # Clear argument number.
mov r18,r5 # Copy r5 to safe place.
mov r19,r6 # Copy r6 to safe place.
mov r20,r7 # Copy r7 to safe place.
asm_printf_loop:
ldb r4,0(r16) # Get a byte of format string.
addi r16,r16,1 # Point to next byte
# End of format string is marked by a zero-byte.
beq r4,r0,asm_printf_end
cmpeqi r9,r4,92 # Check for backslash escape.
bne r9,r0,asm_printf_backslash
cmpeqi r9,r4,'%' # Check for percent-sign escape.
bne r9,r0,asm_printf_percentsign
asm_printf_doprint:
# No escapes present, just print the character.
movia r8,out_char_uart_0
callr r8
br asm_printf_loop
asm_printf_backslash:
# Preload address to out_char_uart_0 into r8.
movia r8,out_char_uart_0
ldb r4,0(r16) # Get byte after backslash
addi r16,r16,1 # Increase byte count.
# Having a backslash at the end of the format string
# is illegal, but must not crash our printf code.
beq r4,r0,asm_printf_end
cmpeqi r9,r4,'n' # Newline
beq r9,r0,asm_printf_backslash_not_newline
movi r4,10 # Newline
callr r8
br asm_printf_loop
asm_printf_backslash_not_newline:
cmpeqi r9,r4,'r' # Return
beq r9,r0,asm_printf_backslash_not_return
movi r4,13 # Return
callr r8
br asm_printf_loop
asm_printf_backslash_not_return:
# Unknown character after backslash - ignore.
br asm_printf_loop
asm_printf_percentsign:
addi r17,r17,1 # Increase argument count.
cmpgei r8,r17,4 # Check against maximum argument count.
# If maximum argument count exceeded, print format string.
bne r8,r0,asm_printf_doprint
cmpeqi r9,r17,1 # Is argument number equal to 1?
beq r9,r0,asm_printf_not_r5 # beq jumps if cmpeqi false
mov r4,r18 # If yes, get argument from saved copy of r5.
br asm_printf_do_conversion
asm_printf_not_r5:
cmpeqi r9,r17,2 # Is argument number equal to 2?
beq r9,r0,asm_printf_not_r6 # beq jumps if cmpeqi false
mov r4,r19 # If yes, get argument from saved copy of r6.
br asm_printf_do_conversion
asm_printf_not_r6:
cmpeqi r9,r17,3 # Is argument number equal to 3?
beq r9,r0,asm_printf_not_r7 # beq jumps if cmpeqi false
mov r4,r20 # If yes, get argument from saved copy of r7.
br asm_printf_do_conversion
asm_printf_not_r7:
# This should not be possible.
# If this strange error happens, print format string.
br asm_printf_doprint
asm_printf_do_conversion:
ldb r8,0(r16) # Get byte after percent-sign.
addi r16,r16,1 # Increase byte count.
cmpeqi r9,r8,'x' # Check for %x (hexadecimal).
beq r9,r0,asm_printf_not_x
movia r8,out_hex_uart_0
callr r8
br asm_printf_loop
asm_printf_not_x:
cmpeqi r9,r8,'d' # Check for %d (decimal).
beq r9,r0,asm_printf_not_d
movia r8,out_number_uart_0
callr r8
br asm_printf_loop
asm_printf_not_d:
cmpeqi r9,r8,'c' # Check for %c (character).
beq r9,r0,asm_printf_not_c
# Print character argument.
br asm_printf_doprint
asm_printf_not_c:
cmpeqi r9,r8,'s' # Check for %s (string).
beq r9,r0,asm_printf_not_s
movia r8,out_string_uart_0
callr r8
br asm_printf_loop
asm_printf_not_s:
asm_printf_unknown:
# We do not know what to do with other formats.
# Print the format string text.
movi r4,'%'
movia r8,out_char_uart_0
callr r8
ldb r4,-1(r16)
br asm_printf_doprint
asm_printf_end:
POP r20
POP r19
POP r18
POP r17
POP r16
POP ra
ret
#
# End of simplified printf() replacement code.
#
################################################################
.end
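As a side note on the constants at the top of the listing, the arithmetic behind them can be checked with a few lines of Python. This is only a sketch: the constant values are copied from the listing, while the rule that the timer counts one cycle more than the stored period value, and therefore the clock frequency derived from it, are assumptions on my part.

```python
# Constants copied from the listing above.
TIMER_1_INTMASK = 1 << 10        # interrupt index 10, de2_timer_1_intmask
TIMEOUT_VALUE = 4999             # de2_timer_1_timeout_value
TICKS_PER_TIMESLICE = 100        # oslab_ticks_per_timeslice
TICK_MICROSECONDS = 100          # 0.1 ms, as stated in the listing's comment

# alt_main writes the timeout as two 16-bit halves (srli / andi).
period_h = TIMEOUT_VALUE >> 16
period_l = TIMEOUT_VALUE & 0xFFFF

# ASSUMPTION: the timer counts TIMEOUT_VALUE + 1 clock cycles per tick.
# Under that assumption, the clock frequency implied by a 0.1 ms tick is:
implied_clock_hz = (TIMEOUT_VALUE + 1) * 1_000_000 // TICK_MICROSECONDS

# Length of one time-slice: 100 ticks of 0.1 ms each.
timeslice_ms = TICKS_PER_TIMESLICE * TICK_MICROSECONDS // 1000
```

So, if the assumption holds, a thread is preempted after roughly 10 ms unless it yields earlier through the trap.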
Let's talk about "context" first; then traps are easy to understand.
When you write conventional assembly code, you end up filling the various registers with values of importance to that part of the code.
Subroutine calls are special instructions that explicitly designate the location of the subroutine to be called, often by giving its address directly. When code calls a subroutine, the call instruction itself saves the Program Counter (PC) somewhere well-known, so that the subroutine, on completion, can set the PC to the saved value and thus return control to the calling point.
It is often useful for the registers that were already filled at the point of the call to contain the same values after the subroutine completes. The subroutine itself likely damages some or most of them, and typically doesn't know which ones were important at the calling point. To preserve them, the programmer may insert instructions before the call to save those registers in a safe place, and instructions after the call to restore them. Sometimes even the condition codes or other mode bits controlling the processor state (on the x86, the "direction" flag is one of these) need to be saved and restored.
All of this information to be saved is the "context" of the computation at the point of the subroutine call.
Often the saved PC and other registers are stored on a pushdown stack, managed either by the machine or by mere (but useful) programming convention. On machines that don't have stacks, or where no such convention is widespread, they can simply be stored in locations chosen by the programmer. What matters is that space is set aside somewhere to store the "context", and that the context is restored from it.
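To make the idea of caller-managed context concrete, here is a toy model in Python. It is a sketch, not any real machine: the register names, the dictionary used as a register file and the list used as a stack are all invented for illustration.

```python
# Toy model: a register file as a dict and a software-managed stack as a list.
regs = {"r1": 10, "r2": 20, "pc": 100}
stack = []

def subroutine(regs):
    # The callee is free to clobber registers it uses as scratch space.
    regs["r1"] = -1
    regs["r2"] = -1

def call_preserving(regs, stack, target, saved=("r1", "r2")):
    # Caller-side save: push the registers the caller still needs.
    for name in saved:
        stack.append(regs[name])
    # The call itself saves the return address (the next instruction).
    stack.append(regs["pc"] + 4)
    target(regs)
    # Return: set the PC to the saved return address.
    regs["pc"] = stack.pop()
    # Caller-side restore, in reverse order of the pushes.
    for name in reversed(saved):
        regs[name] = stack.pop()

call_preserving(regs, stack, subroutine)
```

After the call, r1 and r2 hold their old values again even though the subroutine overwrote them, and the stack is back to where it started.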
Traps are now easy to understand. A trap instruction (or so-called "software interrupt") works just like a subroutine call, except that the target location is not coded directly into the instruction. Like a call instruction, it has an opcode and perhaps an operand field to differentiate traps. The programmer places such traps in the code to cause the trap routine (whatever it might be, usually an OS service function) to run. Like a subroutine call, the trap saves the PC somewhere safe. One could insist that the programmer decide which registers/context to save and restore around the trap. However, trap routines are often written by programmers with no relation to the programmer writing the trap call, they usually do complex tasks, and they are intended for a large audience, so convenience is key. Consequently this work is usually done by the trap routine, which saves all the registers, condition codes and mode bits ("the full context"), does its job (modifying the saved full context to show any result), restores the full context, and passes control back to the invoker; the user programmer conveniently doesn't have to do any of it. This is so useful and common that many processors provide special support for traps, which causes critical parts of the context to be saved automatically (the trap routine does the rest).
So, traps are easy to use: one simply codes them where one might have coded a subroutine call to a service, and their effect happens. (Usually the trap routine provider has to set up special vectors somewhere in the hardware, which the trap instruction uses automatically to find the trap routine. This is all done well before the user program starts.)
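The division of labour described above can be sketched in Python. Everything here is hypothetical: the vector table, the function names and the register names are made up for illustration and do not correspond to any particular processor.

```python
vectors = {}  # hypothetical vector table: trap number -> service routine

def install_vector(number, routine):
    # Done by the trap routine provider well before any user program runs.
    vectors[number] = routine

def trap(regs, number):
    # "Hardware" part: note the return address (the instruction after the trap).
    return_pc = regs["pc"] + 4
    # Trap-routine part: save the full context before touching anything.
    saved = dict(regs)
    saved["pc"] = return_pc
    # The service routine may clobber the live registers freely and
    # reports its result by editing the saved context.
    vectors[number](regs, saved)
    # Restore the full context and hand control back to the invoker.
    regs.clear()
    regs.update(saved)

def get_time_service(regs, saved):
    regs["r3"] = 0xDEAD   # scratch use; never visible to the caller
    saved["r2"] = 1234    # result, delivered via the saved context

install_vector(0, get_time_service)
user_regs = {"r2": 0, "r3": 7, "pc": 200}
trap(user_regs, 0)
```

From the user program's point of view, only r2 (the result) and the PC have changed; r3 is intact although the service routine used it as scratch space.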
A trap (note: not a trap instruction!) is an action which occurs because of a program condition. For instance, if the user program divides by zero, the CPU may force an implied trap call to a special routine that handles division by zero. (It may patch up the result, abort the program, or do whatever else is deemed useful.) Such a trap works just like a trap instruction; the vector it uses is usually determined by the specific trap condition. Note that traps occur synchronously with respect to the stream of instructions seen by the processor, as provided by the programmer. Notice also that for such traps one cannot conveniently arrange user-level context saves and restores, especially if the variety of trap conditions is wide. Thus for such traps, the context save/restore is virtually always done by the trap routine.
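A small interpreter makes the "synchronous" property visible. This is a sketch with an invented two-instruction machine; the handler policy (patching the result to zero) is just one of the options mentioned above.

```python
def patch_result(regs, pc, log):
    # One possible policy: record where the trap happened and patch the result.
    log.append(pc)
    regs["acc"] = 0

HANDLERS = {"div_by_zero": patch_result}  # vector chosen by the trap condition

def run(program, handlers=HANDLERS):
    regs = {"acc": 0}
    log = []
    for pc, (op, arg) in enumerate(program):
        if op == "load":
            regs["acc"] = arg
        elif op == "div":
            if arg == 0:
                # Condition detected while executing this very instruction:
                # the machine forces an implied trap call.
                handlers["div_by_zero"](regs, pc, log)
            else:
                regs["acc"] //= arg
    return regs, log

program = [("load", 8), ("div", 2), ("div", 0), ("load", 5)]
first = run(program)
second = run(program)
```

Running the same program twice traps at the same instruction both times, which is what "synchronous with respect to the instruction stream" means in practice.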
A (hardware) interrupt is an asynchronous event. It is implemented as a trap taken at an arbitrary place in the programmed instruction stream. By saving the full context, the trap routine for the interrupt (called the "interrupt routine") can handle the asynchronous condition (service the disk, etc.), restore the full context, and let the user code continue as though the interrupt had never occurred.
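Applied to the handler in the question, the same distinction shows up in the order of its checks. The following Python sketch mirrors that order as I read it from the listing (EPIE bit of estatus, then ipending, then the instruction word just before ea); the function name and the returned strings are my own labels, not Nios II terminology.

```python
TRAP_ENCODING = 0x003B683A   # the word the question's handler compares against
TIMER_1_MASK = 0x400         # de2_timer_1_intmask from the listing

def classify(estatus, ipending, instr_before_ea):
    # Asynchronous case: interrupts were enabled and at least one is pending.
    if (estatus & 1) and ipending != 0:
        if ipending & TIMER_1_MASK:
            return "hardware interrupt (timer_1)"
        return "hardware interrupt (unknown source)"
    # Synchronous case: the instruction that caused the exception is at ea - 4.
    if instr_before_ea == TRAP_ENCODING:
        return "software trap"
    # Anything else, e.g. an unimplemented instruction.
    return "other exception"
```

For example, `classify(1, 0x400, 0)` reports the timer interrupt, while `classify(1, 0, TRAP_ENCODING)` reports a software trap because nothing is pending and the instruction before `ea` is the trap itself.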
So, traps are fundamentally just subroutine calls that save and restore context. The questions are when they can occur, how much context is saved and restored, and who does the saving and restoring.