assembly arm reverse-engineering interpretation arm7

What are the best practices for interpreting an assembly program without knowing what it's supposed to do?


I'm studying for an assembly exam in which we have to interpret an assembly source file provided without comments or any other documentation, a task I find particularly difficult. In cases like this, which I assume come up quite often in a professional environment, are there clues to look for to determine the flow and purpose of the program? I can recognize loops from branch and jump instructions, but not much else, and I can't find any proper resources online. I'll post an example program for reference, the one I'm currently having a hard time with, in case anyone wants to point something out.

.data
  data_:  .byte  20, -40, -80
  pow_:  .alloc  804
  d:  .alloc  8

.global main

main:
 sub:  
   MOV  r0, #-100
   MOV  r2, #-1
   EOR  r1, r2, r0
   ADD  r1, r1, #1
   LDR  r2, =data_
   ADR  r9, d
   ADR  r8, pow_
   LDRB  r3, [r2]
   LDRB  r4, [r2, #1]
   LDRB  r5, [r2, #2]
 mpt:  
   MUL  r6, r0, r0
   MUL  r6, r3, r6
   MUL  r7, r0, r4
   ADD  r6, r5, r6
   ADD  r6, r7, r6
   STR  r6, [r8]
   CMP  r0, #-100
   BGT  aft 
 r:  
   STR  r6, [r9, #4]
   STR  r0, [r9]
 cyc:
   ADD  r8, r8, #4
   ADD  r0, r0, #1
   CMP  r0, r1
   BLE  mpt 
   MOV  r15, r14
 aft: 
   LDR  r2, [r9, #4]
   CMP  r6, r2
   BLT  r
   B  cyc 

This code is ARM7. The .alloc instruction isn't a real instruction; it's just there to signify an allocation of n bits under the given alias. From my understanding, the program performs a loop of some sort while keeping an iteration counter, but I can't even work out why there is an EOR there, which turns #-100 and #-1 into #99.

Any kind of suggestion is welcome.


Solution

  • You can translate each assembly instruction into a high-level statement, then reduce those statements to a simpler form. The following example translates your mpt block and should give you the basic idea.

     mpt:  
       MUL  r6, r0, r0
       MUL  r6, r3, r6
       MUL  r7, r0, r4
       ADD  r6, r5, r6
       ADD  r6, r7, r6
       STR  r6, [r8]
       CMP  r0, #-100
       BGT  aft 
    

    First, translate line by line.

    r6 = r0 * r0
    r6 = r6 * r3
    r7 = r0 * r4
    r6 = r6 + r5
    r6 = r6 + r7
    *r8 = r6
    if (r0 > -100) goto aft
    

    Then simplify by substituting the intermediate values of r6 into a single expression.

    r7 = r0 * r4
    r6 = r0 * r0 * r3 + r5 + r7
    *r8 = r6
    if (r0 > -100) goto aft
    

    You'll get some readable code by applying the same procedure to the other parts of your code.
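
Applying the same two steps to the remaining labels gives something like the C sketch below. It is only one plausible reading of the listing, not a definitive decompilation. The names `tabulate`, `struct minimum`, `table`, and the `X_MIN`/`X_MAX`/`N_POINTS` macros are made up for illustration. It assumes LDRB zero-extends each byte as it does on real ARM, so the -40 and -80 in data_ are read as 216 and 176, and it assumes pow_ and d hold 32-bit words.

```c
#include <assert.h>
#include <stdint.h>

#define X_MIN    (-100)
#define X_MAX    100
#define N_POINTS (X_MAX - X_MIN + 1)  /* 201 words of 4 bytes = 804, the number given to pow_ */

/* Hypothetical struct standing in for the two words stored at label d. */
struct minimum {
    int32_t x;      /* STR r0, [r9]     */
    int32_t value;  /* STR r6, [r9, #4] */
};

struct minimum tabulate(const int8_t data[3], int32_t table[N_POINTS])
{
    int32_t x = X_MIN;             /* r0: MOV r0, #-100 */
    /* r1: EOR with -1 flips every bit (99), ADD #1 completes the
       two's-complement negation, so the upper bound is -(-100) = 100. */
    int32_t last = (x ^ -1) + 1;
    assert(last == X_MAX);

    /* r3, r4, r5: LDRB loads unsigned bytes (assumption stated above). */
    int32_t a = (uint8_t)data[0];
    int32_t b = (uint8_t)data[1];
    int32_t c = (uint8_t)data[2];

    struct minimum best = {0, 0};
    int32_t *out = table;          /* r8: walks through pow_ */

    /* The assembly tests the bound at the bottom (BLE mpt); a for loop is
       equivalent here because the first iteration always runs. */
    for (; x <= last; x++, out++) {
        int32_t y = a * x * x + b * x + c;   /* mpt: the simplified r6 */
        *out = y;                            /* STR r6, [r8] */

        /* First iteration falls through into r; later ones go to aft
           and branch back to r only when y is smaller than d[1]. */
        if (x <= X_MIN || y < best.value) {
            best.x = x;
            best.value = y;
        }
    }
    return best;
}
```

Read this way, the routine tabulates a quadratic for every x from -100 to 100 and records the smallest value together with the x that produced it. If your course's simulator treats the bytes as signed, drop the `(uint8_t)` casts.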