Okay, so I understand what mov means, I understand what the registers are, and I understand what the operation commands are. I even understand that the leftmost hexadecimal byte is the instruction's number. For example, on line 7, the hexadecimal 7f is the instruction jg. FINE.
What I don't get is HOW EXACTLY these facts add up, and it's incredibly frustrating.
What I know so far:
Like for example, on line 1, does 0d get added to the address 804839c? No, it jumps to silly+0x17 because 0d is relative to the instruction AFTER line 1. If you add 0d to the address 804839e, you get 80483ab. GOOD.
Does this mean that all instructions for the next line are relative to the second two-digit hexadecimal?
Does that mean the leftmost hexadecimal is the current line's instruction?
I just need a little more direction; I am so close to figuring this out that I can almost taste it.
1 804839c: 7e 0d jle 80483ab <silly+0x17>
2 804839e: 89 d0 mov %edx,%eax
3 80483a0: d1 f8 sar %eax
4 80483a2: 29 c2 sub %eax,%edx
5 80483a4: 8d 14 52 lea (%edx,%edx,2),%edx
6 80483a7: 85 d2 test %edx,%edx
7 80483a9: 7f f3 jg 804839e <silly+0xa>
8 80483ab: 89 d0 mov %edx,%eax
If you are confused about the opcode you are a long way from understanding this. You need to start with documentation on the instruction set. For x86 this is plentiful; it's not great documentation, but still the opcodes are pretty clear. With instruction sets like this, it's not hard to find a web page with a chart of opcodes and then you click on that to find the rest of the instruction definition.
It is fairly typical for the relative address to be based on the byte after the instruction. If you were working on a team building a brand new processor, you would just go down to one of the chip folks' cubes and ask (since it wouldn't be well documented yet), but since this is an old design, there are tools available that will simply give you the answer without asking anyone.
Try this:
a0: jle a0
a1: jle a1
a2: jle a2
a3: jle a3
a4: jle a4
b0: jle b1
b1: jle b2
b2: jle b3
b3: jle b4
b4: jle b5
b5: nop
c0: jle c0
c1: jle c0
c2: jle c0
c3: jle c0
c4: jle c0
d0: jle d4
d1: jle d4
d2: jle d4
d3: jle d4
d4: jle d4
Assemble and disassemble:
0000000000000000 <a0>:
0: 7e fe jle 0 <a0>
0000000000000002 <a1>:
2: 7e fe jle 2 <a1>
0000000000000004 <a2>:
4: 7e fe jle 4 <a2>
0000000000000006 <a3>:
6: 7e fe jle 6 <a3>
0000000000000008 <a4>:
8: 7e fe jle 8 <a4>
000000000000000a <b0>:
a: 7e 00 jle c <b1>
000000000000000c <b1>:
c: 7e 00 jle e <b2>
000000000000000e <b2>:
e: 7e 00 jle 10 <b3>
0000000000000010 <b3>:
10: 7e 00 jle 12 <b4>
0000000000000012 <b4>:
12: 7e 00 jle 14 <b5>
0000000000000014 <b5>:
14: 90 nop
0000000000000015 <c0>:
15: 7e fe jle 15 <c0>
0000000000000017 <c1>:
17: 7e fc jle 15 <c0>
0000000000000019 <c2>:
19: 7e fa jle 15 <c0>
000000000000001b <c3>:
1b: 7e f8 jle 15 <c0>
000000000000001d <c4>:
1d: 7e f6 jle 15 <c0>
000000000000001f <d0>:
1f: 7e 06 jle 27 <d4>
0000000000000021 <d1>:
21: 7e 04 jle 27 <d4>
0000000000000023 <d2>:
23: 7e 02 jle 27 <d4>
0000000000000025 <d3>:
25: 7e 00 jle 27 <d4>
0000000000000027 <d4>:
27: 7e fe jle 27 <d4>
Without having to look at the documentation, it looks pretty clear that 0x7E is an opcode and the byte after it is a PC-relative offset. The 0xFE on the first items (a branch to itself) implies that it is a signed offset relative to the byte after the instruction: 0xFE is -2 as a signed byte, and the instruction is 2 bytes long. The remaining experiments confirm that.
This doesn't mean you should assume that all jump/branch instructions in this instruction set work this way; you can do similar experiments with tools that are known to produce working code.
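As a rough sanity check, the rule inferred from the experiment can be written down and compared against a few rows of the disassembly above. This is only a sketch: the name rel8_target is made up for illustration, and it assumes the two-byte form seen here (one opcode byte followed by one signed offset byte).

```python
def rel8_target(addr, encoding):
    """Branch target of a two-byte short branch located at addr.

    Assumes the layout observed in the experiment: one opcode byte
    followed by one signed offset byte, relative to the next instruction.
    """
    _opcode, offset = encoding       # e.g. bytes.fromhex("7efe") -> 0x7e, 0xfe
    if offset >= 0x80:               # reinterpret the byte as signed
        offset -= 0x100
    return addr + len(encoding) + offset


# (address, encoded bytes, target printed by the disassembler)
experiment = [
    (0x00, "7efe", 0x00),   # a0: branch to self
    (0x0a, "7e00", 0x0c),   # b0: branch to the next instruction
    (0x17, "7efc", 0x15),   # c1: backward branch
    (0x1d, "7ef6", 0x15),   # c4: backward branch
    (0x1f, "7e06", 0x27),   # d0: forward branch
    (0x27, "7efe", 0x27),   # d4: branch to self
]

for addr, hexbytes, expected in experiment:
    assert rel8_target(addr, bytes.fromhex(hexbytes)) == expected
```

If any row failed the assertion, the hypothesis (signed byte, relative to the following instruction) would be wrong for that encoding.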
This is one area where processor documentation is often lacking, and you usually need to 1) talk to the silicon engineers if you can, 2) look at the chip design (source code), 3) read the documentation, 4) experiment with existing tools, or 5) experiment with the hardware.
Most folks don't have access to 1 and 2. Options 3 and 4 are often available if you actually have one of these processors, and to get to 5 you usually have 3 and probably, but not always, 4. But again, the documentation often leaves the base of the relative address unstated. Usually it is the byte after the instruction, but on ARM, for example, it is a fixed offset from the address of the instruction, which maintains the illusion of a specific pipeline.
804839c: 7e 0d jle 80483ab <silly+0x17>
804839c is the address of the jle instruction, yes. 80483ab is the address it will branch to if the condition is met. 0xab - 0x9c = 0xf = 0xd + 2, where 2 is the size of the instruction and 0xd is the offset/immediate in the instruction.
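The same arithmetic can be run in the other direction, the way an assembler would: given the address of the branch and the address it should reach, work out the offset byte. A minimal sketch, assuming a 2-byte instruction with a one-byte signed offset; rel8_offset is a hypothetical helper name, not part of any tool.

```python
def rel8_offset(addr, target, size=2):
    """Offset byte for a short branch at addr that should reach target.

    size is the length of the branch instruction (2 for opcode + offset).
    """
    disp = target - (addr + size)    # distance from the next instruction
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a one-byte offset")
    return disp & 0xFF               # two's-complement byte


print(hex(rel8_offset(0x804839c, 0x80483ab)))  # 0xd,  the jle on line 1
print(hex(rel8_offset(0x80483a9, 0x804839e)))  # 0xf3, the jg on line 7
```

Both results match the second byte shown in the listing, including the backward jg, whose offset 0xf3 is -13 as a signed byte.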
I would assume the other conditional branches of this form (notice the jg later in your code) are an opcode byte and a signed offset byte. But you should always check before making your own assembler or disassembler or simulator. Start with the docs, and confirm with any tools you can find that are known to work for that platform.
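One way to do that confirmation in bulk is to feed disassembler output lines to a small checker and see whether every two-byte branch agrees with the rule. The sketch below is an assumption-laden illustration: the regular expression and the name check_short_branch are invented for this example, and real objdump output (tabs, prefixes, longer encodings) may need a looser pattern.

```python
import re

# Matches lines such as "804839c: 7e 0d jle 80483ab <silly+0x17>".
# Hypothetical pattern for this sketch only.
LINE = re.compile(
    r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s+)+)(j\w+)\s+([0-9a-f]+)"
)


def check_short_branch(line):
    """True/False if a 2-byte branch line obeys the rule, None otherwise."""
    m = LINE.match(line)
    if not m:
        return None                      # not a branch line we recognise
    addr = int(m.group(1), 16)
    raw = bytes(int(b, 16) for b in m.group(2).split())
    target = int(m.group(4), 16)
    if len(raw) != 2:
        return None                      # only the opcode + rel8 form is handled
    offset = raw[1] - 0x100 if raw[1] >= 0x80 else raw[1]
    return addr + len(raw) + offset == target


listing = [
    "804839c: 7e 0d jle 80483ab <silly+0x17>",
    "804839e: 89 d0 mov %edx,%eax",
    "80483a9: 7f f3 jg 804839e <silly+0xa>",
]
for line in listing:
    print(check_short_branch(line), line)
```

Here the jle and jg lines should both report True and the mov line None, which supports the guess that jg uses the same opcode-plus-signed-byte form.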