I am trying to add custom instructions in X86_64 (amd64) ISA. I will create a binary using these instructions and then I'll run it in gem5 simulator. To find the list of unused opcodes, I am referring to http://ref.x86asm.net/coder64.html. However, the website only lists the single byte unused opcodes. What I am looking for is the decoder grammar of amd64 CPUs so that I can find what sequences of binary would be invalid. I tried to test out some random binary sequences to see if they are invalid and I was able to find some (for example 0xc5000c
is an invalid 3 byte instruction). However, I would like to have the grammar which amd64 CPUs use to decode the instructions to derive even longer invalid binary sequences. Are there any resources available for this task?
Currently, to find binary sequences, I am using a C program which looks like:
int main()
{
asm volatile(".byte 0x06" ::: "memory");
asm volatile(".byte 0x07" ::: "memory");
asm volatile(".byte 0x0E" ::: "memory");
asm volatile(".byte 0x16" ::: "memory");
asm volatile(".byte 0x17" ::: "memory");
asm volatile(".byte 0x1E" ::: "memory");
asm volatile(".byte 0x1F" ::: "memory");
asm volatile(".byte 0x27" ::: "memory");
asm volatile(".byte 0x2F" ::: "memory");
asm volatile(".byte 0x37" ::: "memory");
asm volatile(".byte 0x3F" ::: "memory");
asm volatile(".byte 0x60" ::: "memory");
asm volatile(".byte 0x61" ::: "memory");
asm volatile(".byte 0x62" ::: "memory");
asm volatile(".byte 0x82" ::: "memory");
asm volatile(".byte 0x9A" ::: "memory");
asm volatile(".byte 0xC4" ::: "memory");
asm volatile(".byte 0xC5" ::: "memory");
asm volatile(".byte 0xD4" ::: "memory");
asm volatile(".byte 0xD5" ::: "memory");
asm volatile(".byte 0xD6" ::: "memory");
asm volatile(".byte 0xEA" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x00" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x01" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x02" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x03" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x04" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x05" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x06" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x07" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x08" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x09" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0A" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0B" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0C" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0D" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0E" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x0F" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x10" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x11" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x13" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x17" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x18" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x19" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1A" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1B" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1C" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1D" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1E" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x1F" ::: "memory");
asm volatile(".byte 0xC5, 0x00, 0x01" ::: "memory");
return 0;
}
To see if the instruction is invalid, I compile the program to an executable and then dump its binary with objdump
using the following command:
gcc main.c && objdump -D -j .text a.out
. When I run this with the C program I get:
...
10ed: c6 05 1c 2f 00 00 01 movb $0x1,0x2f1c(%rip) # 4010 <__TMC_END__>
10f4: 5d pop %rbp
10f5: c3 ret
10f6: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
10fd: 00 00 00
1100: c3 ret
1101: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
1108: 00 00 00 00
110c: 0f 1f 40 00 nopl 0x0(%rax)
0000000000001110 <frame_dummy>:
1110: f3 0f 1e fa endbr64
1114: e9 67 ff ff ff jmp 1080 <register_tm_clones>
0000000000001119 <main>:
1119: 55 push %rbp
111a: 48 89 e5 mov %rsp,%rbp
111d: 06 (bad)
111e: 07 (bad)
111f: 0e (bad)
1120: 16 (bad)
1121: 17 (bad)
1122: 1e (bad)
1123: 1f (bad)
1124: 27 (bad)
1125: 2f (bad)
1126: 37 (bad)
1127: 3f (bad)
1128: 60 (bad)
1129: 61 (bad)
112a: 62 82 (bad)
112c: 9a (bad)
112d: c4 (bad)
112e: c5 d4 d5 (bad)
1131: d6 (bad)
1132: ea (bad)
1133: c5 00 00 (bad)
1136: c5 00 01 (bad)
1139: c5 00 02 (bad)
113c: c5 00 03 (bad)
113f: c5 00 04 (bad)
1142: c5 00 05 (bad)
1145: c5 00 06 (bad)
1148: c5 00 07 (bad)
114b: c5 00 08 (bad)
114e: c5 00 09 (bad)
1151: c5 00 0a (bad)
1154: c5 00 0b (bad)
1157: c5 00 0c (bad)
115a: c5 00 0d (bad)
115d: c5 00 0e (bad)
1160: c5 00 0f (bad)
1163: c5 00 10 (bad)
1166: c5 00 11 (bad)
1169: c5 00 13 (bad)
116c: c5 00 17 (bad)
116f: c5 00 18 (bad)
1172: c5 00 19 (bad)
1175: c5 00 1a (bad)
1178: c5 00 1b (bad)
117b: c5 00 1c (bad)
117e: c5 00 1d (bad)
1181: c5 00 1e (bad)
1184: c5 00 1f (bad)
1187: c5 00 01 (bad)
118a: b8 00 00 00 00 mov $0x0,%eax
118f: 5d pop %rbp
1190: c3 ret
What I am looking for is a faster way to find such invalid binary sequences which would preferably not involve the brute force approach that I am using.
There are two "official undefined" instructions, UD1
(0F B9) and UD2
(0F 0B). These exist for your specific purpose.
You seem to assume that every x86-64 CPU decodes instructions in the same way. This is certainly not the case. We know Intel and AMD have differences. You can't even assume Intel or AMD is consistent with itself. And with Alder Lake, you can't even assume that a CPU is consistent with itself. We know the P-cores of Alder Lake have AVX-512 logic, but the E-cores don't. Presumably only the P-core decoders understand the EVEX
encodings.
What you're doing now with objdump
just tells how objdump
interprets opcodes, which may not match what AMD and/or Intel are doing.