[SOLVED] How to determine if a word (4 bytes) is a 16-bit instruction or a 32-bit instruction?

How do I know whether the bytes in a word represent a 16-bit instruction or a 32-bit instruction?
I referred to the ARMv7-M Architecture Reference Manual (ARM ARM), and it is not clear to me how to tell whether something is a 16-bit instruction or a 32-bit instruction.
It says:
If bits [15:11] of the halfword being decoded take any of the following values, the halfword is the first halfword of a 32-bit instruction: • 0b11101 • 0b11110 • 0b11111. Otherwise, the halfword is a 16-bit instruction

Does it mean that the processor always fetches halfwords, examines them, and decides whether each one is a 16-bit or a 32-bit instruction?
What does "the first halfword" mean? Bits [31:16] or bits [15:0] of a word?

If I have 32 bits, can I tell whether they hold a 32-bit instruction or a 16-bit instruction?

Thanks.

Solution

In Thumb, "32-bit" instructions are still composed of two separate halfwords, so the "first halfword" is the first halfword of the encoding, which says nothing about the layout in memory. Thumb instructions are halfword-aligned, so any given word of memory could hold two 16-bit instructions, a 16-bit instruction and one half of a 32-bit instruction, two halves of two different 32-bit instructions, or one whole 32-bit instruction.

Conceptually, the processor decodes one halfword at a time, so if it sees one of the above bit patterns, it knows it must also decode the next halfword before it can execute the instruction. Reality complicates this somewhat, since the Cortex-M3/M4 only ever fetch whole 32-bit words from memory, so the relationship between the number of "instruction fetches" and the number of instructions decoded and executed depends on the code itself. Just imagine that those fetches refill a 4-byte buffer that the pipeline slurps individual halfwords out of (which may not be all that far off the truth, for all I know).

So, if you have a halfword containing one of those values in its top bits, then you know it's the first half of a 32-bit encoding, and you need to interpret it in conjunction with the next halfword. Conversely, if you have a halfword with any other value in its top bits, then it's either a 16-bit encoding, or the second half of a 32-bit encoding, depending on what the previous halfword was.
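As a rough illustration of that rule, here is a minimal Python sketch. The function names `is_first_half_of_32bit` and `split_instructions` are made up for this example, and it assumes the halfwords have already been read from memory and that the first one is the start of an instruction:

```python
def is_first_half_of_32bit(halfword: int) -> bool:
    """True if bits [15:11] are 0b11101, 0b11110 or 0b11111."""
    top_bits = (halfword >> 11) & 0x1F
    return top_bits in (0b11101, 0b11110, 0b11111)


def split_instructions(halfwords):
    """Group a stream of halfwords into (size_in_bits, halfwords) tuples.

    Assumes halfwords[0] is the start of an instruction; if it is not,
    every boundary found after it may be wrong.
    """
    result = []
    i = 0
    while i < len(halfwords):
        hw = halfwords[i]
        if is_first_half_of_32bit(hw):
            if i + 1 >= len(halfwords):
                raise ValueError("stream ends in the middle of a 32-bit encoding")
            # The second halfword is consumed without looking at its top bits.
            result.append((32, (hw, halfwords[i + 1])))
            i += 2
        else:
            result.append((16, (hw,)))
            i += 1
    return result


# 0xBF00 and 0x4770 are 16-bit encodings; 0xF000 0xF800 is one 32-bit encoding.
for size, encoding in split_instructions([0xBF00, 0xF000, 0xF800, 0x4770]):
    print(size, [hex(h) for h in encoding])
```

Note that 0xF800 on its own also has 0b11111 in its top bits, which is why a halfword can only be classified reliably when you know where the previous instruction ended.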

Note that instruction fetches are always little-endian and the first halfword sits at the lower address, so the actual in-memory layout of a 32-bit encoding looks like this, where address A is an even number:

          --------------------------------
address A | bits 7:0 of first halfword   |
          --------------------------------
      A+1 | bits 15:8 of first halfword  |
          --------------------------------
      A+2 | bits 7:0 of second halfword  |
          --------------------------------
      A+3 | bits 15:8 of second halfword |
          --------------------------------
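
To tie the layout above back to the "bits [31:16] or bits [15:0]" question, here is a small Python sketch. The helper name `halfwords_from_bytes` and the sample bytes are assumptions for illustration only. It reads the bytes as little-endian halfwords, then reads the same four bytes as one little-endian word, for the case where the 32-bit encoding happens to start at a word-aligned address:

```python
import struct


def halfwords_from_bytes(code: bytes) -> list:
    """Interpret raw bytes as consecutive little-endian 16-bit halfwords."""
    if len(code) % 2 != 0:
        raise ValueError("expected a whole number of halfwords")
    count = len(code) // 2
    return list(struct.unpack("<%dH" % count, code))


# Bytes at addresses A, A+1, A+2, A+3 (sample values, chosen for illustration).
memory = bytes([0x00, 0xF0, 0x00, 0xF8])

first, second = halfwords_from_bytes(memory)
print(hex(first), hex(second))

# The same four bytes loaded as a single little-endian 32-bit word.
word = int.from_bytes(memory, "little")
print(hex(word & 0xFFFF), hex(word >> 16))
```

Both print statements show the same pair of values: when the encoding is word-aligned and the word is loaded little-endian, the first halfword ends up in bits [15:0] of the word and the second in bits [31:16]. If the encoding starts at A+2 instead, its two halfwords straddle two different words.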