For instance, the value 0x123 is stored to a register. What do bits [7:3] mean in that value? Are they talking about the binary value of 0x123?
The value 0x123 is 123 in base 16, which is 291 in base 10, which is 000100100011 in base 2.
The most sensible way to number bits is to give the LSB (Least Significant Bit) bit position number 0. The next bit to the left gets 1, and so on. This way each bit can contribute 2^N to the value of the number, where N is its bit position number. If the bit is 1 it contributes that value; otherwise it contributes nothing.
Base 10 works the same way: a number like 405 decomposes as 4×10^2 + 0×10^1 + 5×10^0.
To be clear, in the old days some computers numbered bits in the other direction, which worked all right when only one size of item was considered. Modern computers work with bytes, shorts, words, etc., so keeping the LSB as bit position number 0 regardless of data size makes the most sense.
  9876543210 bit position # (decimal numbers)
000100100011 binary digits
So this number is 2^8 + 2^5 + 2^1 + 2^0, which is 256 + 32 + 2 + 1 = 291 in base 10.
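As a quick check of that arithmetic, here is a minimal Python sketch (Python is used only for illustration; the names `value`, `set_bits` and `total` are made up for this example). It lists the bit positions that hold a 1 and sums 2^N over them.

```python
value = 0x123

# Collect the positions N whose bit is 1, with the LSB as position 0.
set_bits = [n for n in range(value.bit_length()) if (value >> n) & 1]

# Each set bit contributes 2**N to the value.
total = sum(2**n for n in set_bits)

print(set_bits)  # [0, 1, 5, 8]
print(total)     # 291
```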
Bits [7:3] are the *'ed ones:
    *****
  9876543210 bit position # (decimal numbers)
000100100011 binary digits
    *****
We might write that bits [7:3] of that number are 00100.
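To see the same thing without any bit operations yet, the sketch below (Python, illustrative only; `digits` and `field` are names chosen for this example) prints the value as 12 binary digits and slices out the characters that sit at bit positions 7 down to 3.

```python
value = 0x123

# 12 binary digits, zero-padded; the leftmost character is bit 11.
digits = format(value, "012b")

# Bit N sits at string index 11 - N, so bits [7:3] are indices 4 through 8.
field = digits[11 - 7 : 11 - 3 + 1]

print(digits)  # 000100100011
print(field)   # 00100
```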
Let's say we have a 10-bit binary number, where we represent each digit with a letter. So we have:
9876543210 bit position # (decimal numbers)
abcdefghij binary number represented by 10 variables (each is one bit)
0011110000 mask in your example (0xF0)
----------& and operation
00cdef0000 result after and
---------->>4 shift operation
000000cdef result after shift right by 4
This number, 000000cdef, will be a number between 0 and 15 (decimal).
That sequence has "extracted" the 4-bit field as an unsigned number.
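The mask-then-shift sequence can be sketched in Python as follows. The sample value is an assumption picked for illustration, with a=1 b=0 c=1 d=1 e=0 f=1 g=0 h=1 i=1 j=0, so the field cdef is 1101.

```python
value = 0b1011010110   # 10-bit sample value (made up for this example)

masked = value & 0xF0  # keep only bits [7:4]: 00cdef0000
field = masked >> 4    # move them down to bits [3:0]: 000000cdef

print(format(masked, "010b"))  # 0011010000
print(field)                   # 13, i.e. cdef = 1101
```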
Remember also that in some cases the 4-bit field [7:4] may not be the leftmost field: if the value were 16 bits wide, there would be 8 bits above bit 7. The mask of 0xF0 removes those upper 8 bits as well as clearing the lower 4 bits. It turns out that clearing the lower 4 bits isn't necessary here, since the shift discards them on its own.
If the field you're interested in is leftmost or rightmost, fewer operations are necessary to extract it.
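A small sketch of those two shortcuts, again in Python and with a made-up 10-bit sample value. It assumes the value really has no bits set above bit 9; Python integers are unbounded, so nothing enforces that width here.

```python
value = 0b1011010110  # 10-bit sample value, assumed to have no bits above bit 9

# Leftmost field [9:6]: the shift alone discards everything below it,
# and there is nothing above it to mask off.
leftmost = value >> 6

# Rightmost field [3:0]: the mask alone is enough, no shift needed.
rightmost = value & 0xF

print(format(leftmost, "04b"))   # 1011
print(format(rightmost, "04b"))  # 0110
```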
There are other sequences that can do the same extraction. For one, we can shift first, then mask:
9876543210 bit position # (decimal numbers)
abcdefghij binary number represented by 10 variables (each is one bit)
---------->>4 shift operation
0000abcdef result after shift right by 4
0000001111 mask (0xF: the ones need to move over compared to 0xF0)
----------& mask operation
000000cdef result after mask
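The shift-then-mask order generalizes nicely to any field [hi:lo]. Below is a minimal Python sketch; `extract_field` is a hypothetical helper written for this answer, not a standard library function, and it treats the field as unsigned.

```python
def extract_field(value, hi, lo):
    """Return bits [hi:lo] of value as an unsigned integer (LSB is bit 0)."""
    width = hi - lo + 1
    mask = (1 << width) - 1      # 'width' ones in the low bits, e.g. 0xF for 4
    return (value >> lo) & mask  # shift first, then mask

print(extract_field(0b1011010110, 7, 4))  # 13
print(extract_field(0x123, 7, 3))         # 4, i.e. 00100
```

Because the mask is built from the field width, the same function covers both the [7:4] example and the [7:3] field from the question.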