python numpy matlab bit-manipulation data-conversion

How to unpack a buffer of 12-bit values into an array of normalized float32 in MATLAB

A measurement system (in our lab) produces data of 12 bits per sample in a packed format, i.e. 2 samples of 12 bits each are packed into 3 bytes:

   buf[l + 2]  |   buf[l + 1]  |   buf[l + 0]  
7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0
-----------------------|-----------------------
B A 9 8 7 6 5 4 3 2 1 0|B A 9 8 7 6 5 4 3 2 1 0
    sample[2*i + 1]    |    sample[2*i + 0]

For NumPy I wrote the following unpacking function, which takes a Python byte buffer, applies some stride tricks and bit manipulations to it, and returns the desired float32 array:

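def unpack_s12p_to_f32(buf):
    import numpy
    import numpy.lib.stride_tricks as npst
    # View the buffer as int32 words; len(buf) must be a multiple of 4.
    s12p = numpy.frombuffer(buf, dtype=numpy.int32)
    # Overlapping int32 reads at a 3-byte stride: one word per sample pair,
    # duplicated into two columns (stride 0) after the transpose.
    # Note: the final word starts 3 bytes before the end, so its top byte
    # lies past the buffer; the masks below discard that byte.
    s12p_sv = numpy.copy( numpy.transpose(
        npst.as_strided(s12p,
            shape=(2, int((s12p.size*4)/3)),
            strides=(0,3), writeable=False) ))
    m12b = (1<<12)-1
    # Column 0: low 12 bits, moved to the top of the word so bit 11 becomes the sign bit.
    s12p_sv[:,0] &= m12b
    s12p_sv[:,0] <<= 20
    # Column 1: the next 12 bits, treated the same way.
    s12p_sv[:,1] >>= 12
    s12p_sv[:,1] &= m12b
    s12p_sv[:,1] <<= 20
    # Interleave the two columns and scale to the range [-1, 1).
    return s12p_sv.reshape(-1).astype(numpy.float32) * (2.**-31)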

We now want to replicate this function in MATLAB. However, I was unable to find equivalent functions that would let me manipulate MATLAB arrays in the same way.

Example data and conversion result

From one of our datasets I extracted 200 samples. Written as a Python bytes literal, passed through the above unpacking function, and plotted using Matplotlib, it looks like this:

from matplotlib.pyplot import plot,show

data = \
b'\x19P\x05g\x90\x05I\x10\x01\xcf_\xfa\x87\x7f\xf5a\xbf\xf7\xb7\xff\xff9\xf0\x04]P' \
b'\x04)\x90\xfe\xad\xdf\xf6M\x1f\xf4s\x7f\xfb\xf5\x7f\x02=\xd0\x04W\xf0\x01\xfb\x7f' \
b'\xfb\x81\xff\xf6\x85\x7f\xf7\x8f\xbf\xfb\x05p\x03O\x90\x045\x90\x02\xf7\x7f\xfb' \
b'\x7f\xff\xf6q_\xf7\xb7\xff\xff!p\x03G\x90\x02\xdd\x7f\xfb\xc3\xdf\xf9s\xff\xf6' \
b'\x91?\xfb\xeb?\x011\xf0\x035\xb0\xff\xe1\xff\xfa\x81\xbf\xf6\x89\x9f\xf9\xc7\x9f' \
b'\xfe\r\x90\x02=\xf0\x02\x19\x90\xfe\xc3?\xf9\xa3\x1f\xfa\x8d\x7f\xf7\x99\xbf\xfc' \
b'\t\x10\x03=0\x02\x13\xb0\xfe\xb3?\xf8\x8b\xff\xf7\x83\xbf\xf9\xcd_\xfe\x050\x01' \
b'\x130\xff\xd3?\xfc\xb9\xbf\xfa\xa5\xdf\xf9\xa5_\xfc\xe9\xdf\xfe\xdb\xff\xfd\xf7' \
b'\x9f\xff\xe7\x9f\xfb\xcb?\xfc\xbd\xff\xf9\xab\xff\xfc\xfd\xdf\xff\xf2O\xfe\xe2\xcf' \
b'\xfc\xdc\xcf\xfc\xd4\xaf\xfd\xe8\xcf\xfd\xdc\xcf\xfd\xfc\x8f\x00\xfa\xaf\xfe\xec' \
b'\x0f\xfd\xc6/\xfc\xde\x0f\xff\xf2\xcf\xfd\xe2\x8f\xfe\xe8/\xff\xf4o\xfc\xce\xcf\xff' \
b'\x08@\xff\xf8\xef\xfd\xe8o\x00\x10`\xff\xe6/\xfd\xeco\xff\x06`\x00\xfe\x8f\xfe\xfa' \
b'\xef\xff\xe2\xaf\xfc\xdao\xfe\x00\x00\x00\xf8\x8f\xfd\xe0\xef\xfe\x00\xc0\xfe\xeao\xfe'

plot( unpack_s12p_to_f32(data) )
show()

producing the following output:

[Plot of the 200 unpacked samples]

Solution

Your code does not match your layout schematic. Your code assumes the following:

   buf[l + 2]  |   buf[l + 1]  |   buf[l + 0]  
7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0
-----------------------|-----------------------
B A 9 8 7 6 5 4 3 2 1 0|B A 9 8 7 6 5 4 3 2 1 0
    sample[2*i + 1]    |    sample[2*i + 0]

The bytes are reversed: you are using a little-endian system, so the first byte in the buffer contains the lowest-valued bits of your 24-bit value.
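To make the byte order concrete, here is a small Python sketch. The helper names `pack_pair` and `unpack_pair` are hypothetical and not part of the question's code; they only illustrate the little-endian layout described above, where the first byte holds the lowest 8 bits of the first sample.

```python
def pack_pair(s0, s1):
    """Pack two 12-bit values into 3 bytes, lowest-valued bits first."""
    word = (s0 & 0xFFF) | ((s1 & 0xFFF) << 12)  # 24-bit word: s1 above s0
    return word.to_bytes(3, "little")


def unpack_pair(three_bytes):
    """Recover the two raw (unsigned) 12-bit values from 3 packed bytes."""
    word = int.from_bytes(three_bytes, "little")
    return word & 0xFFF, (word >> 12) & 0xFFF


print(pack_pair(0x123, 0xABC).hex())  # 23c1ab
print(unpack_pair(b"\x19P\x05"))      # (25, 85), the first three bytes of the data
```

The middle byte `0xc1` is the shared one: its low nibble holds the top 4 bits of the first sample and its high nibble holds the bottom 4 bits of the second.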

I converted your data to decimal values, and imported it into MATLAB thus:

data = [
    25    80     5   103   144     5    73    16     1   207    95   250   135   127   245    97   191   247   183   255   255 ...
    57   240     4    93    80     4    41   144   254   173   223   246    77    31   244   115   127   251   245   127     2 ...
    61   208     4    87   240     1   251   127   251   129   255   246   133   127   247   143   191   251     5   112     3 ...
    79   144     4    53   144     2   247   127   251   127   255   246   113    95   247   183   255   255    33   112     3 ...
    71   144     2   221   127   251   195   223   249   115   255   246   145    63   251   235    63     1    49   240     3 ...
    53   176   255   225   255   250   129   191   246   137   159   249   199   159   254    13   144     2    61   240     2 ...
    25   144   254   195    63   249   163    31   250   141   127   247   153   191   252     9    16     3    61    48     2 ...
    19   176   254   179    63   248   139   255   247   131   191   249   205    95   254     5    48     1    19    48   255 ...
   211    63   252   185   191   250   165   223   249   165    95   252   233   223   254   219   255   253   247   159   255 ...
   231   159   251   203    63   252   189   255   249   171   255   252   253   223   255   242    79   254   226   207   252 ...
   220   207   252   212   175   253   232   207   253   220   207   253   252   143     0   250   175   254   236    15   253 ...
   198    47   252   222    15   255   242   207   253   226   143   254   232    47   255   244   111   252   206   207   255 ...
     8    64   255   248   239   253   232   111     0    16    96   255   230    47   253   236   111   255     6    96     0 ...
   254   143   254   250   239   255   226   175   252   218   111   254     0     0     0   248   143   253   224   239   254 ...
     0   192   254   234   111   254];

This code reproduces your plot:

data = int32(reshape(data, 3, []));
values = zeros(2, size(data, 2), 'int32');
values(2, :) = bitshift(data(3, :), 4) + bitshift(data(2, :), -4);
% The `bitand` is not strictly necessary: the later left shift by 20 drops the top 4 bits anyway.
values(1, :) = bitshift(bitand(data(2, :), 15), 8) + data(1, :);
values = single(bitshift(values, 20)) * 2^-31;
values = reshape(values, 1, []);
plot(values)

We're converting the individual values to int32 because your bit shift operations work on that bit length. We then combine the bytes by shifting according to the schematic, shift the result left by 20 bits (so that values whose top bit is 1 become negative), and multiply by 2^-31 for correct scaling.
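The shift-and-scale step can be checked in isolation. The following NumPy sketch uses a handful of hand-picked 12-bit values (not taken from the dataset) and shows that shifting left by 20 and multiplying by 2^-31 gives the same result as reading each value as a signed 12-bit integer and dividing by 2048. The shift is done on `uint32` and then reinterpreted as `int32`; that is an assumption made here to keep the overflow well defined, not something the original function does.

```python
import numpy as np

# Raw 12-bit patterns: zero, smallest positive, largest positive,
# most negative, and -1 in two's complement.
raw = np.array([0x000, 0x001, 0x7FF, 0x800, 0xFFF], dtype=np.int32)

# Move bit 11 into the int32 sign bit, then scale by 2^-31.
shifted = (raw.astype(np.uint32) << 20).view(np.int32)
scaled = shifted.astype(np.float32) * np.float32(2.0**-31)

# Reference: explicit two's-complement sign extension, then divide by 2^11.
signed = np.where(raw >= 0x800, raw - 0x1000, raw)
reference = (signed / 2048.0).astype(np.float32)

print(scaled)                             # 0, 2^-11, 2047/2048, -1, -2^-11
print(np.array_equal(scaled, reference))  # True
```

Both routes map the 12-bit range onto [-1, 1); the shift-based version simply lets the integer type do the sign extension.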

To make the indexing easy, we're reshaping the array of 300 values to 3x100 and writing the result into a 2x100 array. We could also have used simple strided indexing instead:

data = int32(data);
values = zeros(1, numel(data) / 3 * 2, 'int32');
values(2:2:end) = bitshift(data(3:3:end), 4) + bitshift(data(2:3:end), -4);
values(1:2:end) = bitshift(data(2:3:end), 8) + data(1:3:end);
values = single(bitshift(values, 20)) * 2^-31;
plot(values)
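If you want to cross-check the MATLAB result against NumPy without the stride tricks, the same strided-indexing idea can be sketched in Python. The function name `unpack_s12p_bytes` is hypothetical, and this is an alternative formulation rather than the question's original function; note that MATLAB's `data(1:3:end)` corresponds to `b[0::3]` here because of 0-based indexing.

```python
import numpy as np


def unpack_s12p_bytes(buf):
    """Unpack little-endian packed 12-bit samples into float32 in [-1, 1)."""
    b = np.frombuffer(buf, dtype=np.uint8).astype(np.int32)
    if b.size % 3:
        raise ValueError("buffer length must be a multiple of 3 bytes")
    values = np.empty(b.size // 3 * 2, dtype=np.int32)
    # Even samples: low nibble of the middle byte on top of the first byte.
    values[0::2] = ((b[1::3] & 0x0F) << 8) | b[0::3]
    # Odd samples: third byte on top of the high nibble of the middle byte.
    values[1::2] = (b[2::3] << 4) | (b[1::3] >> 4)
    # Two's-complement sign extension from 12 bits.
    values[values >= 0x800] -= 0x1000
    return values.astype(np.float32) / np.float32(2048.0)


print(unpack_s12p_bytes(b"\x19P\x05") * 2048)  # first two samples as integers: 25 and 85
```

Working on individual bytes avoids reading any 4-byte word that straddles the end of the buffer, at the cost of a few more temporary arrays.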