I am writing code for an embedded processor (ARM Cortex-M4).
The purpose of this code is to decode 4-bit ADPCM in Intel/DVI format (also called IMA format). I have encoded an ADPCM sample of a square wave using Python's audioop module. I have then decoded this sample successfully using the same audioop module, and it is a good match for the input.
However, I am unable to decode the input data correctly on my embedded processor. The valpred value, which represents the output, seems to run away and oscillate between a large positive and a large negative value. This seems to be driven by the behaviour of the sign value. The problem I have is that this code is effectively a carbon copy of the C implementation of audioop, with the Python-specific parts removed. The algorithm is, as far as I can tell, identical, yet it still goes into an oscillatory state for virtually every input data value. This is clearly driven by sign flipping the direction in which vpdiff is applied, but I can't see how this could be avoided, given that the quantization step is so high (typically at the maximum step index, 88) and the data does seem to have alternating signs.
This is the implementation I am working with now. The adpcm_step_size array contains the quantization steps (e.g. 7, 8, 9 ... 29794, 32767), whereas adpcm_step_size_adapt contains the step index increments (-1, -1, -1, -1, 2, 4, 6, 8, duplicated).
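The tables themselves are not shown in this post. For reference, here is a sketch of what they would contain, assuming the standard IMA/DVI values (which, as far as I know, are the same ones audioop.c uses). The array names are taken from the code in this post; the element types are my own assumption.

```c
#include <stdint.h>

/* Index adjustment per 4-bit code. The low 3 bits (magnitude) select the
 * adjustment; bit 3 is the sign, so entries 8..15 repeat entries 0..7. */
static const int8_t adpcm_step_size_adapt[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* Quantizer step sizes for index 0..88 (89 entries, assumed standard IMA). */
static const int16_t adpcm_step_size[89] = {
        7,     8,     9,    10,    11,    12,    13,    14,
       16,    17,    19,    21,    23,    25,    28,    31,
       34,    37,    41,    45,    50,    55,    60,    66,
       73,    80,    88,    97,   107,   118,   130,   143,
      157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,
      724,   796,   876,   963,  1060,  1166,  1282,  1411,
     1552,  1707,  1878,  2066,  2272,  2499,  2749,  3024,
     3327,  3660,  4026,  4428,  4871,  5358,  5894,  6484,
     7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};
```

The index/step pairs printed in the debug trace further down (for example index 16 with step 34, and index 88 with step 32767) agree with this table.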
void audio_adpcm_play(uint8_t *sample_data, uint16_t sample_size)
{
    int sign, delta, step, vpdiff, valpred, index, half;
    uint32_t debug_data;
    uint32_t result;
    uint8_t data = 0x00;

    // Initial state
    half = 0;
    valpred = 0;
    index = 0;
    step = adpcm_step_size[index];

    // "|| half" makes sure the low nibble of the last byte is decoded too
    while(sample_size > 0 || half) {
        // Extract the next 4-bit code: high nibble first, then low nibble
        if(half) {
            delta = data & 0x0f;
        } else {
            data = *sample_data++;
            delta = (data >> 4) & 0x0f;
            sample_size--;
        }
        half = !half;
        debug_data = delta;

        // Find new index value (used for the next code)
        index += adpcm_step_size_adapt[delta];
        if(index < 0)
            index = 0;
        if(index > 88)
            index = 88;

        // Separate sign and magnitude
        sign = delta & 8;
        delta = delta & 7;

        // Compute difference and the new predicted value
        vpdiff = step >> 3;
        if(delta & 4)
            vpdiff += step;
        if(delta & 2)
            vpdiff += step >> 1;
        if(delta & 1)
            vpdiff += step >> 2;
        if(sign)
            valpred -= vpdiff;
        else
            valpred += vpdiff;

        // Clamp values that exceed the valid range
        if(valpred > 32767)
            valpred = 32767;
        else if(valpred < -32768)
            valpred = -32768;

        // Update the step size for the next code
        step = adpcm_step_size[index];

        // Offset to unsigned; 32768 keeps valpred = -32768 from going negative
        result = (uint32_t)(valpred + 32768) >> AUDIO_CODE_SHIFT;

        uart_printf(DBG_LVL_INFO,
            "data=%02x, source_byte=%02x, samples_rem=%5d, valpred=%7d, vpdiff=%5d, sign=%02x, delta=%02x, index=%3d, step=%3d, adapt=%3d, res=%5d/%5d\r\n",
            debug_data, data, sample_size, valpred, vpdiff, sign, delta, index, step,
            adpcm_step_size_adapt[delta], result, AUDIO_CODE_DUTY_MAX);
    }
}
Here's the output for a square wave input; as can be seen, valpred rapidly oscillates between two values when it should settle at a given value.
data=07, source_byte=f7, samples_rem= 7999, valpred= 19, vpdiff= 30, sign=00, delta=07, index= 16, step= 34, adapt= 8, res= 128/ 256
data=0f, source_byte=f7, samples_rem= 7998, valpred= -44, vpdiff= 63, sign=08, delta=07, index= 24, step= 73, adapt= 8, res= 127/ 256
data=07, source_byte=f7, samples_rem= 7998, valpred= 92, vpdiff= 136, sign=00, delta=07, index= 32, step=157, adapt= 8, res= 128/ 256
data=0f, source_byte=f7, samples_rem= 7997, valpred= -201, vpdiff= 293, sign=08, delta=07, index= 40, step=337, adapt= 8, res= 127/ 256
data=07, source_byte=f7, samples_rem= 7997, valpred= 430, vpdiff= 631, sign=00, delta=07, index= 48, step=724, adapt= 8, res= 129/ 256
data=0f, source_byte=f7, samples_rem= 7996, valpred= -927, vpdiff= 1357, sign=08, delta=07, index= 56, step=1552, adapt= 8, res= 124/ 256
data=07, source_byte=f7, samples_rem= 7996, valpred= 1983, vpdiff= 2910, sign=00, delta=07, index= 64, step=3327, adapt= 8, res= 135/ 256
data=0f, source_byte=f7, samples_rem= 7995, valpred= -4253, vpdiff= 6236, sign=08, delta=07, index= 72, step=7132, adapt= 8, res= 111/ 256
data=07, source_byte=f7, samples_rem= 7995, valpred= 9119, vpdiff=13372, sign=00, delta=07, index= 80, step=15289, adapt= 8, res= 163/ 256
data=0d, source_byte=d5, samples_rem= 7994, valpred= -11903, vpdiff=21022, sign=08, delta=05, index= 84, step=22385, adapt= 4, res= 81/ 256
data=05, source_byte=d5, samples_rem= 7994, valpred= 18876, vpdiff=30779, sign=00, delta=05, index= 88, step=32767, adapt= 4, res= 201/ 256
data=0b, source_byte=b3, samples_rem= 7993, valpred= -9793, vpdiff=28669, sign=08, delta=03, index= 87, step=29794, adapt= -1, res= 89/ 256
data=03, source_byte=b3, samples_rem= 7993, valpred= 16276, vpdiff=26069, sign=00, delta=03, index= 86, step=27086, adapt= -1, res= 191/ 256
data=0c, source_byte=c4, samples_rem= 7992, valpred= -14195, vpdiff=30471, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7992, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=09, source_byte=9c, samples_rem= 7991, valpred= 10381, vpdiff=12286, sign=08, delta=01, index= 87, step=29794, adapt= -1, res= 168/ 256
data=0c, source_byte=9c, samples_rem= 7991, valpred= -23137, vpdiff=33518, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7990, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7990, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7989, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7989, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7988, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7988, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7987, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7987, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7986, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7986, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7985, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7985, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7984, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
data=0c, source_byte=4c, samples_rem= 7984, valpred= -23137, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=01, source_byte=14, samples_rem= 7983, valpred= -10851, vpdiff=12286, sign=00, delta=01, index= 87, step=29794, adapt= -1, res= 85/ 256
data=04, source_byte=14, samples_rem= 7983, valpred= 22667, vpdiff=33518, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7982, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7982, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7981, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7981, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7980, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7980, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7979, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7979, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7978, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7978, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7977, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7977, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=0c, source_byte=c4, samples_rem= 7976, valpred= -14195, vpdiff=36862, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 72/ 256
data=04, source_byte=c4, samples_rem= 7976, valpred= 22667, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 216/ 256
data=09, source_byte=9c, samples_rem= 7975, valpred= 10381, vpdiff=12286, sign=08, delta=01, index= 87, step=29794, adapt= -1, res= 168/ 256
data=0c, source_byte=9c, samples_rem= 7975, valpred= -23137, vpdiff=33518, sign=08, delta=04, index= 88, step=32767, adapt= 2, res= 37/ 256
data=04, source_byte=4c, samples_rem= 7974, valpred= 13725, vpdiff=36862, sign=00, delta=04, index= 88, step=32767, adapt= 2, res= 181/ 256
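To see why the output swings across most of the range once index reaches 88, the per-code difference can be computed in isolation. This is a small sketch using the same arithmetic as the decoder loop in this post; adpcm_vpdiff is a hypothetical helper name introduced only for this illustration and is not part of audioop.

```c
/* Signed difference contributed by one 4-bit code at a given step size.
 * Hypothetical helper: mirrors the vpdiff arithmetic of the decoder loop. */
static int adpcm_vpdiff(int step, int code)
{
    int magnitude = code & 7;   /* low 3 bits */
    int vpdiff = step >> 3;     /* always at least step/8 */

    if (magnitude & 4)
        vpdiff += step;
    if (magnitude & 2)
        vpdiff += step >> 1;
    if (magnitude & 1)
        vpdiff += step >> 2;

    /* Bit 3 is the sign: set means subtract from the prediction. */
    return (code & 8) ? -vpdiff : vpdiff;
}
```

With step = 32767, code 0x4 gives 36862 and code 0xc gives -36862, matching the vpdiff column of the trace. Each of those is larger than the entire positive 16-bit range, so a stream that keeps alternating signs at the top step size will necessarily throw valpred back and forth. That suggests the decoder arithmetic is doing what it was designed to do, and that the code stream itself is worth checking.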
If I take only every second sample, it almost works acceptably for square waves, but problems occur for other waveforms. That's still not an acceptable solution, but perhaps it's a clue as to the cause of the issue.
If anyone has any ideas, I'd appreciate it. I've been tearing my hair out on this for a good few days.
Edit: Source for the audioop module can be found here https://github.com/python/cpython/blob/master/Modules/audioop.c, the ADPCM decoder is audioop_adpcm2lin_impl.
I managed to fix this issue. It was caused by a silly error: I was reading the 16-bit input data one byte at a time when encoding. Decompressing the data in Python with the same error produced a correct result, which hid the mistake, but the encoded data was obviously no good for the C implementation of the decoder.
In hindsight, I'm not sure why I didn't notice that the audio file was twice as big as it should have been.
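To illustrate the mistake, here is a sketch only: the helper names are made up for the example, and little-endian PCM is an assumption. A 16-bit square wave viewed one byte at a time looks like a signal that jumps on every sample, which would be consistent with the alternating signs in the trace and with "every second sample" giving an almost usable result.

```c
#include <stddef.h>
#include <stdint.h>

/* Intended view: two bytes form one signed 16-bit sample.
 * Assumes little-endian PCM (low byte first). */
static int16_t pcm16_sample(const uint8_t *pcm, size_t i)
{
    uint16_t u = (uint16_t)(pcm[2 * i] | ((uint16_t)pcm[2 * i + 1] << 8));
    return (int16_t)u;
}

/* Mistaken view: every byte is treated as its own signed 8-bit sample.
 * Scaled by 256 so it can be compared against the 16-bit range. */
static int16_t pcm8_sample_scaled(const uint8_t *pcm, size_t i)
{
    return (int16_t)((int8_t)pcm[i] * 256);
}

/* Expected size of 4-bit ADPCM output: two samples per byte. */
static size_t adpcm_bytes_for_samples(size_t n_samples)
{
    return (n_samples + 1) / 2;
}
```

For a constant level of 0x2000 (bytes 0x00, 0x20 repeated), pcm16_sample returns 8192 every time, while pcm8_sample_scaled alternates between 0 and 8192. The size check follows the same logic: for example, 8000 16-bit samples should compress to 4000 bytes, but if every byte is counted as a sample the encoder sees 16000 samples and produces 8000 bytes, twice the expected size.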