I am trying to convert the sample rate (from 44100 to 22050) of a numpy.array with 88200 samples that I have already processed (added silence and converted to mono). I tried converting this array with audioop.ratecv and it works, but it returns a str instead of a numpy array, and when I wrote that data with scipy.io.wavfile.write, half of the data was lost and the audio played twice as fast (instead of slower, which would at least make some sense).
audioop.ratecv works fine with str data such as the ones wave.open returns, but I don't know how to process those, so I tried converting my numpy array to a str with numpy.array2string(data), passing that to ratecv to get correct results, and then converting back to numpy with numpy.fromstring(data, dtype). Now the length of the data is 8 samples. I think this is due to a format mix-up, but I don't know how to control it. I also haven't figured out what str format wave.open returns, so I can't force the same format here.
Here is the relevant part of my code:
def conv_sr(data, srold, fixSR, dType, chan = 1):
    state = None
    width = 2 # numpy.int16
    print "data shape", data.shape, type(data[0]) # returns shape 88200, type int16
    fragments = numpy.array2string(data)
    print "new fragments len", len(fragments), "type", type(fragments) # return len 30 type str
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    print "fragments", len(fragments_new), type(fragments_new[0]) # returns 16, type str
    data_to_return = numpy.fromstring(fragments_new, dtype=dType)
    return data_to_return
and I call it like this:
data1 = numpy.array(data1, dtype=dType)
data_to_copy = numpy.append(data1, data2)
data_to_copy = data_to_copy.sum(axis = 1) / chan
data_to_copy = data_to_copy.flatten() # because its mono
data_to_copy = conv_sr(data_to_copy, sr, fixSR, dType) #sr = 44100, fixSR = 22050
scipy.io.wavfile.write(filename, fixSR, data_to_copy)
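A small sketch of why the string passed to ratecv above is only about 30 characters long: numpy.array2string produces a summarized, human-readable text of the array, not the raw sample bytes, whereas tobytes returns 2 bytes per int16 sample. The test signal below is made up for illustration; only its length and dtype match the data described above.

```python
import numpy

# Made-up stand-in for the 88200-sample mono int16 signal.
data = (numpy.arange(88200) % 1000).astype(numpy.int16)

# array2string returns printable text; long arrays are abbreviated with "...".
text = numpy.array2string(data)

# tobytes returns the raw buffer: 2 bytes for every int16 sample.
raw = data.tobytes()

print(repr(text))
print(len(text), len(raw))
```

Feeding the short text to audioop.ratecv means it resamples the characters of that text as if they were audio, which would explain why only a handful of samples come back.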
After a bit more research I found my mistake: a 16-bit audio sample is made of two 8-bit 'cells', so the dtype I was using was wrong, and that is why I had the audio speed issue. I found the correct dtype here. So, in conv_sr, I take a numpy array, convert it to a byte string, pass that to the sample rate conversion, convert the result back to a numpy array for scipy.io.wavfile.write, and finally combine each pair of 8-bit values into the 16-bit format:
def widthFinder(dType):
    try:
        b = str(dType)
        bits = int(b[-2:])
    except:
        b = str(dType)
        bits = int(b[-1:])
    width = bits/8
    return width
def conv_sr(data, srold, fixSR, dType, chan = 1):
    state = None
    width = widthFinder(dType)
    if width != 1 and width != 2 and width != 4:
        width = 2
    fragments = data.tobytes()
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    fragments_dtype = numpy.dtype((numpy.int16, {'x':(numpy.int8,0), 'y':(numpy.int8,1)}))
    data_to_return = numpy.fromstring(fragments_new, dtype=fragments_dtype)
    data_to_return = data_to_return.astype(dType)
    return data_to_return
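To illustrate the "two 8-bit cells" point, here is a minimal sketch of how the same byte string turns into a different number of samples depending on the dtype used to read it back. It assumes the bytes hold native-endian 16-bit samples, which is what tobytes produces for an int16 array; numpy.frombuffer is used here in place of numpy.fromstring, and the sample values are invented.

```python
import numpy

# Four invented 16-bit samples.
samples = numpy.array([0, 1000, -1000, 32767], dtype=numpy.int16)

# Raw bytes: each sample occupies two bytes, so 8 bytes in total.
raw = samples.tobytes()

# Read two bytes per sample: the original four values come back.
as_int16 = numpy.frombuffer(raw, dtype=numpy.int16)

# Read one byte per sample: eight values, none of them a real sample.
as_int8 = numpy.frombuffer(raw, dtype=numpy.int8)

print(len(raw), len(as_int16), len(as_int8))
```

As far as I can tell, the structured dtype in conv_sr describes the same two-byte layout as a plain numpy.int16, so for width 2 reading the ratecv output directly as numpy.int16 may be enough.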
If you find anything wrong, please feel free to correct me; I am still a learner.