I am coding a basic frequency analysis of WAVE audio files, but I have trouble converting WAVE frames to integers.
Here is the relevant part of my code:
import wave

track = wave.open('/some_path/my_audio.wav', 'r')
byt_depth = track.getsampwidth()  # Sample width of the file, in bytes
frame_rate = track.getframerate()
buf_size = 512

def byt_sum(word):
    # Convert a string of n bytes (little-endian) into an int in [0; 256**n - 1]
    return sum((256**k) * word[k] for k in range(len(word)))

raw_buf = track.readframes(buf_size)
'''
One frame is a string of n bytes, where n = byt_depth.
For instance, with a 24-bit-encoded file, track.readframes(1) could be:
b'\xff\xfe\xfe'.
raw_buf[n] returns an int in [0;255]
'''
sample_buf = [byt_sum(raw_buf[byt_depth*k:byt_depth*(k+1)])
              - 2**(8*byt_depth-1) for k in range(buf_size)]
The problem is: when I plot sample_buf for a single sine signal, I get an alternating, wrecked sine signal.
I can't figure out why the signal overlaps upside-down.
Any idea?
P.S.: Since I'm French, my English is a bit hesitant. Feel free to edit if there are ugly mistakes.
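It might be a signed/unsigned mismatch. In WAV PCM, 8-bit samples are stored unsigned, but 16-bit and wider samples are stored as signed two's-complement integers, so decoding them as unsigned and then subtracting an offset swaps the two halves of the waveform. See https://en.wikipedia.org/wiki/Pulse-code_modulation
If you do need an unsigned representation, add 32768 to each signed sample.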
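To see the effect on concrete numbers, here is a minimal sketch that decodes two hand-made 16-bit frames both ways. The helper names unsigned_minus_offset and signed_decode are made up for this illustration, and it assumes little-endian, two's-complement samples, which is the usual layout for 16-bit WAV PCM.

```python
# A 16-bit little-endian frame holding the signed sample -1, and one holding +1.
minus_one = (-1).to_bytes(2, byteorder='little', signed=True)  # b'\xff\xff'
plus_one = (1).to_bytes(2, byteorder='little', signed=True)    # b'\x01\x00'

def unsigned_minus_offset(frame):
    # Same idea as in the question: decode as unsigned, then subtract half the range.
    return int.from_bytes(frame, byteorder='little', signed=False) - 2 ** (8 * len(frame) - 1)

def signed_decode(frame):
    # Decode the bytes directly as a two's-complement integer.
    return int.from_bytes(frame, byteorder='little', signed=True)

wrong = [unsigned_minus_offset(minus_one), unsigned_minus_offset(plus_one)]
right = [signed_decode(minus_one), signed_decode(plus_one)]
print(wrong)  # [32767, -32767]
print(right)  # [-1, 1]
```

Two samples that sit right next to zero end up at opposite extremes of the range, which would explain a sine wave whose halves appear swapped and flipped in the plot.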
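Also, you should use Python's struct module to decode the buffer.
import struct
buf_size = 512
# '<' forces little-endian; 'h' is a signed 16-bit integer ('H' is the unsigned variant, try it to compare).
# Assumes a 16-bit mono file and that readframes() returned buf_size full frames.
sample_buff = struct.unpack('<' + 'h' * buf_size, raw_buf)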
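Since struct has no 24-bit format code, the 24-bit case mentioned in the question needs another route. One possible sketch uses int.from_bytes, which accepts any byte width. The function name decode_frames is hypothetical, and the code assumes little-endian data where 8-bit samples are unsigned and wider samples are signed.

```python
def decode_frames(raw, sample_width):
    """Decode little-endian PCM bytes into a list of signed ints.

    Assumption: 8-bit data is unsigned (centred on 128) and wider data is
    signed two's complement, the usual WAV PCM convention.
    """
    if sample_width == 1:
        # Iterating over bytes yields ints in [0, 255]; recentre them on zero.
        return [b - 128 for b in raw]
    count = len(raw) // sample_width  # ignore a trailing partial sample
    return [
        int.from_bytes(raw[i * sample_width:(i + 1) * sample_width],
                       byteorder='little', signed=True)
        for i in range(count)
    ]

# Three hand-made 24-bit samples: -1, 0 and the largest positive value.
demo = b'\xff\xff\xff' + b'\x00\x00\x00' + b'\xff\xff\x7f'
decoded = decode_frames(demo, 3)
print(decoded)  # [-1, 0, 8388607]
```

With the variables from the question, this would be called as decode_frames(raw_buf, byt_depth). For a stereo file the channels are interleaved, so the result would still have to be split per channel.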