In audio processing, say the underlying library (PortAudio, in my case) gives me a binary buffer that represents a few seconds of audio captured from a mic at a sample rate of sr, and the library tells me that this buffer consists of nf frames. Can I safely assume that the duration of the audio the buffer represents is nf / sr seconds?
In other words, if I use a sample rate of sr, can I safely assume that I will get sr samples per second? Will the hardware drop some samples due to some factor (for example, to limit power consumption)?
Your assumption that the duration of the audio is nf / sr seconds is correct. Be aware that this assumes the sample rate of your playback is also sr. This is not necessarily the case.
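As a rough sketch of that arithmetic (the function names here are my own, not part of PortAudio), the duration follows directly from the frame count and the capture rate. The second helper assumes interleaved PCM with a fixed sample width, in case you only know the buffer size in bytes; one frame is one sample per channel.

```python
def duration_seconds(num_frames: int, sample_rate: float) -> float:
    """Duration of a capture given its frame count and capture sample rate."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_frames / sample_rate


def frames_in_buffer(num_bytes: int, channels: int, bytes_per_sample: int) -> int:
    """Frame count of an interleaved PCM buffer (assumed layout)."""
    frame_size = channels * bytes_per_sample  # bytes occupied by one frame
    if frame_size <= 0 or num_bytes % frame_size != 0:
        raise ValueError("buffer size is not a whole number of frames")
    return num_bytes // frame_size


# Example: 176400 bytes of 16-bit stereo is 44100 frames,
# which is one second at a capture rate of 44100 Hz.
nf = frames_in_buffer(176400, channels=2, bytes_per_sample=2)
print(duration_seconds(nf, 44100))
```

Note that the channel count does not enter the duration formula once you are counting frames rather than individual samples.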
Most audio drivers support a limited set of sample rates (44.1 kHz, 48 kHz, 96 kHz, etc.). So if, say, your playback sample rate is psr and the audio is not resampled, the actual playback duration will be nf / psr seconds, that is, (nf / sr) * (sr / psr).
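To make the mismatch case concrete, here is a small sketch under one stated assumption: the captured frames are handed to the output device unchanged, with no resampling in between. In that case the device consumes psr frames per second, so the playback time is nf / psr, and the audio is sped up (and pitched up) by psr / sr. The helper names are hypothetical.

```python
def playback_duration_seconds(num_frames: int, playback_rate: float) -> float:
    """Time needed to play num_frames at playback_rate, assuming no resampling."""
    if playback_rate <= 0:
        raise ValueError("playback_rate must be positive")
    return num_frames / playback_rate


def speed_factor(capture_rate: float, playback_rate: float) -> float:
    """How much faster than real time the audio plays (1.0 means unchanged)."""
    if capture_rate <= 0 or playback_rate <= 0:
        raise ValueError("rates must be positive")
    return playback_rate / capture_rate


# Example: 3 seconds captured at 44.1 kHz, played back at 48 kHz.
nf, sr, psr = 132300, 44100, 48000
print(nf / sr)                             # captured duration in seconds
print(playback_duration_seconds(nf, psr))  # shorter playback duration
print(speed_factor(sr, psr))               # greater than 1: plays too fast
```

If the driver or your own code resamples from sr to psr first, the frame count changes along with the rate and the duration stays at nf / sr.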
Note that most audio drivers do not drop samples, since that would result in undesirable audio clicks; rather, they simply run with higher latency to free up computation for other tasks.
Note that PortAudio isn't meant for playback or recording files as specified in their FAQ.