I have a use case where I need to shrink silences longer than 2 seconds down to 1 second using ffmpeg wasm.
I can remove silences longer than 2 seconds, but I couldn't shrink them.
const silenceDuration = 2;
await ffmpeg.run(
"-i",
inputFileName,
"-af",
`silenceremove=stop_periods=-1:stop_duration=${silenceDuration}:stop_threshold=-30dB`,
outputFileName,
)
Any idea how to achieve this in ffmpeg wasm?
Thank you
It seems to me that using
silenceremove=detection=peak:window=1:stop_periods=-1:stop_duration=1:stop_silence=0
does exactly what you want.
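As a minimal sketch, the suggested filter can be dropped into the same ffmpeg.run call as in the question. The helper names buildSilenceFilter and buildShrinkArgs are hypothetical, and the stop_threshold=-30dB option is an assumption carried over from the question rather than part of the filter string above.

```javascript
// Hypothetical helper: builds the silenceremove filter suggested above.
// stopThreshold is an assumption taken from the question (-30dB).
function buildSilenceFilter(stopThreshold = "-30dB") {
  return [
    "silenceremove=detection=peak",
    "window=1",
    "stop_periods=-1",
    "stop_duration=1",
    "stop_silence=0",
    `stop_threshold=${stopThreshold}`,
  ].join(":");
}

// Hypothetical helper: argument list for ffmpeg.run, mirroring the question.
function buildShrinkArgs(inputFileName, outputFileName) {
  return ["-i", inputFileName, "-af", buildSilenceFilter(), outputFileName];
}

// Usage, assuming `ffmpeg` is the already-loaded instance from the question:
//   await ffmpeg.run(...buildShrinkArgs(inputFileName, outputFileName));
```

Keeping the filter string in one helper makes it easy to try a different threshold without touching the rest of the call.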
detection=peak and window=1 mean that an instant t is considered to be silence only if the whole second [t, t+1] is 0 (to within the specified threshold, of course).
So a silence shorter than 1 second is not even a silence from the standpoint of this filter.
stop_periods means that silences of more than 1 second are acted upon. But remember that, because of window=1:detection=peak, the first second of a silence isn't even detected as silence. So it is only after 2 seconds of actual silence that the filter considers there was 1 second of silence and applies.
So it acts only on silences of more than 2 seconds.
And then, because of stop_duration, it leaves 1 second of this silence in the file.
The window usage here is clearly not the intended one. But what you want is quite unusual, so it is not surprising that it is not what the ffmpeg developers had in mind. You want to keep a silence of 0.8 seconds; that is quite a natural thing to do. You want a silence of 2.3 seconds to be reduced to 1 second; that is also quite natural. But you also want a silence of 1.5 seconds to be left untouched. This means that if I call f the function that maps the initial duration of a silence to its new duration, then f is not monotonic, which is very unusual.
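To make the mapping concrete, here is a small sketch of f in JavaScript. The function name newSilenceDuration and the choice to leave a silence of exactly 2 seconds untouched are assumptions for illustration; it models the desired behaviour, not ffmpeg's internals.

```javascript
// Hypothetical model of f: old silence duration (seconds) -> new duration.
// Silences longer than `limit` are shrunk to `target`; others are unchanged.
function newSilenceDuration(oldDuration, limit = 2, target = 1) {
  return oldDuration > limit ? target : oldDuration;
}

// The three example durations from the text.
for (const d of [0.8, 1.5, 2.3]) {
  console.log(`${d}s -> ${newSilenceDuration(d)}s`);
}
// f(1.5) is larger than f(2.3) even though 1.5 < 2.3: f goes up, then down.
```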
Graphically, that means the new duration of a silence plotted against the old one follows the diagonal up to 2 seconds, then drops to 1 second and stays flat.
It is not surprising that this is not easy to do with a tool made to specify "natural" actions, like truncating any silence longer than 1 second to 1 second (including those lasting from 1 to 2 seconds), or truncating any silence of more than 2 seconds to 2 seconds. But "truncating any silence of more than 2 seconds to 1 second", that is, truncating any silence to 1 second at most except if it lasts between 1 and 2 seconds, is quite a strange expected behaviour.