I currently have to compress several thousand files (~40-80 MB each) with brotli and get them ready for an S3 bucket. From what I've researched so far, brotli can't multithread the compression, so brotli.exe uses only ~10% of the CPU. How can I iterate through the files in a folder and spawn multiple brotli.exe processes to work at the same time (8-10 processes should fill the CPU)? Windows batch, PowerShell or VBS: I can try any suggestions.
At the moment, I'm running this batch:
for /R %%f in (*.) do (
"brotli" -Z "--output=E:\output\brotli\%%~nf" "%%f"
)
@ECHO OFF
SETLOCAL
:: set limit to #jobs
SET /a limit=8
:: establish a subdirectory in %temp%
SET "control=%temp%\brotlicontrol"
MD "%control%" 2>NUL
:: Dummy for testing
for %%f IN (fred anna george bill betty carl celia daphne john kelly ian zoe brian
tracey susan colin jane selina valerie david stephen) DO (
rem for /R %%f in (*.) do (
CALL :wait
START /min "brotli %%~nf" q75403766_2 "%%f"
)
GOTO :EOF
:wait
SET /a running=0
FOR /f %%y IN ('DIR /a-d /b "%control%\*.flg" 2^>nul ^|FIND /c ".flg" ') DO SET /a running=%%y
IF %running% geq %limit% timeout /t 1 >nul&GOTO wait
GOTO :eof
Here's a main batch which starts a subsidiary batch
@echo off
setlocal
ECHO.>"%control%\%~n1.flg"
REM "brotli" -Z "--output=E:\output\brotli\%~n1" %1
:: Dummy - variable timeout 5-20 seconds
SET /a exectime=(%RANDOM% %% 16) + 5
timeout /t %exectime% >nul
del "%control%\%~n1.flg"
EXIT
I had %%f iterate through a list of names for testing. All you need to do is remove that test code and use your original code, which I remmed out, to process your list of files.
The process calls the :wait routine, which counts the .flg files in the temporary directory and sets running to that value.
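The counting step can be tried on its own. Here is a minimal sketch, assuming a scratch folder named flgdemo under %temp% (a hypothetical name used only for this demonstration) that does not already contain any files:

```batch
@ECHO OFF
SETLOCAL
:: Hypothetical scratch directory, used only for this demonstration
SET "demo=%temp%\flgdemo"
MD "%demo%" 2>NUL
:: Three flag files plus one unrelated file that must not be counted
ECHO.>"%demo%\a.flg"
ECHO.>"%demo%\b.flg"
ECHO.>"%demo%\c.flg"
ECHO.>"%demo%\note.txt"
:: Same counting line as in :wait, pointed at the demo directory
SET /a running=0
FOR /f %%y IN ('DIR /a-d /b "%demo%\*.flg" 2^>nul ^|FIND /c ".flg" ') DO SET /a running=%%y
:: Expected to show running=3 when the demo folder started out empty
ECHO running=%running%
:: Clean up the scratch directory
RD /s /q "%demo%"
```

DIR /b lists one name per line and FIND /c counts the lines containing .flg, so the count equals the number of jobs that have created their flag file and not yet deleted it.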
If the number running is greater than or equal to (geq) the limit established in the initialisation, the routine waits 1 second and tries again; otherwise the :wait routine terminates and the subsidiary batch q75403766_2 is started minimised (/min) with the title "brotli nameoffile". It's important that the first quoted parameter to start exists, as it's used as the title of the started process. You could use "" if you want (for no title), but you should not omit this title string.
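To illustrate the point about the title string, here is a small sketch. The commands being started (cmd running timeout, and Notepad) are stand-ins chosen for the demonstration, not part of the brotli job:

```batch
@ECHO OFF
:: Title supplied: a minimised window titled "demo job" runs the command
START "demo job" /min cmd /c timeout /t 5
:: Empty title: the quoted path that follows is treated as the program to run
START "" /min "%SystemRoot%\System32\cmd.exe" /c timeout /t 5
:: Title omitted: the quoted path would be taken as the window title, and
:: START would open a new command prompt instead of running Notepad
REM START "%SystemRoot%\System32\notepad.exe"
```

The last line is remmed out because it shows the failure mode: with no separate title, the first quoted string is consumed as the title and nothing is left to execute.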
The sub-process started (q75403766_2) first creates a .flg file with the name of the file being processed in the control directory, then runs the brotli job (remmed out again; I added a few lines to create a variable timeout to simulate the brotli process time), then deletes the control file and exits.
The carets before the redirectors in the for loop tell cmd that the redirection is to be applied to the command being executed, not to the for itself. 2>nul (+caret) says "redirect error messages (file not found) to nowhere (i.e. discard them)".
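The effect of the carets can be seen in isolation with a sketch like the following, which assumes that no file named no_such_file_here.xyz exists in the Windows directory (the name is made up for the demonstration):

```batch
@ECHO OFF
SETLOCAL
:: DIR fails here, so "File Not Found" goes to stderr and is discarded by 2^>nul.
:: ^| passes DIR's (empty) standard output to FIND, which counts the lines.
FOR /f %%c IN ('DIR /b "%SystemRoot%\no_such_file_here.xyz" 2^>nul ^| FIND /c /v ""') DO ECHO entries=%%c
```

Without the carets, cmd would try to interpret 2>nul and | while parsing the for line itself rather than handing them to the command inside the quotes.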