batch-file cmd findstr

How to use CMD regex to limit output in recursive directory search


I need to create a dynamic report for a set of folders in a code repository that are inconsistently stored; some have the key feature two levels down, some three, and a few are four+ levels down.

Here is a short sample. I need to keep the entries prefixed with +++ and exclude the ones prefixed with ---:

+++ A1B2C3/ABC_Core/ABC_HIJ_R71_00_00/
--- A1B2C3/ABC_Core/ABC_HIJ_R71_00_00/QR-HIJ-Outbound-123-Svc/SharedResources/WXYZ/Client/TU4_987_864X22Dat/
+++ A1B2C3/ABC_Core/ABC_HIJ_R72_00_00/
--- A1B2C3/ABC_Core/ABC_HIJ_R72_00_00/QR-HIJ-Outbound-123-Svc/SharedResources/WXYX/Client/TU4_987_864X22Dat/
+++ A1B2C3/ABC_Core/ABC_HIJ_R73_00_00_WidgetMod/
--- A1B2C3/ABC_Core/ABC_HIJ_R73_00_00_WidgetMod/QR-HIJ-Outbound-123-Svc/
+++ D4E5F6/QRWidgetFlow_R_1_0_0_DMND0903212-ErrorReports/

I've tried several variants of this:

findstr /e /r /c:"[0-9][0-9]_[0-9][0-9][^0-9/]*/" /c:"[0-9]_[0-9][^0-9/]*/"

but each time I change it around, I either gain extra subfolders or lose key folders I had before.

Any help would be greatly appreciated.


Solution

  • @ECHO Off
    SETLOCAL ENABLEDELAYEDEXPANSION 
    
    rem The following setting for the directory is a name
    rem that I use for testing and deliberately includes spaces to make sure
    rem that the process works using such names. These will need to be changed to suit your situation.
    
    SET "sourcedir=u:\your files"
    SET "tempfile=%tmp%\afilename"
    SET "numerics=0-9"
    SET "lastkey=?"
    
    (FOR /d /r "%sourcedir%" %%e IN (*_*) DO ECHO %%e)>"%tempfile%"
    FOR /f "delims=" %%e IN ('sort "%tempfile%"') DO (
     FOR %%y IN ("%%e\.") do (
      rem Test 1: does the leaf name start with 9_9 or 99_99 ? (ERRORLEVEL 1 means it does not)
      ECHO %%~nxy|FINDSTR /b /r /c:"[%numerics%]_[%numerics%]" /c:"[%numerics%][%numerics%]_[%numerics%][%numerics%]">NUL
      IF ERRORLEVEL 1 (
       rem does not start 9_9 or 99_99 - does it contain 9_9 or 99_99 ?
       ECHO %%~nxy|FINDSTR /r /c:"[%numerics%]_[%numerics%]" /c:"[%numerics%][%numerics%]_[%numerics%][%numerics%]">NUL
       IF NOT ERRORLEVEL 1 (
        rem contains 9_9 or 99_99
        CALL :report "%%e"
       )
      )
     )
    )
    
    DEL "%tempfile%"
    
    GOTO :EOF
    
    :report
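    rem Compare with a trailing backslash so that only true subdirectories of the
    rem last-reported key are suppressed, not siblings whose names merely start with it.
    SET "reportme=%~1\"
    SET "reportme=!reportme:%lastkey%\=!"
    IF "%reportme%" neq "%~1\" GOTO :eof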
    ECHO %~1
    SET "lastkey=%~1"
    GOTO :eof
    

    Always verify against a test directory before applying to real data.

    First, obtain a recursive list of the subdirectories whose names contain an underscore (the *_* mask) and store it in a temporary file.

    Next, read each directory name from a sorted copy of the temporary file and derive its leaf name (the last path component). If the leaf name does not start with one of the target patterns but does contain one, it is a candidate for reporting.
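
    The two-stage leaf test can be tried in isolation before running it against a real tree. The following is a minimal sketch, assuming a few hypothetical leaf names (71_00_00_Archive in particular is invented for illustration) and a hypothetical :classify label that is not part of the script above:

    ```batch
    @ECHO OFF
    SETLOCAL
    rem Hypothetical sample leaf names; substitute your own.
    FOR %%n IN (ABC_HIJ_R71_00_00 QR-HIJ-Outbound-123-Svc 71_00_00_Archive) DO CALL :classify "%%n"
    GOTO :EOF

    :classify
    rem Stage 1: reject names that START with digit_digit or 2digits_2digits.
    ECHO %~1|FINDSTR /b /r /c:"[0-9]_[0-9]" /c:"[0-9][0-9]_[0-9][0-9]">NUL
    IF NOT ERRORLEVEL 1 (
     ECHO skip      %~1
     GOTO :EOF
    )
    rem Stage 2: keep names that CONTAIN one of the patterns anywhere.
    ECHO %~1|FINDSTR /r /c:"[0-9]_[0-9]" /c:"[0-9][0-9]_[0-9][0-9]">NUL
    IF ERRORLEVEL 1 (ECHO skip      %~1) ELSE (ECHO candidate %~1)
    GOTO :EOF
    ```

    FINDSTR sets ERRORLEVEL to 0 when at least one line matches and to 1 when none does, which is why the first test continues on ERRORLEVEL 1 and the second reports on NOT ERRORLEVEL 1.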

    The :report routine checks whether the quoted directory name passed as %1 contains the last-reported name. If it does, the directory lies beneath one already reported and is ignored; otherwise it is reported and becomes the new last-reported name.

    Since the names are sorted, all subdirectories of a "key" directory will follow that directory in the list.
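
    The containment check relies on delayed-expansion string substitution: remove the last-reported name from a copy of the candidate and see whether the copy changed. A minimal sketch of that idiom on its own, assuming two hypothetical paths and the illustrative variable names candidate and stripped:

    ```batch
    @ECHO OFF
    SETLOCAL ENABLEDELAYEDEXPANSION
    rem Hypothetical paths, used only to illustrate the containment test.
    SET "lastkey=u:\repo\ABC_HIJ_R71_00_00"
    SET "candidate=u:\repo\ABC_HIJ_R71_00_00\QR-HIJ-Outbound-123-Svc"
    rem Remove every occurrence of lastkey from a copy of candidate.
    SET "stripped=!candidate:%lastkey%=!"
    rem If the copy differs from the original, candidate contained lastkey.
    IF "!stripped!" neq "!candidate!" (ECHO contained) ELSE (ECHO not contained)
    ```

    The substitution is case-insensitive, and a name containing = or ! would defeat it, so it is worth checking your directory names for those characters first.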

    I believe the patterns to match may actually need to be _9_9 or _99_99, because 9_9 on its own would also match the leaf name TU4_987_864X22Dat in "A1B2C3/ABC_Core/ABC_HIJ_R7xxxxx/QR-HIJ-Outbound-123-Svc/SharedResources/WXYZ/Client/TU4_987_864X22Dat/"
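
    To see the effect of that change, the underscore-prefixed patterns can be run against leaf names taken from the question. This is a sketch under the assumption that the patterns are _[0-9]_[0-9] and _[0-9][0-9]_[0-9][0-9]; whether they suit every folder in your repository is something only a test run will show:

    ```batch
    @ECHO OFF
    SETLOCAL
    SET "numerics=0-9"
    rem Leaf names copied from the sample listing in the question.
    FOR %%n IN (TU4_987_864X22Dat ABC_HIJ_R71_00_00 QRWidgetFlow_R_1_0_0_DMND0903212-ErrorReports) DO (
     ECHO %%n|FINDSTR /r /c:"_[%numerics%]_[%numerics%]" /c:"_[%numerics%][%numerics%]_[%numerics%][%numerics%]">NUL
     IF ERRORLEVEL 1 (ECHO no match  %%n) ELSE (ECHO match     %%n)
    )
    ```

    If the patterns behave as intended, the same two /c: strings would replace the ones in the second FINDSTR call of the script; the /b test would need a matching review, since a leading underscore changes what "starts with" means.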