windows powershell search full-text-search file-search

Windows: search for all files that contain "string1" and do not contain "dislike"


The Windows PowerShell command below lists all files (excluding a few filenames) that contain string1:

Get-ChildItem -Path "D:\Jenkins" -File -Recurse | Where-Object { $_.Name -notlike "*log*" -and $_.DirectoryName -notlike "*\Backup\*" } | Select-String -Pattern "string1"

I want to enhance this command so that it lists all files that contain string1 but do not contain the string dislike.

Can you please suggest how to do this?


Solution

  • Select-String does not support combining positive and negative matching - you can use either positive matching (by default) or negative matching (with -NotMatch).
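
To make the either/or behavior concrete, here is a minimal sketch that runs Select-String against a few in-memory strings. The sample lines and variable names are made up for illustration; the point is that each call applies one test per line, so a single call cannot express "contains A but not B" for a whole file.

```powershell
# Hypothetical sample lines, for illustration only.
$lines = 'has string1', 'has string1 and dislike', 'neither'

# Positive matching (the default): outputs the lines that match the pattern.
$positive = $lines | Select-String -Pattern 'string1'

# Negative matching: outputs the lines that do NOT match the pattern.
$negative = $lines | Select-String -Pattern 'dislike' -NotMatch

$positive.Line   # has string1 / has string1 and dislike
$negative.Line   # has string1 / neither
```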

    The simplest solution, which assumes that each file fits into memory in full (usually a safe bet for text files), is to read each file into memory in full with Get-Content's -Raw switch and to use -match, the regular-expression matching operator, together with its negating variant, -notmatch:

    Get-ChildItem -LiteralPath D:\Jenkins -File -Recurse | 
      Where-Object { 
        $_.Name -notlike '*log*' -and 
        $_.DirectoryName -notlike '*\Backup\*' -and
        (
          ($content = $_ | Get-Content -Raw) -match 'string1' -and 
          $content -notmatch 'dislike'
        )
      }
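
To try the content test in isolation, the following sketch builds a throwaway folder with three sample files and applies the same -match / -notmatch pair, leaving out the name and directory filters. The temporary folder and the file names are assumptions made for the demonstration, not part of the original setup.

```powershell
# Create a throwaway folder with three hypothetical sample files.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
Set-Content -LiteralPath (Join-Path $dir 'a.txt') -Value 'string1 only'
Set-Content -LiteralPath (Join-Path $dir 'b.txt') -Value 'string1', 'and dislike on another line'
Set-Content -LiteralPath (Join-Path $dir 'c.txt') -Value 'neither term'

# Same content test as above, without the name and directory filters.
$found = Get-ChildItem -LiteralPath $dir -File -Recurse |
  Where-Object {
    ($content = $_ | Get-Content -Raw) -match 'string1' -and
    $content -notmatch 'dislike'
  }

$found.Name   # a.txt

# Clean up the sample folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

Because -Raw returns the whole file as one string, the "dislike" on the second line of b.txt is enough to exclude that file.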
    

    If you do run out of memory,[1] combine two Select-String calls as follows:

    Get-ChildItem -LiteralPath D:\Jenkins -File -Recurse | 
      Where-Object { 
        $_.Name -notlike '*log*' -and 
        $_.DirectoryName -notlike '*\Backup\*' -and
        (
          ($_ | Select-String -Quiet 'string1') -and 
          -not ($_ | Select-String -Quiet 'dislike')
        )
      }
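
The -and / -not combination relies on what -Quiet returns: $true on a match and no output otherwise, which PowerShell treats as $false in a Boolean context. A small sketch with in-memory strings standing in for a file's lines (the sample data is hypothetical):

```powershell
# Hypothetical sample lines standing in for the lines of a file.
$lines = 'first line', 'contains string1', 'last line'

$hit  = $lines | Select-String -Quiet 'string1'   # $true
$miss = $lines | Select-String -Quiet 'dislike'   # no output, so $miss is $null

[bool] $hit     # True
[bool] $miss    # False
-not $miss      # True
```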
    

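    If you don't mind using an advanced regex, a single Select-String call is sufficient. Note that Select-String matches file input line by line, so the file must again be read in full with Get-Content -Raw for the lookarounds to span the whole file:

    Get-ChildItem -LiteralPath D:\Jenkins -File -Recurse |
      Where-Object {
        $_.Name -notlike '*log*' -and
        $_.DirectoryName -notlike '*\Backup\*' -and
        (
          $_ | Get-Content -Raw | Select-String -Quiet '(?s)(?<!dislike.*?)string1(?!.*?dislike)'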
        )
      }
    

    For an explanation of the regex and the option to experiment with it, see this regex101.com page.
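
As a quick local check of the regex, the sketch below applies it with -match to three made-up multi-line strings that stand in for complete file contents, such as Get-Content -Raw would return. The last statement feeds the same text line by line instead, to show that the lookarounds then only see one line at a time.

```powershell
$regex = '(?s)(?<!dislike.*?)string1(?!.*?dislike)'

# Hypothetical multi-line samples standing in for whole-file contents.
$onlyWanted   = "line one`nstring1 here`nline three"
$dislikeAfter = "string1 here`nline two`ndislike here"
$dislikeFirst = "dislike here`nline two`nstring1 here"

$onlyWanted   -match $regex   # True
$dislikeAfter -match $regex   # False: 'dislike' follows 'string1'
$dislikeFirst -match $regex   # False: 'dislike' precedes 'string1'

# Line-by-line input: 'string1 here' matches on its own,
# because 'dislike' sits on a different line.
$perLine = $dislikeAfter -split "`n" | Select-String -Quiet $regex
[bool] $perLine               # True
```

The inline option (?s) lets . match newlines, which is what allows both lookarounds to scan across the entire string.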


    [1] Text files are rarely too large to fit into memory, so one way to avoid out-of-memory problems is to restrict the search to text files via their filename extensions, if feasible; e.g., you could add -Include *.txt, *.csv, ... to the Get-ChildItem call.
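
A minimal sketch of the extension filter from the footnote, run against a throwaway folder. The folder, the file names, and the choice of *.txt and *.csv are assumptions for illustration; the sketch uses -Path rather than -LiteralPath for the sample folder.

```powershell
# Create a throwaway folder with hypothetical text and non-text files.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path $dir
Set-Content -LiteralPath (Join-Path $dir 'notes.txt') -Value 'string1'
Set-Content -LiteralPath (Join-Path $dir 'data.csv')  -Value 'string1'
Set-Content -LiteralPath (Join-Path $dir 'blob.bin')  -Value 'string1'

# Only files whose names match one of the -Include patterns are returned.
$textFiles = Get-ChildItem -Path $dir -File -Recurse -Include *.txt, *.csv

$textFiles.Name | Sort-Object   # data.csv, notes.txt

# Clean up the sample folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

The filtered file list can then be piped into any of the Where-Object content tests shown above.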