The following findstr.exe command almost does what I want, but not quite:
findstr /s /i /c:"word1 word2 word3" *.abc
I have used:
/s to search all subfolders.
/c: to use the specified text as a literal search string.
/i to make the search case-insensitive.
*.abc to restrict the search to files of type abc.
The above looks for word1 word2 word3 as a literal, and therefore only finds the words in that exact order.
By contrast, I want all words to match individually, in any order (AND logic, conjunction).
If I remove /c: from the command above, then lines matching any of the words are returned (OR logic, disjunction), which is not what I want.
Can this be done in PowerShell?
You can use Select-String to do a regex-based search through multiple files.
To match all of multiple search terms in a single string with regular expressions, you'll have to use a lookaround assertion:
Get-ChildItem -Filter *.abc -Recurse |Select-String -Pattern '^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$'
In the above example, this is what's happening with the first command:
Get-ChildItem -Filter *.abc -Recurse
Get-ChildItem searches for files in the current directory
-Filter *.abc restricts the results to files ending in .abc
-Recurse searches all subfolders
We then pipe the resulting FileInfo objects to Select-String and use the following regex pattern:
^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$
^              # start of string
(?=            # open positive lookahead assertion containing
    .*         # any number of any characters (like * in wildcard matching)
    \b         # word boundary
    word1      # the literal string "word1"
    \b         # word boundary
)              # close positive lookahead assertion
...            # repeat for remaining words
.*             # any number of any characters
$              # end of string
Since each lookahead group is just being asserted for correctness and the search position within the string never changes, the order doesn't matter.
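To see that the word order really makes no difference, the pattern can be tried against a few in-memory strings with the -match operator before running it over real files. This is only a quick sketch; the sample lines are made up for illustration.

```powershell
# Same AND pattern as in the Select-String call above
$pattern = '^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$'

# Hypothetical sample lines, not taken from any real file
$samples = @(
    # all three words, original order
    'word1 word2 word3'
    # all three words, different order
    'word3 then word1 and word2'
    # word3 is missing
    'word1 and word2 only'
    # "word33" is not the whole word "word3"
    'word1 word2 word33'
)

# With an array on the left-hand side, -match returns the matching elements
$matched = @($samples -match $pattern)
$matched
```

Only the first two sample lines come back: both contain all three whole words, regardless of order.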
If you want it to match strings that contain any of the words, you can use a simple non-capturing group:
Get-ChildItem -Filter *.abc -Recurse |Select-String -Pattern '\b(?:word1|word2|word3)\b'
\b(?:word1|word2|word3)\b
\b             # word boundary
(?:            # open non-capturing group
    word1      # the literal string "word1"
    |          # or
    word2      # the literal string "word2"
    |          # or
    word3      # the literal string "word3"
)              # close non-capturing group
\b             # word boundary
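The OR pattern can be checked the same way against in-memory strings. Again a sketch with made-up sample lines; it also shows that the word boundaries keep partial hits such as "sword1" from matching.

```powershell
# Same OR pattern as in the Select-String call above
$pattern = '\b(?:word1|word2|word3)\b'

# Hypothetical sample lines, not taken from any real file
$samples = @(
    # contains one of the words
    'only word2 here'
    # contains none of the words
    'nothing relevant'
    # "sword1" merely contains "word1"; there is no word boundary before it
    'sword1 is a partial hit'
)

# @() keeps the result an array even when a single line matches
$matched = @($samples -match $pattern)
$matched
```

Only 'only word2 here' is returned.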
These can of course be abstracted away in a simple proxy function.
I generated the param block and most of the body of the Select-Match function definition below with:
$slsmeta = [System.Management.Automation.CommandMetadata]::new((Get-Command Select-String))
[System.Management.Automation.ProxyCommand]::Create($slsmeta)
Then removed unnecessary parameters (including -AllMatches and -Pattern), then added the pattern generator (see inline comments):
function Select-Match
{
    [CmdletBinding(DefaultParameterSetName='Any', HelpUri='http://go.microsoft.com/fwlink/?LinkID=113388')]
    param(
        [Parameter(Mandatory=$true, Position=0)]
        [string[]]
        ${Substring},

        [Parameter(Mandatory=$true, ValueFromPipelineByPropertyName=$true)]
        [Alias('PSPath')]
        [string[]]
        ${LiteralPath},

        [Parameter(ParameterSetName='Any')]
        [switch]
        ${Any},

        # -All needs its own parameter set, otherwise the 'Any' branch below always wins
        [Parameter(ParameterSetName='All')]
        [switch]
        ${All},

        [switch]
        ${CaseSensitive},

        [switch]
        ${NotMatch},

        [ValidateNotNullOrEmpty()]
        [ValidateSet('unicode','utf7','utf8','utf32','ascii','bigendianunicode','default','oem')]
        [string]
        ${Encoding},

        [ValidateNotNullOrEmpty()]
        [ValidateCount(1, 2)]
        [ValidateRange(0, 2147483647)]
        [int[]]
        ${Context}
    )

    begin
    {
        try {
            $outBuffer = $null
            if ($PSBoundParameters.TryGetValue('OutBuffer', [ref]$outBuffer))
            {
                $PSBoundParameters['OutBuffer'] = 1
            }

            # Escape literal input strings
            $EscapedStrings = foreach($term in $PSBoundParameters['Substring']){
                [regex]::Escape($term)
            }

            # Construct pattern based on whether -Any or -All was specified
            if($PSCmdlet.ParameterSetName -eq 'Any'){
                $Pattern = '\b(?:{0})\b' -f ($EscapedStrings -join '|')
            } else {
                $Clauses = foreach($EscapedString in $EscapedStrings){
                    # Use the loop variable; $_ is not set by a foreach statement
                    '(?=.*\b{0}\b)' -f $EscapedString
                }
                $Pattern = '^{0}.*$' -f ($Clauses -join '')
            }

            # Remove the arguments that Select-String does not know about
            $PSBoundParameters.Remove('Substring') |Out-Null
            $PSBoundParameters.Remove('Any') |Out-Null
            $PSBoundParameters.Remove('All') |Out-Null

            # Add the Pattern parameter argument
            $PSBoundParameters['Pattern'] = $Pattern

            $wrappedCmd = $ExecutionContext.InvokeCommand.GetCommand('Microsoft.PowerShell.Utility\Select-String', [System.Management.Automation.CommandTypes]::Cmdlet)
            $scriptCmd = {& $wrappedCmd @PSBoundParameters }
            $steppablePipeline = $scriptCmd.GetSteppablePipeline($myInvocation.CommandOrigin)
            $steppablePipeline.Begin($PSCmdlet)
        } catch {
            throw
        }
    }

    process
    {
        try {
            $steppablePipeline.Process($_)
        } catch {
            throw
        }
    }

    end
    {
        try {
            $steppablePipeline.End()
        } catch {
            throw
        }
    }
    <#
    .ForwardHelpTargetName Microsoft.PowerShell.Utility\Select-String
    .ForwardHelpCategory Cmdlet
    #>
}
Now you can use it like this, and it'll behave almost like Select-String:
Get-ChildItem -Filter *.abc -Recurse |Select-Match word1,word2,word3 -All
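If you want to inspect which regex a given set of words turns into without searching any files, the pattern-building step can be pulled out into a small standalone function. The sketch below is only an illustration of that step: New-MatchPattern is a hypothetical helper name, not part of Select-Match or of PowerShell itself.

```powershell
# Hypothetical helper, for illustration only: builds the same kind of
# pattern that Select-Match passes on to Select-String
function New-MatchPattern {
    param(
        [Parameter(Mandatory=$true, Position=0)]
        [string[]]$Substring,

        # When set, build the AND pattern; otherwise build the OR pattern
        [switch]$All
    )

    # Escape regex metacharacters so every term is matched literally
    $escaped = foreach ($term in $Substring) { [regex]::Escape($term) }

    if ($All) {
        # One lookahead per term: every term must occur, in any order
        $clauses = foreach ($e in $escaped) { '(?=.*\b{0}\b)' -f $e }
        '^{0}.*$' -f ($clauses -join '')
    } else {
        # Alternation: any one of the terms is enough
        '\b(?:{0})\b' -f ($escaped -join '|')
    }
}

$allPattern = New-MatchPattern word1,word2,word3 -All
$anyPattern = New-MatchPattern word1,word2,word3
$allPattern
$anyPattern
```

For these three words the two results are the same AND and OR patterns shown earlier. One caveat of wrapping every term in \b: a term that starts or ends with a non-word character (for example a trailing "+") only matches when a word character sits directly next to it, so the helper is best suited to plain words.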