Given a hash table (or a dictionary) like:
$HashTable = @{ 'One' = 1; 'Two' = 2; 'Three' = 3 }
(which is case insensitive by default, but doesn't has to be in every situation for my case. In fact, I have no clue of the comparer that is used in the given hashtable or dictionary)
It is easy to get the related value for a specific item:
$HashTable['two'] # Or: $HashTable.get_item('two')
2
But is it also possible to get the actual key (of a specific item) back, something like:
$HashTable.get_Key('two')
Which would return Two
(which a capital T
) rather than an error or two
in lowercase?
Apparently I haven't been clear in my question, so let me rephrase it:
For any given hashtable (or dictionary) with an unknown comparer, I am looking for a way to retrieve the original key
of a specific item similar as I would retrieve the value
of t specific item. Meaning:
Meanwhile, I am starting to believe that the answer to my question is:
That is not possible
Simply because the key of an hashtable isn't reordered but only the .
That doesn't make sense either because the whole key is still available in the hashcode
and collision recovery data but not the whole key (/string, which might virtually unlimited in size)..keys
collection but apparently the (reverse) link with the actual item is gone...
To be honest what I was expecting from this question, was either:
But it turns out that I got a list of interesting alternatives which I might use instead. For my actual use case, the performance is important but more important is the fact whether the provide suggestion reacts according my expectations. Therefore I have setup Pester test bench to confirm this:
Describe 'Retrive the original key from a hashtable or dictionary' { # https://stackoverflow.com/a/78656228
BeforeAll {
$CaseInsensitive = [Ordered]@{
One = 1
Two = 2
Three = 3
4 = 'Four'
}
$CaseSensitive = [System.Collections.Specialized.OrderedDictionary]::new()
$CaseSensitive['One'] = 1
$CaseSensitive['two'] = 2
$CaseSensitive['Two'] = 2
$CaseSensitive['three'] = 3
$CaseSensitive['Three'] = 3
$CaseSensitive['THREE'] = 3
$CaseSensitive[[Object]4] = 'Four'
# mklement0 https://stackoverflow.com/a/78656228/1701026
function GetKey($Dictionary, $lookupKey) {
if ($Dictionary.Contains($lookupKey)) {
$candidateKeys = $Dictionary.Keys.Where({ $_ -eq $lookupKey }, 0, 2)
if ($candidateKeys.Count -ge 2) { $lookupKey }
else { $candidateKeys[0] }
}
}
# Fabrice SANGA https://stackoverflow.com/a/78656281/1701026
# $CaseInsensitive | Add-Member -Force -MemberType ScriptMethod -Name get_Key -Value {
# param([string] $CaseInsensitiveIndex)
# return $this.Keys.Where{ $_ -ceq ([CultureInfo]::InvariantCulture).TextInfo.ToTitleCase($CaseInsensitiveIndex) }
# }
# $CaseSensitive | Add-Member -Force -MemberType ScriptMethod -Name get_Key -Value {
# param([string] $CaseInsensitiveIndex)
# return $this.Keys.Where{ $_ -ceq ([CultureInfo]::InvariantCulture).TextInfo.ToTitleCase($CaseInsensitiveIndex) }
# }
# function GetKey($Dictionary, $lookupKey) {
# $Dictionary.get_Key($lookupKey)
# }
# Abraham Zinala https://stackoverflow.com/a/78657526/1701026
# function GetKey($HashTable, $to_match) {
# [System.Linq.Enumerable]::Where(
# ($entries = [System.Collections.DictionaryEntry[]]@($HashTable.GetEnumerator())),
# [Func[System.Collections.DictionaryEntry, bool]]{
# param($entry)
# $entry.Key -match $to_match -and $entries.Where{$_.Key -match $to_match}.Count -eq 1 -or
# $entry.Key -cmatch $to_match -and $entries.Where{$_.Key -match $to_match}.Count -ge 1
# }
# ).Key
# }
# iRon
# function GetKey($Dictionary, $LookupKey) {
# if ($Dictionary -is [System.Collections.IDictionary]) {
# if (-not $Dictionary.Contains($LookupKey)) { return }
# } else {
# if (-not $Dictionary.ContainsKey($LookupKey)) { return }
# }
# foreach ($Key in $Dictionary.get_Keys()) {
# if ($Key -ceq $LookupKey) { return $Key }
# if ($Key -ieq $LookupKey) { $iKey = $Key }
# }
# if ($Null -ne $iKey) { $iKey } else { $LookupKey }
# }
}
Context 'Case insensitive, existing key' {
It 'One' {
$CaseInsensitive['One'] | Should -Be 1
GetKey $CaseInsensitive 'One' | Should -BeExactly 'One'
}
It 'ONE' {
$CaseInsensitive['ONE'] | Should -Be 1
GetKey $CaseInsensitive 'ONE' | Should -BeExactly 'One'
}
It 'two' {
$CaseInsensitive['two'] | Should -Be 2
GetKey $CaseInsensitive 'two' | Should -BeExactly 'Two'
}
It 'Two' {
$CaseInsensitive['Two'] | Should -Be 2
GetKey $CaseInsensitive 'Two' | Should -BeExactly 'Two'
}
It 'TWO' {
$CaseInsensitive['TWO'] | Should -Be 2
GetKey $CaseInsensitive 'TWO' | Should -BeExactly 'Two'
}
It 'three' {
$CaseInsensitive['three'] | Should -Be 3
GetKey $CaseInsensitive 'three' | Should -BeExactly 'Three'
}
It 'Three' {
$CaseInsensitive['Three'] | Should -Be 3
GetKey $CaseInsensitive 'Three' | Should -BeExactly 'Three'
}
It 'THREE' {
$CaseInsensitive['THREE'] | Should -Be 3
GetKey $CaseInsensitive 'THREE' | Should -BeExactly 'Three'
}
It '4' {
$CaseInsensitive[[Object]4] | Should -Be 'Four'
GetKey $CaseInsensitive 4 | Should -Be 4
}
}
Context 'Case insensitive, missing key' {
It '"4"' {
$CaseInsensitive['4'] | Should -BeNullOrEmpty
GetKey $CaseInsensitive '4' | Should -BeNullOrEmpty
}
}
Context 'Case sensitive, existing key' {
It 'One' {
$CaseSensitive['One'] | Should -Be 1
GetKey $CaseSensitive 'One' | Should -BeExactly 'One'
}
It 'two' {
$CaseSensitive['Two'] | Should -Be 2
GetKey $CaseSensitive 'Two' | Should -BeExactly 'Two'
}
It 'Two' {
$CaseSensitive['Two'] | Should -Be 2
GetKey $CaseSensitive 'Two' | Should -BeExactly 'Two'
}
It 'three' {
$CaseSensitive['three'] | Should -Be 3
GetKey $CaseSensitive 'three' | Should -BeExactly 'three'
}
It 'Three' {
$CaseSensitive['Three'] | Should -Be 3
GetKey $CaseSensitive 'Three' | Should -BeExactly 'Three'
}
It 'THREE' {
$CaseSensitive['THREE'] | Should -Be 3
GetKey $CaseSensitive 'THREE' | Should -BeExactly 'THREE'
}
It '4' {
$CaseInsensitive[[Object]4] | Should -Be 'Four'
GetKey $CaseInsensitive 4 | Should -Be 4
}
}
Context 'Case sensitive, missing key' {
It 'one' {
$CaseSensitive['one'] | Should -BeNullOrEmpty
GetKey $CaseSensitive 'one' | Should -BeNullOrEmpty
}
It 'ONE' {
$CaseSensitive['ONE'] | Should -BeNullOrEmpty
GetKey $CaseSensitive 'ONE' | Should -BeNullOrEmpty
}
It 'TWO' {
$CaseSensitive['TWO'] | Should -BeNullOrEmpty
GetKey $CaseSensitive 'TWO' | Should -BeNullOrEmpty
}
It 'Four' {
$CaseSensitive['Four'] | Should -BeNullOrEmpty
GetKey $CaseSensitive 'Four' | Should -BeNullOrEmpty
}
It '"4"' {
$CaseInsensitive['4'] | Should -BeNullOrEmpty
GetKey $CaseInsensitive '4' | Should -BeNullOrEmpty
}
}
}
Preface:
-eq
operator to perform case-insensitive equality comparison, which implicitly occur in the context of the invariant culture, where as given hashtable / dictionary may use a different culture context, such as the current one.I'm not aware of an efficient way to obtain the case-exact form of the original key, but the following is a concise one (at the expense of performance, 'First'
can be omitted), using the intrinsic .Where()
method:
@{
'One' = 1
'Two' = 2
'Three' = 3
}.Keys.Where({ $_ -eq 'two' }, 'First')[0] # -> 'Two'
Note:
The above only works reliably with case-insensitive dictionaries, which PowerShell's hashtables and [ordered]
hashtable literals are (note that the use of 'First'
isn't strictly necessary, but is a performance optimization: it stops looking after the first - and by definition only in a case-insensitive dictionary - match; [0]
extracts the one and only element of the collection of matches that .Where()
invariably returns).
With case-sensitive dictionaries, the above may report false positives - see below for a solution that handles such dictionaries too.
.Keys.Where({ $_ -eq 'two' }, 'First')[0]
with.Keys.Where({ $_ -eq 'two' })
The reason this is inefficient is the need to perform a linear (O(N)) search through the keys collection, and the fact that a script block must be invoked for each collection element doesn't help.
Dealing with a dictionary whose case-sensitivity is unknown requires more work, including potentially having to examine all keys:
# Note: [System.Collections.Specialized.OrderedDictionary] is the type
# underlying [ordered] hashtable literals in PowerShell.
# It - like [System.Collections.Hashtable] - is case-SENSITIVE by default;
# it is PowerShell that makes them case-INsensitive behind the scenes.
$dictionary = [System.Collections.Specialized.OrderedDictionary]::new()
$dictionary['One'] = 1
$dictionary['TWO'] = 2
$dictionary['two'] = 2 # case variation
$dictionary['Three'] = 2
$lookupKey = 'two'
# -> Stores 'two' in $actualKey, the case-exact match.
$actualKey =
if ($dictionary.Contains($lookupKey)) {
$candidateKeys = $dictionary.Keys.Where({ $_ -eq $lookupKey }, 0, 2)
if ($candidateKeys.Count -ge 2) { $lookupKey }
else { $candidateKeys }
}
.Contains($lookupKey)
, weeds out lookup keys that do not match an existing key, whether case-insensitively or not.
.Where({ $_ -eq $lookupKey }, 0, 2)
looks for at most 2
candidate (potential) key matches, case-insensitively; limiting the search to 2 candidates is a performance optimization: from the presence of two (or more) keys that are case variations of each other it can be inferred that the dictionary at hand is case-sensitive.
If there are (at least) two candidate keys ($candidateKeys.Count -ge 2
), the implication is that the hashtable is case-sensitive and that the lookup key is therefore by definition already in the case-exact form (otherwise .Contains()
wouldn't have returned $true
), so $lookupKey
is returned.
Otherwise, if only a single candidate key is returned, that key is by definition the correct one and is returned (note that there's no need to use [0]
in this case in order to output the one and only element itself, because the use of an if
statement causes any collection (enumerable) to be auto-enumerated, just like in the pipeline).
Alternative, which optimizes for the case where the lookup key already is in the case-exact form:
$actualKey =
if ($dictionary.Contains($lookupKey)) {
if (-1 -ne [Array]::IndexOf($dictionary.Keys, $lookupKey)) { $lookupKey }
else { $dictionary.Keys.Where({ $_ -eq $lookupKey }, 'First') }
}
Once the existence of a matching key has been confirmed via .Contains()
, a case-sensitive (linear) lookup is performed first, using [Array]::IndexOf()
(which performs current-culture, case-sensitive comparisons), which performs much better than a .Where()
filter.
If a case-exact match is found, the lookup key is by definition already in the case-exact form.
Otherwise, the implication is that the dictionary is case-insensitive and that that is therefore sufficient to look for the first - and by definition only - key in the .Keys
collection that case-insensitively matches the lookup key, using .Where()
with -eq
.
An hypothetical efficient solution would require efficient (O(1)) lookups in the .Keys
collection that return the matching keys themselves, but as you note, dictionaries aren't designed for that, as they translate keys into abstract hashes (numbers) for lookup of values (and these hashes have no reverse link to the keys they were derived from).
If it were possible - it isn't[1] - to obtain the System.Collections.IEqualityComparer
/ System.Collections.Generic.IEqualityComparer`1
used by a given dictinonary instance and the specific instance used is of a type exposed via the static properties of the System.StringComparer
class, such as [System.StringComparer]::OrdinalIgnoreCase
, you could at least infer case-sensitivity and optimize based on that: if the dictionary is case-sensitive and a value lookup (.Contains()
) succeeds, you're by definition dealing with the case-exact form of the key; otherwise, it is sufficient to (linearly) look for the first case-insensitively matching key in .Keys
.
If the given dictionary happens to be of type System.Collections.Generic.SortedDictionary`2
, you could perform a binary search of the .Keys
collection, taking advantage of the fact that the collection of keys is by definition sorted, along the lines of:
[Array]::BinarySearch($sortedDict.Keys, $lookupKey, [StringComparer]::InvariantCultureIgnoreCase)
With System.Collections.Generic.SortedDictionary`2
, specifically, you could achieve a relatively efficient solution even without knowing the comparer, via two to three non-linear lookups (though in practice you may not notice a difference from the linear approach except in large dictionaries):
Test if the lookup key matches at all with .Contains()
and stop if not.
If it does, first use the binary search case-sensitively: if it succeeds, the lookup key is by definition already in the case-exact form, so return it.
Otherwise (the dictionary must be case-insensitive), repeat the search case-insensitively, which will by definition find the one and only matching key that is a case variation of the lookup key, and return it.
[1] Neither the specific System.Collections.Hashtable
([hashtable]
) type nor the dictionary interfaces (System.Collections.IDictionary
and System.Collections.Generic.IDictionary`2
) offer (public) members for accessing the comparer (and trying to use reflection to access non-public members is not only generally ill-advised, but would have to be done for each concrete dictionary-like type separately).