I tried
$obj = [System.IO.File]::ReadLines((Convert-Path -LiteralPath names.json)) | ConvertFrom-Json
$keys = @()
foreach ($key in $obj.GetEnumerator()) {
    $keys += $key.Key
}
Write-Output $keys
But after over 24 hours it had not completed.
I need the key names so I can …
The JSON is a version of this one (though 200 megs smaller): https://kaikki.org/dictionary/All%20languages%20combined/by-pos-name/kaikki_dot_org-dictionary-all-by-pos-name.json
You can stream the file and process it one line at a time, which avoids loading
the whole file into memory with Get-Content and minimizes cmdlet invocations:
$allKeys = @{}
[System.IO.File]::ReadLines("names.json") | ForEach-Object {
    $obj = ConvertFrom-Json $_
    $obj.PSObject.Properties.Name | ForEach-Object {
        $allKeys[$_] = $true
    }
}
$uniqueKeys = $allKeys.Keys
Write-Output $uniqueKeys
This approach uses the .NET ReadLines
method to efficiently stream the file and process each line.
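Before running this over the whole multi-gigabyte file, it may be worth confirming that the file really is one JSON object per line (the kaikki.org exports are JSON Lines). A quick sketch, assuming the file is named names.json in the current directory:

# ReadLines is lazy, so taking the first line does not read the whole file.
$firstLine = [System.IO.File]::ReadLines((Convert-Path -LiteralPath "names.json")) |
    Select-Object -First 1
# If this parses without error, the per-line approach above will work.
$firstObj = ConvertFrom-Json $firstLine
$firstObj.PSObject.Properties.Name

If the first line parses on its own, each subsequent line should too, and you avoid ever holding the full file in memory.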
To pull the glosses out of each entry with the same streaming approach:

$allGlosses = @()
[System.IO.File]::ReadLines("sample.json") | ForEach-Object {
    $obj = ConvertFrom-Json $_
    $obj.senses.glosses | ForEach-Object {
        $allGlosses += $_
    }
}
Write-Output $allGlosses
This will give you a list of glosses from each JSON object.
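Appending to an array with += copies the array on every addition, which gets slow once the gloss list is large. A sketch that streams results straight to an output file instead (glosses.txt is an assumed output name):

# Write glosses to a file as they are found, so nothing accumulates in memory.
$outPath = Join-Path (Get-Location) "glosses.txt"
$writer = [System.IO.StreamWriter]::new($outPath)
try {
    foreach ($line in [System.IO.File]::ReadLines((Convert-Path -LiteralPath "sample.json"))) {
        $obj = ConvertFrom-Json $line
        foreach ($gloss in $obj.senses.glosses) {
            $writer.WriteLine($gloss)
        }
    }
}
finally {
    $writer.Close()
}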
Perhaps use a hashtable to collect the unique keys, replace the unnecessary
ForEach-Object pipelines with foreach loops, and pass a full path to the .NET method
(which resolves relative paths against the process working directory, not the
PowerShell location):
$allKeys = @{}
$filePath = Resolve-Path "names.json"
foreach ($line in [System.IO.File]::ReadLines($filePath)) {
    $obj = ConvertFrom-Json $line
    foreach ($key in $obj.PSObject.Properties.Name) {
        $allKeys[$key] = $true
    }
}
$uniqueKeys = $allKeys.Keys
Write-Output $uniqueKeys
This code avoids growing an array with +=, uses foreach loops instead of
ForEach-Object, and resolves the file path before passing it to the .NET method.
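On PowerShell 6 or later, ConvertFrom-Json also accepts -AsHashtable, which returns a hashtable directly and skips building a PSCustomObject for every line; key enumeration is then just .Keys. A sketch under the assumption that you are on PowerShell 6+:

$allKeys = @{}
$filePath = Convert-Path -LiteralPath "names.json"
foreach ($line in [System.IO.File]::ReadLines($filePath)) {
    # -AsHashtable (PowerShell 6+) avoids the PSObject wrapper per line
    $obj = ConvertFrom-Json $line -AsHashtable
    foreach ($key in $obj.Keys) {
        $allKeys[$key] = $true
    }
}
$allKeys.Keys

On Windows PowerShell 5.1 the -AsHashtable switch does not exist, so stick with the PSObject.Properties.Name version above.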