I have a very large (>100k lines) file that I want to split on ":".
I then want to discard the first item and keep all the rest. For example, foo:bar:baz
becomes bar:baz.
If I do cut -d ':' -f2- myfile.txt > newfile.txt
it finishes in a matter of milliseconds.
I have tried a few methods in PowerShell, but I have yet to see one finish; after a couple of minutes I abort, because the script cannot afford to wait that long. Surely there is a better/faster way to do this, but I can't seem to find it.
The most promising method I found so far looks like this:
$reader = [System.IO.File]::OpenText("myfile.txt")
try {
    while ($true) {
        $line = $reader.ReadLine()
        if ($null -eq $line) { break }
        $split = $line.Split(":")
        $join = $split[1..($split.Length - 1)] -join ":"
        Add-Content -Path "newfile.txt" -Value $join
    }
}
finally {
    $reader.Close()
}
Please help/advise.
In both examples in this answer you can use a regex instead of splitting; it is more efficient that way. For the regex details, see: https://regex101.com/r/iGfHWp/1.
(Get-Content myfile.txt -Raw) -replace '(?m)^.+?:' |
Set-Content newfile.txt
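As a quick illustration of the '^.+?:' pattern on a single sample line (the sample value here is hypothetical), the lazy quantifier stops at the first colon rather than the last:

```powershell
# '.+?' is lazy, so '^.+?:' matches only up to and including the FIRST ':'
'foo:bar:baz' -replace '^.+?:'   # -> bar:baz
```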
File.ReadLines + StreamWriter
try {
    # Always use an absolute path here, i.e. 'newfile.txt' should be
    # 'X:\path\to\newfile.txt': .NET APIs resolve relative paths against the
    # process working directory, which may differ from the PowerShell location.
    $writer = [System.IO.StreamWriter] 'newfile.txt'
    $re = [regex]::new(
        '^.+?:', [System.Text.RegularExpressions.RegexOptions]::Compiled)
    foreach ($line in [System.IO.File]::ReadLines('myfile.txt')) {
        $writer.WriteLine($re.Replace($line, ''))
    }
}
finally {
    if ($writer) {
        $writer.Dispose()
    }
}
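A minimal way to sanity-check the streaming approach above, using a hypothetical two-line sample file (Join-Path $PWD is used to sidestep the absolute-path caveat noted in the comments):

```powershell
# Hypothetical smoke test for the ReadLines + StreamWriter approach:
# write a small sample file, run the same transform, read the result back.
Set-Content (Join-Path $PWD 'sample.txt') 'foo:bar:baz', 'a:b'
$writer = [System.IO.StreamWriter] (Join-Path $PWD 'sample.out')
try {
    $re = [regex] '^.+?:'
    foreach ($line in [System.IO.File]::ReadLines((Join-Path $PWD 'sample.txt'))) {
        $writer.WriteLine($re.Replace($line, ''))
    }
}
finally {
    $writer.Dispose()
}
Get-Content (Join-Path $PWD 'sample.out')   # -> bar:baz, b
```

Like cut -f2-, a line without any colon passes through unchanged, since the regex simply fails to match.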