powershellcsvpowershell-5.1

Prefix lines of a CSV with a string but after every 99 lines, use another string


This is based on another thread. I didn't want to muddy that one with another question as that one was already solved.

Basically, I got burned by the system into which these CSVs get pulled. I think I can improve upon the sorted process. Rather than prepending a number to each item, I think I can make it easier by prepending a unique string for every 99 items. Based on the example from the other thread, let me show you what I mean...

import.csv

Name,Number
John Smith,(555)123-4567
Jane Smith,(555)123-4567
Todd Smith,(555)123-4567
...
Jeremy Smith,(555)123-4567
...
Gen Smith,(555)123-4567

In the sample data above, Jeremy is on line 100 and Gen is on line 198 or something...

desired_output.csv

Name,Number
10000-JohnSmith,5551234567
10000-JaneSmith,5551234567
10000-ToddSmith,5551234567
...
10001-JeremySmith,5551234567
...
10002-GenSmith,5551234567

The idea would be that, after every 99 lines, the prefix increased by 1 so the users could easily be grouped. I'm sure the old script, mentioned in the other thread by @santiago-squarzon, cannot be fully recycled. But the import and export I think would remain the same.


Solution

  • Seems like something like this could work, assuming all output is going to a single file:

    $sample = @'
    Name,Number
    John Smith,(555)123-4567
    Jane Smith,(555)123-4567
    Todd Smith,(555)123-4567
    Jeremy Smith,(555)123-4567
    Gen Smith,(555)123-4567
    '@ | ConvertFrom-Csv
    
    $groupSize = 3 # This would be 99 using the original CSV
    $index = 0     # Don't touch this one
    $chunk = 10000 # This one either
    
    $sample | ForEach-Object {
        if ($index++ -eq $groupSize) {
            $chunk++
            $index = 0
        }
    
        [pscustomobject]@{
            Name   = $chunk.ToString() + '-' + $_.Name
            Number = $_.Number -replace '\D'
        }
    }
    

    The example above would output:

    Name               Number    
    ----               ------    
    10000-John Smith   5551234567
    10000-Jane Smith   5551234567
    10000-Todd Smith   5551234567
    10001-Jeremy Smith 5551234567
    10001-Gen Smith    5551234567
    

    If this is what you're after, for the actual CSV you would be using Import-Csv to read it and only need to change the $groupSize to 99:

    $csvPath = "$Env:OneDriveCommercial\Desktop\master2.csv"
    $CsvOutput = "$Env:OneDriveCommercial\Desktop\master3.csv"
    
    $groupSize = 99
    $index = 0     # Don't touch this one
    $chunk = 10000 # This one either
    
    Import-Csv $csvPath | ForEach-Object {
        if ($index++ -eq $groupSize) {
            $chunk++
            $index = 0
        }
    
        [pscustomobject]@{
            Name   = $chunk.ToString() + '-' + $_.Name
            Number = $_.Number -replace '\D'
        }
    } | Export-Csv $CsvOutput -NoTypeInformation
    

    If you also want to remove the white spaces from each .Name then you can use:

    Name = $chunk.ToString() + '-' + ($_.Name -replace '\s')
    

    Or if you want to remove anything that is not characters in the English Alphabet you can use:

    Name = $chunk.ToString() + '-' + ($_.Name -replace '[^a-z]')