batch-filecsv

Use batch scripting to replace null values with 0 in a specific column in a CSV


I need to prepare an automatically-generated CSV file for import into a database. One of the columns is supposed to contain integers, but the generating program (which I have no control of) doesn't always put anything in that column. The database import won't accept null values in an integer column. There are other string columns that should stay null.

Here's a simplified CSV with the problem:

"UID","Name","Comment","LicenseNo","DateEntered"
"1","Adam","Likes puppies","4451","2014-05-01"
"2","Barbara","","",2014-05-02"
"3","","Reserved","","2014-05-03"
"4","Donna","","4559","2014-05-04"

I'd like to replace the "" occurrences with "0" in column 4 only.

I can get as far as being able to isolate the column in each row and set a non-nil substitute variable:

set inputCSV=%1
set outputCSV=%2

for /f "delims==" %%a IN (%inputCSV%) DO (
    set line=%%a
    for /f "tokens=4 delims=," %%b IN ("!line!") DO (
        if %%b=="" (
            set data="0"
        ) else (
            set data=%%b
        )
    )
)

However, I can't figure out how to output the corrected line to a new file. I was heading towards inserting something like:

if !data!=="0" (
for /f "tokens=1-3 delims=," %%d IN ("!line!") DO set prev=%%d,%%e,%%f
for /f "tokens=5 delims=," %%g IN ("!line!") DO set next=%%g
echo !prev!,!data!,!next! >> %outputCSV%
) else (
echo !line! >> %outputCSV%
)

But the real CSV has dozens of columns, so I'm going to run out of FOR variables, and it just seems like there's a better way I can't see ...

Any insight would be appreciated.


Solution

  • use tokens=1-4,*

    * means "the fifth token is the rest of the line"

    Complete code:

    @echo off
    setlocal enabledelayedexpansion
    set inputCSV=%1
    set outputCSV=%2
    
    (for /f "tokens=1-4,* delims=," %%a IN (%inputCSV%) DO (
        if "%%d"=="""" (set "value="000"") else (set "value=%%d")
        echo %%a,%%b,%%c,!value!,%%e
    ))>%output.csv
    

    EDIT for the additional info in the comment

    @echo off
    setlocal enabledelayedexpansion
    set inputCSV=%1
    set outputCSV=%2
    
    (for /f "tokens=*" %%a IN (%inputCSV%) DO (
      set column=0
      set "line="
      for %%i in ( %%a ) do ( 
        set /a column+=1
        set value=%%~i
        if !column!==4 (   
          if "!value!"=="" set "value=0"
        ) 
        set "line=!line!,"!value!"" 
      )
      echo !line:~1!
    ))>%outputCSV%
    

    change the 4 to the correct column number.

    Attention: there is a limit for the number of characters per line (don't remember how much, could affect %%a)

    Also some special characters will make trouble.