I am new to batch scripting and I am trying to achieve the following:
loop through multiple files, count the number of commas on each line, and remove the extra commas if the count is greater than 10. I can get the count, but I am stuck after that.
All fields are required. There are no carriage returns in the data. The extra comma only ever occurs in the field after the 9th comma.
Example of the data in a CSV file:
Row 1 (good data):
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
Row 2 (bad data):
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
In row 2, "Required text" contains an extra comma that should be removed, so that the row looks like row 1.
So the logic I would like to have is:
If the number of commas in the row is 10, go to the next line.
If the number of commas is greater than 10, remove the one after the 9th comma, since extra commas only occur in that field.
Please note that I cannot put double quotes around the field.
@echo on
setlocal enabledexpansion enableddelayedexpansion
pause
set "inputFile=test.csv"
set "searchChar=,"
set count16=16
pause
for /f "delims=" %%a in ('
findstr /n "^" "%inputFile%"
') do for /f "delims=:" %%b in ("%%~a") do (
set "line=%%a"
pause
for /f %%c in ('
cmd /u /v /e /q /c"(echo(!line:*:=!)"^|find /c "%searchChar%"
') do set count=%%c echo %%c echo here echo %count% echo %count16% echo %%c line %%b has %%c characters
if %count16% equ %count% (echo ***hit)
)
pause
)
pause
Your question is very confusing. You have not clearly explained the details. More importantly, you have not posted an example of the input data and the desired output in the question; that would make up for the lack of detail. So we can only guess what you want...
I think your problem would be better explained if you paid attention to the columns that both the input and the output data have. Are you interested in the commas, or in the columns?
This is my attempt at a solution. I used the example input file posted by Compo.
@echo off
setlocal EnableDelayedExpansion
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
   ECHO/
   ECHO Input: "%%F"
   TYPE "%%F"
   rem Each file has comma-separated columns: there may be 12 columns or more
   rem Keep columns 1-9 the same. After that, generate 3 more columns:
   rem    the last and one-before-last columns are kept the same
   rem    the two-before-last column contains the rest of the columns separated by spaces
   (for /F "usebackq tokens=1-9* delims=," %%a in ("%%F") do (
      set "restAfter9=%%j"
      set "last="
      set "lastBut1="
      set "lastBut2="
      for %%A in ("!restAfter9:,=" "!") do (
         set "lastBut2=!lastBut2! !lastBut1!"
         set "lastBut1=!last!"
         set "last=%%~A"
      )
      echo %%a,%%b,%%c,%%d,%%e,%%f,%%g,%%h,%%i,!lastBut2:~3!,!lastBut1!,!last!
   )) > "%%~NF.out"
   ECHO Output: "%%~NF.out"
   TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re,qu,ir,ed,te,xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,,Re,qu,,,te,xt, pencil ,pen
Output: "test1.out"
123,235252,6376,test1,08/11/2022,2,0,1,EA,Required text, pencil ,pen
456,235252,6376,test2,08/11/2022,2,0,1,EA,Required text, pencil ,pen
789,235252,6376,test3,08/11/2022,2,0,1,EA,Re qu ir ed te xt, pencil ,pen
012,235252,6376,test4,08/11/2022,2,0,1,Required,text, pencil ,pen
789,235252,6376,test5,08/11/2022,2,0,1,Re,qu te xt, pencil ,pen
Input: "test2.csv"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books,08/22/2022,12/10/2022,$60 basic supplies,37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies,08/22/2022,12/10/2022,up to $60.00 basic supplies with comma,37246613bA2,11800118,Required Supplies
Output: "test2.out"
396,32124191,6376,CD1,08/11/2022,1,0,1,EA,Required Books 08/22/2022 12/10/2022 $60 basic supplies 37246613bA0,11800118,Required Books
396,32124191,6376,CD2,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA1,11800118,Required Supplies
396,32124191,6376,CD3,08/11/2022,2,0,1,EA,Required Supplies 08/22/2022 12/10/2022 up to $60.00 basic supplies with comma 37246613bA2,11800118,Required Supplies
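Two behaviours of `for /F` explain the output above: the `*` in `tokens=1-9*` returns the untouched remainder of the line in one extra variable, and consecutive delimiters are treated as a single one. The second point is why the empty ninth field of the test4 and test5 rows disappears and the later fields shift left. A minimal demonstration follows; the sample strings are made up for illustration and are not taken from the real files:

```batch
@echo off
rem "tokens=1-2*" puts fields 1 and 2 in %%a and %%b, and the rest of the line in %%c
for /F "tokens=1-2* delims=," %%a in ("a,b,c,d") do echo [%%a][%%b][%%c]
rem Consecutive commas count as one delimiter, so the empty second field vanishes
for /F "tokens=1-3 delims=," %%a in ("a,,c") do echo [%%a][%%b][%%c]
```

The first line prints `[a][b][c,d]` and the second prints `[a][c][]`. So if the first nine columns can be empty, this method is not a good fit for the data.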
EDIT: New simpler solution added
@echo off
setlocal EnableDelayedExpansion
rem General method to keep the first N columns the same
rem and group additional fields in column N+1
rem Define the number of "same" and "total" columns:
set /A "same=12, last=17"
rem Process all files with .csv extension in current folder
for %%F in (*.csv) do (
   ECHO/
   ECHO Input: "%%F"
   TYPE "%%F"
   rem Process all lines of current file
   (for /F "usebackq delims=" %%a in ("%%F") do (
      set "line=%%a"
      set "head="
      set "tail="
      set "i=0"
      rem Split current line in comma-separated fields
      for %%b in ("!line:,=" "!") do (
         set /A i+=1
         if !i! leq %same% ( rem Accumulate field in "head" columns
            set "head=!head!%%~b,"
         ) else if !i! leq %last% ( rem Accumulate field in "tail" columns
            set "tail=!tail!%%~b,"
         ) else ( rem Combine one field from beginning of "tail" and accumulate last field
            for /F "tokens=1* delims=," %%x in ("!tail!") do set "tail=%%x %%y%%~b,"
         )
      )
      echo !head!!tail:~0,-1!
   )) > "%%~NF.out"
   ECHO Output: "%%~NF.out"
   TYPE "%%~NF.out"
)
Output example:
Input: "test1.csv"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 13a, field 13b, field 13c, field 14, field 15, field 16, field 17
Output: "test1.out"
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b, field 14, field 15, field 16, field 17
field 1, field 2, field 3, field 4, field 5, field 6, field 7, field 8, field 9, field 10, field 11, field 12, field 13 field 13a field 13b field 13c, field 14, field 15, field 16, field 17
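If you only want the comma count that the question asks about, the same `"!line:,=" "!"` splitting trick can provide it, because the number of commas is the number of fields minus one. The following is only a sketch under some assumptions: the `expected` variable is a name I made up, and its value of 11 is what the good sample row contains as posted (the question mentions 10, so adjust it to the real data). It also assumes the data has no `*`, `?` or `!` characters, which a plain `for` with delayed expansion would mishandle.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample row 2 (bad data) copied from the question
set "line=456,235252,6376,test2,08/11/2022,2,0,1,EA,Required,text, pencil ,pen"
rem Hypothetical threshold: number of commas in a good row; adjust to your data
set "expected=11"
set "fields=0"
rem Replacing each comma with " " yields one quoted item per field
for %%b in ("!line:,=" "!") do set /A fields+=1
set /A commas=fields-1
echo Line has !commas! commas
if !commas! gtr %expected% (echo Extra comma found) else (echo Line is fine)
```

For the sample row this reports 12 commas, one more than the good row, so the line would be flagged for repair.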