coldfusioncfmlrailolucee

How do I replace unwanted commas in a comma delimited file in ColdFusion


I am loading a CSV, and trying to insert the contents in MySQL. One of the fields has commas in it, so for example:

 "Jane Doe","Boston","None","Yes","No"
 "John Doe","New York","No garlic, onions, or nuts","Yes","No"
 "Mary Doe","New York","None","Yes","No"

I start by reading in the file:

<cffile action="read"file="/var/www/html/temp.csv" variable="datacsv"> 

Then I start a loop:

<cfloop index="index" list="#datacsv#" delimiters="#chr(13)#,#chr(10)#">
    <cfset item1 = Replace(listgetAt(index,1), """", "", "ALL")> #item1#<br>
    <cfset item2 = Replace(listgetAt(index,2), """", "", "ALL")> #item2#<br>
    <cfset item3 = Replace(listgetAt(index,3), """", "", "ALL")> #item3#<br>
    <cfset item4 = Replace(listgetAt(index,4), """", "", "ALL")> #item4#<br>
    <cfset item5 = Replace(listgetAt(index,5), """", "", "ALL")> #item5#<br>
</cfloop>

My problem here is that in the second item (John Doe), those commas in the 3rd field are getting parsed out as new fields. So I either need to figure out what I am missing that is causing that, OR to strip the commas in any field and replace them with a different character.


Solution

  • Try using this regex to replace the embedded commas with dashes:

    <cfscript>
    
        // CSV content
        csvContent = '"John Doe","New York","No garlic, onions, or nuts","Yes","No"';
    
        // Replace all comma in between with dash
        writeOutput(
            csvContent.replaceAll(
                ',(?!")|(?<!"),(?=[^"]*?"(?:,|\r?\n|\r|$))', '-'
            )
        );
    </cfscript>
    

    Here is the GIST.

    Edit:

    This works, but it is also stripping out any CR/LF in there. As I loop through my items, I am doing: <cfloop index="index" list="#csvContent#" delimiters="#chr(13)#,#chr(10)#">

    You can simply use CR/LF( chr(13)chr(10) ) as the delimiter. Here is an example:

    <!--- CSV content --->
    <cfset csvContent = '"John Doe","New York","No garlic, onions, or nuts","Yes","No"'>
    
    <!--- Replace all comma in between with dash --->
    <cfset newCsvContent = csvContent.replaceAll(
        ',(?!")|(?<!"),(?=[^"]*?"(?:,|\r?\n|\r|$))', '-'
    )>
    
    <!--- Process records --->
    <cfoutput>
        <cfloop list="#newCsvContent#" index="csvRow" delimiters="#chr(13)##chr(10)#">
            Column 1: #replace( csvRow.listGetAt( 1 ), """", "", "ALL")#<br>
            Column 2: #replace( csvRow.listGetAt( 2 ), """", "", "ALL")#<br>
            Column 3: #replace( csvRow.listGetAt( 3 ), """", "", "ALL")#<br>
            Column 4: #replace( csvRow.listGetAt( 4 ), """", "", "ALL")#<br>
            Column 5: #replace( csvRow.listGetAt( 5 ), """", "", "ALL")#<br>
            <br>
        </cfloop>
    </cfoutput>