I am parsing a csv file with following contents using csv-parse -
userID,sysID
20,50
30,71
However, on the objects returned it isn't possible to access the property created from the first column userID
.
Here is my code --
async function main(){
let systemIDs = await getSystemIds('./systems.csv');
console.log(`Scanning data for ${systemIDs.length} systems..`);
console.log(systemIDs[0]);
console.log(systemIDs[0].userID); // This prints undefined
console.log(systemIDs[0].sysID); // This prints the correct value
}
async function getSystemIds(path){
let ids= [];
await new Promise ((resolve,reject)=>{
const csvParser = csvParse({columns:true, skip_empty_lines: true});
FS.createReadStream(path)
.pipe(csvParser)
.on('readable', ()=>{
let record ;
while(record = csvParser.read()) {
ids.push(record);
}
})
.on('finish',()=>{
resolve();
});
});
return ids;
}
Output -
Scanning data for 2 systems..
{ 'userID': '20', sysID: '50' }
undefined // <== The Problem
50
I notice the first column key userID
has single quotes around it in the console output whereas sysID
doesn't. But don't know what is causing them.
Figured it out myself in the end...
I needed the bom
option. The documentation states it should be set to true for UTF-8 files. But it is set to false by default.
CSV files generated by MS Excel have a hidden BOM character in them. This gets picked up as part of the header (and key name) by the parser. Setting the bom option to true makes csv-parse compatible with utf-8 files with or without bom character in them.
Set the bom option when initializing csv-parse -
const csvParser = csvParse({
columns: true,
skip_empty_lines: true,
bom: true
});
It isn't very clear to me why this bom character is added to the csv files by Excel. Quote from wikipedia BOM -
The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8, but warns that it may be encountered at the start of a file trans-coded from another encoding