
Count the number of lines in a CSV file with JavaScript


I'm trying to think of a way to count the number of lines in a .csv file using JavaScript. Can anyone point me to useful tips or resources?


Solution

  • It depends what you mean by a line. For a simple count of newlines, Robusto's answer is fine.
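
    For the simple case, a plain newline count might look like the sketch below. It is only an illustration and is not necessarily what Robusto's answer does; countLines is a hypothetical name, and the sketch assumes the whole file is already in a string with \n line endings.

    ```javascript
    // Count physical lines in a string (not CSV rows).
    function countLines(text) {
        if (text === '') return 0;
        // match() returns null when there are no newlines at all.
        var newlines = (text.match(/\n/g) || []).length;
        // A final line without a trailing newline still counts as a line.
        return text.charAt(text.length - 1) === '\n' ? newlines : newlines + 1;
    }
    ```

    Note that countLines('field1,"field\ntwo",field3\n') gives 2, because a newline inside a quoted field is counted like any other.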

    If you want to know how many rows of CSV data that represents, things may be a little more difficult, as a CSV field may itself contain a newline:

    field1,"field
    two",field3
    

    ...is one row, at least in CSV as defined by RFC4180. (It's one of the aggravating features of CSV that there are so many non-standard variants; the RFC itself was very late to the game.)

    So if you need to cope with that case you'll have to essentially parse each field.

    A field can be raw or quoted (quoting is required if it contains \n or ,), and a literal " inside a quoted field is written as two double quotes (""). So a regex for one field would be:

    "([^"]|"")*"|[^,\n]*
    

    and so for a whole row (assuming it is not empty):

    ("([^"]|"")*"|[^,\n]*)(,("([^"]|"")*"|[^,\n]*))*\n
    

    and to get the number of those:

    // match() returns null when nothing matches, so fall back to an empty array.
    var rowsn= (csv.match(/(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g) || []).length;
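
    The row pattern only counts rows that end in \n, and it can miscount when a quoted field is followed by \r\n. A minimal sketch that wraps the same regex and works around both, assuming the whole file is held in one string (countCsvRows is a hypothetical name):

    ```javascript
    function countCsvRows(csv) {
        // Normalise Windows line endings so a quoted field followed by \r\n is still recognised.
        var text = csv.replace(/\r\n/g, '\n');
        if (text === '') return 0;
        // Terminate the final row, otherwise the pattern would skip it.
        if (text.charAt(text.length - 1) !== '\n') text += '\n';
        var rowPattern = /(?:"(?:[^"]|"")*"|[^,\n]*)(?:,(?:"(?:[^"]|"")*"|[^,\n]*))*\n/g;
        var matches = text.match(rowPattern);
        // Blank lines match too, so they are counted as (empty) rows.
        return matches ? matches.length : 0;
    }
    ```

    With this, countCsvRows('field1,"field\ntwo",field3\n') gives 1, and a file whose last row has no trailing newline is still counted in full.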
    

    If you are lucky enough to be dealing with a variant of CSV that complies with RFC4180's recommendation that there are no " characters in unquoted fields, you can make this a bit more readable. Split on newlines as before and count the number of " characters in each line. An odd count means a quoted field has been split across lines, so that line must be joined to the one before it; once every remaining line has an even count, each one is a complete row.

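    var lines= csv.split('\n');
    // Walk backwards, stopping before index 0, which has no previous line to join to.
    for (var i= lines.length; i-->1;) {
        // match() returns null for a line with no quotes, so fall back to an empty array.
        var quotes= (lines[i].match(/"/g) || []).length;
        // Odd count: rejoin with the previous line, restoring the newline that split() removed.
        if (quotes%2===1)
            lines.splice(i-1, 2, lines[i-1]+'\n'+lines[i]);
    }
    var rowsn= lines.length;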
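
    If only the count is needed, the same quote-parity idea can be applied in a single pass without building the joined lines. This is a sketch under the same assumption (no " characters in unquoted fields), and countRowsByQuoteParity is a hypothetical name:

    ```javascript
    function countRowsByQuoteParity(csv) {
        var rows = 0;
        var insideQuotes = false;   // true while an odd number of quotes has been seen
        var rowHasContent = false;  // true once the current row contains any character
        for (var i = 0; i < csv.length; i++) {
            var ch = csv.charAt(i);
            if (ch === '"') {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                insideQuotes = !insideQuotes;
                rowHasContent = true;
            } else if (ch === '\n' && !insideQuotes) {
                // A newline outside quotes ends a row.
                rows++;
                rowHasContent = false;
            } else if (ch !== '\r') {
                rowHasContent = true;
            }
        }
        // Count a final row that is not terminated by a newline.
        if (rowHasContent) rows++;
        return rows;
    }
    ```

    Unlike the split-based version, this does not count the empty string that follows a trailing newline as an extra row.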