How to reorder CSV columns in Apache NiFi


How can I reorder the columns of a CSV file in Apache NiFi?

Input - I have multiple files that contain the same columns, but in a different order in each file.

Output - Select some of the columns and store them in the same order for every file.


Solution

  • In my case, I know that every CSV file contains these columns, so I only need to reorder them. I use the QueryRecord processor (with a CSVReader as the Record Reader and a CSVRecordSetWriter as the Record Writer) to do that.

    For example, here are my CSV files:

    \\file1
    name, age, location, gender
    Jack, 40, TW, M
    Lisa, 30, CA, F 
    
    \\file2
    name, location, gender, age
    Mary, JP, F, 25
    Kate, DE, F, 23
    

    To reorder the columns to location,name,gender,age, I add a new property named reorder_data to QueryRecord, with the following value:

    SELECT location,name,gender,age FROM FLOWFILE

    The data in each flowfile then becomes:

    \\file1 - reordered
    location, name, gender, age
    TW, Jack, M, 40
    CA, Lisa, F, 30
    
    \\file2 - reordered
    location, name, gender, age
    JP, Mary, F, 25
    DE, Kate, F, 23
    

    QueryRecord routes the reordered data to the reorder_data relationship (named after the property) and the unmodified input to the original relationship, so I get both outputs, which is very convenient.

    You can also store the column order in a process group variable or a flowfile attribute, which makes it easier to maintain:

    //Group variable or attribute
    column_order   location,name,gender,age
    
    //Property in QueryRecord
    reorder_data   SELECT ${column_order} FROM FLOWFILE
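
If you want to check the expected result outside NiFi, here is a minimal Python sketch of the same idea: select columns by header name, so the column order of the input file does not matter. This is not how QueryRecord is implemented. The function name reorder_csv is made up for illustration, and string.Template is only used to imitate the simple ${column_order} substitution. Like the answer above, it assumes every file contains all the requested columns.

```python
import csv
import io
from string import Template

# Sample data copied from the two files in the answer above.
FILE1 = "name, age, location, gender\nJack, 40, TW, M\nLisa, 30, CA, F\n"
FILE2 = "name, location, gender, age\nMary, JP, F, 25\nKate, DE, F, 23\n"


def reorder_csv(text: str, column_order: str) -> str:
    """Return the CSV in `text` reduced to the named columns, in the given order.

    Hypothetical helper for illustration only; raises KeyError if a
    requested column is missing from the input.
    """
    columns = [name.strip() for name in column_order.split(",")]
    # skipinitialspace drops the blank that follows each comma in the samples.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for row in reader:
        # Look each value up by header name, not by position.
        writer.writerow([row[name].strip() for name in columns])
    return out.getvalue()


# Imitates the ${column_order} reference in the reorder_data property.
column_order = "location,name,gender,age"
query = Template("SELECT ${column_order} FROM FLOWFILE").substitute(
    column_order=column_order
)

if __name__ == "__main__":
    print(query)
    print(reorder_csv(FILE1, column_order))
    print(reorder_csv(FILE2, column_order))
```

Both sample files come out with the header location,name,gender,age, even though their input column orders differ.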