csvmiller

Perform VLOOKUP-like operation between two CSV files using Miller


I am using miller , to work with .csv files. There is an issue though that I dont manage how to get around. So I have 2 .csv files.

product-feeder.csv

name-manufacturer,id-product,name-product,id-manufacturer,id-supplier,image-posted
MANUFACTURER ONE,0,PRODUCT A,0,0,0
MANUFACTURER TWO,0,PRODUCT B,0,0,0
MANUFACTURER ONE,0,UNMATCHED PRODUCT ONE,0,0,0
UNKOWN MANUFACTURER,0,UNMATCHED PRODUCT TWO,0,0,0

and manufacturer-feeder.csv

id-manufacturer,name-manufacturer,link-rewrite,image-posted
114,MANUFACTURER ONE,manu-one,1
116,MANUFACTURER TWO,manu-two,1

I just want to change the values on the column of product-feeder.csv , ${id-manufacturer} , and place them according to what it matches on manufacturer-feeder.csv , the ${name-manufacturer} of product-feeder.csv that ${id-manufacturer} having as a result

name-manufacturer,id-product,name-product,id-manufacturer,id-supplier,image-posted
MANUFACTURER ONE,0,PRODUCT A,114,0,0
MANUFACTURER TWO,0,PRODUCT B,116,0,0
MANUFACTURER ONE,0,UNMATCHED PRODUCT ONE,114,0,0
UNKOWN MANUFACTURER,0,UNMATCHED PRODUCT TWO,0,0,0

I think in the Spreadsheet world this function would be called VLOOKUP Do you know how to tackle this with miller ?


Solution

  • A solution is to use the join verb

    Running

    mlr --csv join --ul \
      -j name-manufacturer \
      --rp "r_" \
      -f product-feeder.csv \
    then unsparsify \
    then put '${id-manufacturer}=${r_id-manufacturer}' \
    then cut -x -r -f "^r_.+" \
    manufacturer-feeder.csv
    
    +---------------------+------------+-----------------------+-----------------+-------------+--------------+
    | name-manufacturer   | id-product | name-product          | id-manufacturer | id-supplier | image-posted |
    +---------------------+------------+-----------------------+-----------------+-------------+--------------+
    | MANUFACTURER ONE    | 0          | PRODUCT A             | 114             | 0           | 0            |
    | MANUFACTURER ONE    | 0          | UNMATCHED PRODUCT ONE | 114             | 0           | 0            |
    | MANUFACTURER TWO    | 0          | PRODUCT B             | 116             | 0           | 0            |
    | UNKOWN MANUFACTURER | 0          | UNMATCHED PRODUCT TWO |                 | 0           | 0            |
    +---------------------+------------+-----------------------+-----------------+-------------+--------------+
    

    Explanation: