json linux shell cmd miller

Create and append a hash-ID key/value pair to each JSON object in a file using the Miller shell command


I'm looking to create a unique ID (hash ID) for each row in my JSON file, based on all the values in each JSON object.

I started from the command below, but I'm not sure whether I have to name the columns explicitly or whether there's a way to include all columns without listing them.

mlr --json put -S '$hash_id=$f_name."".$l_name."".$title' then reorder -e -f job file.json

Input (file.json):

{"f_nams":"Hana","title":"Mrs","l_name":"Smith"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry"}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe"}

Desired output:

{"f_nams":"Hana","title":"Mrs","l_name":"Smith","hash_id":"hash_value_based_on_all_columns"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry","hash_id":"hash_value_based_on_all_columns"}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe","hash_id":"hash_value_based_on_all_columns"}

Solution

  • Assuming the input file.json is formatted as shown:

    cat file.json
    {"f_nams":"Hana","title":"Mrs","l_name":"Smith"}
    {"f_nams":"Mike","title":"Mr","l_name":"Larry"}
    {"f_nams":"Jhon","title":"Mr","l_name":"Doe"}
    

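    One way is the following Perl one-liner. It captures the three values with a regex, concatenates them, and appends their Base64 encoding as hash_id. Note that Base64 is a reversible encoding rather than a hash, and that the regex depends on the exact keys and key order shown above, so it does not generalize to arbitrary columns: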

    perl -MMIME::Base64 -ne '
      # Capture the three values; this relies on the exact key order shown above
      if (/"f_nams":"(\w+)","title":"(\w+)","l_name":"(\w+)"/) {
        my $id = encode_base64("$1$2$3", "");  # "" suppresses the trailing newline
        s/\}\s*$/,"hash_id":"$id"}\n/;         # append hash_id before the closing brace
        print;
      }' file.json
    

    Produces:

    {"f_nams":"Hana","title":"Mrs","l_name":"Smith","hash_id":"SGFuYU1yc1NtaXRo"}
    {"f_nams":"Mike","title":"Mr","l_name":"Larry","hash_id":"TWlrZU1yTGFycnk="}
    {"f_nams":"Jhon","title":"Mr","l_name":"Doe","hash_id":"Smhvbk1yRG9l"}
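
If the columns are not known in advance, the same idea can be written without naming any of them. The following is a minimal Python sketch, not a Miller solution: it parses each line as JSON, hashes all values in their original order with SHA-256, and appends the digest as hash_id. The function name add_hash_id and the choice of SHA-256 over a JSON-array serialization of the values are assumptions made for illustration. The hash depends on the order of the keys in each record, so reordering the keys changes it.

```python
import hashlib
import json


def add_hash_id(line, key="hash_id"):
    """Return the JSON object in `line` with a SHA-256 digest of all its values appended."""
    record = json.loads(line)  # dicts preserve the key order of the input
    # Serialize the values as a JSON array so that ["ab", "c"] and ["a", "bc"]
    # produce different digests (plain concatenation would make them collide).
    payload = json.dumps(list(record.values()), separators=(",", ":"), ensure_ascii=False)
    record[key] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return json.dumps(record, separators=(",", ":"), ensure_ascii=False)


# Sample rows copied from file.json above
sample = [
    '{"f_nams":"Hana","title":"Mrs","l_name":"Smith"}',
    '{"f_nams":"Mike","title":"Mr","l_name":"Larry"}',
    '{"f_nams":"Jhon","title":"Mr","l_name":"Doe"}',
]
for row in sample:
    print(add_hash_id(row))
```

To process file.json itself, call add_hash_id on each non-empty line of the file and print the result.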