I'm looking to create a unique ID (hash ID) for each row in my JSON file, based on all the values in each JSON object.
I started from the command below, but I'm not sure whether I have to name the columns explicitly or whether there's a way to include all columns without naming them.
mlr --json put -S '$hash_id=$f_nams."".$l_name."".$title' then reorder -e -f job file.json
Input (file.json):
{"f_nams":"Hana","title":"Mrs","l_name":"Smith"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry"}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe"}
Desired output:
{"f_nams":"Hana","title":"Mrs","l_name":"Smith","hash_id":"hash_value_based_on_all_columns"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry","hash_id":"hash_value_based_on_all_columns"}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe","hash_id":"hash_value_based_on_all_columns"}
Assuming the input file.json is formatted as shown:
cat file.json
{"f_nams":"Hana","title":"Mrs","l_name":"Smith"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry"}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe"}
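Then one way to produce the desired output is the following Perl one-liner, which concatenates the three values of each record and Base64-encodes the result (an encoding of the values rather than a cryptographic hash):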
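perl -MMIME::Base64 -ne '
  # Capture the three values in the order they appear in each record
  /"f_nams":"(\w+)","title":"(\w+)","l_name":"(\w+)"/ && do {
    my $x = $1 . $2 . $3;
    # The empty second argument suppresses the trailing newline
    my $hashvalue = encode_base64($x, "");
    # Insert the new field just before the closing brace
    s/\}/,"hash_id":"$hashvalue"\}/;
    print;
  }' file.json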
Produces:
{"f_nams":"Hana","title":"Mrs","l_name":"Smith","hash_id":"SGFuYU1yc1NtaXRo"}
{"f_nams":"Mike","title":"Mr","l_name":"Larry","hash_id":"TWlrZU1yTGFycnk="}
{"f_nams":"Jhon","title":"Mr","l_name":"Doe","hash_id":"Smhvbk1yRG9l"}
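
The Perl one-liner above names each field in its regex. If the goal is an ID derived from every field without listing column names, a minimal Python sketch might look like the following. The function name add_hash_id, the use of SHA-256, and the decision to hash the key-sorted JSON serialization (so keys are included along with values, and key order does not affect the result) are all assumptions for illustration, not something stated in the question.

```python
import hashlib
import json


def add_hash_id(line: str) -> str:
    """Return the JSON record in `line` with a `hash_id` field appended.

    Hypothetical helper: the digest covers every key and value in the record,
    so no column names need to be listed.
    """
    record = json.loads(line)
    # Canonical form: sorted keys and compact separators, so two records with
    # the same content in a different key order get the same digest.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    # Assumption: SHA-256 is used as the hash; any hashlib algorithm would work.
    record["hash_id"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return json.dumps(record, separators=(",", ":"))


# Sample records copied from the question's file.json
sample_lines = [
    '{"f_nams":"Hana","title":"Mrs","l_name":"Smith"}',
    '{"f_nams":"Mike","title":"Mr","l_name":"Larry"}',
    '{"f_nams":"Jhon","title":"Mr","l_name":"Doe"}',
]

for sample in sample_lines:
    print(add_hash_id(sample))
```

To process the real file, the same function can be applied to each line read from file.json. Unlike plain concatenation, hashing the serialized record keeps field boundaries, so values such as "ab","c" and "a","bc" do not produce the same ID.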