I have a delimited file with hundreds of thousands of lines that must be migrated to a new system. One of the fields is required in the new system and is required to be unique, but is often missing in the data file. I would like to use awk to insert uuids for this purpose. My normal way of substituting a default value (presuming $2 is missing) is
awk 'BEGIN{FS="\t"}{OFS="\t"}!$2{$2="defaultvalue"}1' myfile
which works well for text literals, but I'd like to use the same mechanism to use command substitution within awk. If I start with the file:
field1 requiredfield field3
value1 networkId value3
value2 value4
the desired output would be:
field1 requiredfield field3
value1 networkId value3
value2 6b34b312-8169-4539-b52f-a075542fb063 value4
where I'm using uuidgen to generate the value. I can almost get things to work but not quite
awk 'BEGIN{FS="\t"}{OFS="\t"}!$2{$2=system("uuidgen")}1'
returns a zero along with with the uuidgen output
awk 'BEGIN{FS="\t"}{OFS="\t"}!$2{"uuidgen"|getline d}{$2=d}1'
and
awk -v uuid=$(uuidgen) 'BEGIN{FS="\t"}{OFS="\t"}!$2{$2=uuid}1'
don't have a problem with the zero, but they give the same uuid for all records. What am I missing?
Assuming your input file is like this:
cat -vte file
field1^Irequiredfield^Ifield3$
value1^InetworkId^Ivalue3$
value2^I ^Ivalue4$
value5^I ^Ivalue6$
value7^I ^Ivalue8$
You may try this awk
that runs given command for each record and used an output from shell if inner command fails:
cmd="uuidgen"
awk -v cmd="$cmd" -v uuid="$($cmd)" '
BEGIN {FS=OFS="\t"}
$2 ~ /^[[:blank:]]*$/ {
$2 = ((cmd | getline out) > 0 ? out : uuid)
close(cmd)
} 1' file | column -t
field1 requiredfield field3
value1 networkId value3
value2 857063FB-673B-487B-8C00-6DF01537DA22 value4
value5 63333D5E-2156-4855-B8BC-CBB7CEF9E9F4 value6
value7 8CD16A13-F0F5-42E6-9998-3AA064F9B4FC value8
column -t
has been used for tabular output.