I have a tab separated matrix (say filename).
If I do:
head -1 filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'
followed by:
head -2 filename | tail -1 | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'
I get an answer of 24 (same answer) for all rows basically.
But if I do it:
cat filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array)}'
I get:
24
25
25
25
25 ...
Why is it so?
Following is the inputfile:
Case1 17.49 0.643 0.366 11.892 0.85 5.125 0.589 0.192 0.222 0.231 27.434 0.228 0 0.111 0.568 0.736 0.125 0.038 0.218 0.253 0.055 0.019 0 0.078
Case2 0.944 2.412 4.296 0.329 0.399 1.625 0.196 0.038 0.381 0.208 0.045 1.253 0.382 0.111 0.324 0.268 0.458 0.352 0 1.423 0.887 0.444 5.882 0.543
Case3 21.266 14.952 24.406 10.977 8.511 21.75 6.68 0.613 12.433 1.48 1.441 21.648 6.972 42.931 8.029 4.883 11.912 6.248 4.949 26.882 9.756 5.366 38.655 12.723
Case4 0.888 0 0.594 0.549 0.105 0.125 0 0 0.571 0.116 0.019 1.177 0.573 0.111 0.081 0.401 0 0.05 0.073 0 0 0 0 0.543
Well, I found an answer to my own problem:
I wonder how I missed it, but nullifying the array at the end of each initiation is always critical for repeated usage of same array name (no matter which language/ script one uses).
correct awk was:
cat filename | awk -F "\t" '{i=0;med=0;for(i=2;i<=NF;i++) array[i]=$i;asort(array);print length(array);delete array}'