I need to compute the md5 of a table in KDB. However, md5 can only (as far as I know) be called on a string. So, how can one convert a table in KDB to a single string representation? It does not matter whether the result is human-readable, as long as it is deterministic.
tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
md5 tab / does not work
md5 string tab / does not work
Thank you very much for your help!
String-based outputs are truncated according to the console size, so they are not reliable for comparing the whole table.
Use -8! to serialize the table to bytes; you can then convert the bytes to characters and run md5 on the result:
https://code.kx.com/q/kb/serialization/
q)md5 `char$-8!([] a:1 2 3 )
0xeb5cc58551051573c696ea6f746c0dc8
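The same idea can be wrapped in a small helper and applied to the table from the question. This is only a sketch: hashTable is a hypothetical name, not a built-in, and the two checks simply illustrate that the hash is repeatable for identical input and sensitive to row order.

```q
/ hashTable is a hypothetical helper name, not part of q
/ it serializes x with -8!, casts the bytes to chars and hashes them
hashTable:{md5 `char$-8!x}

tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)

hashTable[tab]~hashTable tab          / 1b: same table, same hash
hashTable[tab]~hashTable reverse tab  / 0b: row order changes the serialized bytes
```

If two tables should compare equal regardless of row order, sort them on a stable key before hashing.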
One thing you still need to watch out for is attributes on the data, because they are part of the serialized bytes and therefore change the hash:
https://code.kx.com/q/ref/set-attribute/
q)md5 `char$-8!([] a:1 2 3 )
0xeb5cc58551051573c696ea6f746c0dc8
q)md5 `char$-8!([] a:`g#1 2 3 )
0xda110d1ca3127bffe0a76ea21574f91d
You can remove the attributes at the column level:
q)md5 `char$-8!{c:cols x;![x;();0b;c!{(#;enlist`;x)}each c]} ([] a:1 2 3 )
0xeb5cc58551051573c696ea6f746c0dc8
q)md5 `char$-8!{c:cols x;![x;();0b;c!{(#;enlist`;x)}each c]} ([] a:`g#1 2 3 )
0xeb5cc58551051573c696ea6f746c0dc8
Note: If a column contains nested data that itself carries attributes, you would need to extend the above.
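One possible way to extend the above to nested data is a recursive helper that clears attributes at every level before hashing. This is a rough sketch under assumptions: stripAttr is a hypothetical name, it only handles general lists, typed vectors, tables and dictionaries (including keyed tables), and anything else is passed through unchanged.

```q
/ stripAttr is a hypothetical helper, not part of q
stripAttr:{
  $[0h=t:type x; .z.s each x;          / general list: recurse into each item
    t within 1 76h; `#x;               / typed or enumerated vector: clear the attribute
    98h=t; flip .z.s flip x;           / table: process its column dictionary
    99h=t; (.z.s key x)!.z.s value x;  / dictionary or keyed table: process both sides
    x]}                                / atoms and anything else: unchanged

/ illustrative tables: t1 has attributes on a column and inside a nested column
t1:([] a:`g#1 2 3; b:(`s#1 2;`u#3 4;5 6))
t2:([] a:1 2 3; b:(1 2;3 4;5 6))

/ compare the hash of the stripped table with the attribute-free one
(md5 `char$-8!stripAttr t1)~md5 `char$-8!t2
```

Note that this copies the data, so for very large tables it may be cheaper to strip only the columns known to carry attributes.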
If your API enforces chunking, you could compare the md5 per chunk:
q){c:count x;i:max(1;ceiling c%y);w:y*til i;{[x;y] md5 `char$-8!select from x where i within y}[x] each flip (w;-1+ 1_ w,c)}[20001#t;10000]
0x528380f49e1aa4d5e6b0c4c1ead8924a
0xa02cf08e70a5eb0ad7d094cba44e9146
0x36d6252313d89920d1f50341f8738f8f
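A shorter variant of per-chunk hashing can be sketched with cut, which splits a table into pieces of a given row count. This is an alternative formulation, not the code above: chunkHashes is a hypothetical name, the table t here is purely illustrative, and the resulting hashes are not guaranteed to match those produced by the select-based version.

```q
/ chunkHashes is a hypothetical helper: x is a table, y is the chunk size in rows
chunkHashes:{{md5 `char$-8!x} each y cut x}

t:([] a:til 25; b:25#`x`y`z)   / illustrative table, not the t used in the answer
count chunkHashes[t;10]        / 3 chunks: rows 0-9, 10-19 and 20-24
```

Both sides of a comparison need to use the same chunk size and the same row order, otherwise the per-chunk hashes will not line up.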