Tags: string, md5, kdb

KDB q: how to convert a table into a string?


I need to compute md5 of a table in KDB. However, md5 can only (as far as I know) be called on a string. So, how can one convert a table in KDB to a single string representation? It does not matter if the result is human-readable as long as it is deterministic.

tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
md5 tab / does not work
md5 string tab / does not work

Thank you very much for your help!


Solution

  • String-based output, such as the console display, is truncated according to the console size (set with \c), so it is not reliable for comparing a whole table.

    Instead, use -8! to serialize the table to bytes, then cast the bytes to characters so that md5 can be applied:

    https://code.kx.com/q/kb/serialization/

    q)md5 `char$-8!([] a:1 2 3 )
    0xeb5cc58551051573c696ea6f746c0dc8
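
    Applied to the table from the question, the same idea can be wrapped in a small helper. This is only a minimal sketch: the name tableHash is hypothetical and not a q built-in.

    ```q
    / tab is the table from the question
    tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)

    / hypothetical helper: serialize with -8!, cast the bytes to chars, then hash
    tableHash:{md5 `char$-8!x}

    tableHash tab                    / a 16-byte md5 digest
    tableHash[tab]~tableHash tab     / 1b: the same table always gives the same digest
    / changed data gives a different digest (barring an md5 collision)
    tableHash[tab]~tableHash update sales:sales+1 from tab
    ```

    Because the digest is computed over the serialized bytes, it reflects everything -8! writes out, not only the visible values.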
    

    One thing still to watch out for is attributes on the data, which are part of the serialized bytes and therefore change the hash:

    https://code.kx.com/q/ref/set-attribute/

    q)md5 `char$-8!([] a:1 2 3 )
    0xeb5cc58551051573c696ea6f746c0dc8
    q)md5 `char$-8!([] a:`g#1 2 3 )
    0xda110d1ca3127bffe0a76ea21574f91d
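
    To check whether attributes could be the reason two hashes differ, the columns can be inspected before hashing. A small sketch, assuming a throwaway table t defined only for this illustration:

    ```q
    / t is a made-up example table with a grouped attribute on column a
    t:([] a:`g#1 2 3; b:`x`y`z)

    meta t                      / the a column of meta lists the attribute of each column
    attr t`a                    / `g
    attr t`b                    / ` (no attribute)

    / the values still match, but the serialized bytes do not
    (1 2 3)~`g#1 2 3            / 1b
    (-8!1 2 3)~-8!`g#1 2 3      / 0b
    ```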
    

    You can remove the attributes at the column level before hashing:

    q)md5 `char$-8!{c:cols x;![x;();0b;c!{(#;enlist`;x)}each c]} ([] a:1 2 3 )
    0xeb5cc58551051573c696ea6f746c0dc8
    q)md5 `char$-8!{c:cols x;![x;();0b;c!{(#;enlist`;x)}each c]} ([] a:`g#1 2 3 )
    0xeb5cc58551051573c696ea6f746c0dc8
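
    For repeated use, the lambda above can be given a name and combined with the hashing step. This is a sketch only: stripAttr and tableHashNoAttr are hypothetical names, and the body of stripAttr is the same functional update shown above.

    ```q
    / hypothetical helper: apply `# to every column to clear its attribute
    stripAttr:{c:cols x;![x;();0b;c!{(#;enlist`;x)}each c]}

    / hypothetical helper: hash the table after clearing column attributes
    tableHashNoAttr:{md5 `char$-8!stripAttr x}

    / tables that differ only in attributes now hash the same
    tableHashNoAttr[([] a:1 2 3)]~tableHashNoAttr([] a:`g#1 2 3)   / 1b
    ```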
    

    Note: If a column contains nested data that itself carries attributes, you would need to extend the above.

    If your API enforces chunking, you can instead compare the md5 of each chunk (here t is an existing table and the chunk size is 10000 rows):

    q){c:count x;i:max(1;ceiling c%y);w:y*til i;{[x;y] md5 `char$-8!select from x where i within y}[x] each flip (w;-1+ 1_ w,c)}[20001#t;10000]
    0x528380f49e1aa4d5e6b0c4c1ead8924a
    0xa02cf08e70a5eb0ad7d094cba44e9146
    0x36d6252313d89920d1f50341f8738f8f
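
    The one-liner above first computes inclusive row ranges and then hashes the rows in each range. The sketch below splits those two steps apart so the ranges can be inspected; chunkBounds and chunkHashes are hypothetical names, and the 25-row table is made up for the illustration.

    ```q
    / hypothetical helper: inclusive (start;end) row indices for c rows in chunks of n
    chunkBounds:{[c;n] w:n*til max(1;ceiling c%n); flip(w;-1+1_w,c)}

    chunkBounds[20001;10000]     / (0 9999;10000 19999;20000 20000)

    / hypothetical helper: one md5 digest per chunk of n rows
    chunkHashes:{[t;n] {[t;b] md5 `char$-8!select from t where i within b}[t] each chunkBounds[count t;n]}

    t:([] a:til 25)              / made-up example table
    count chunkHashes[t;10]      / 3: rows 0-9, 10-19 and 20-24
    ```

    Both sides of a comparison need to use the same chunk size, since the digests depend on where the chunk boundaries fall.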