kdb

KDB Return the count of distinct elements in a set of named columns


In KDB, I want to count the number of distinct elements in a set of columns.

The table for example would have columns a, b, c ... and by supplying a symbol list lst:`a`b`c I would be able to get the number of distinct elements in those columns.

I have done most of this work, I am just having trouble at the end creating the table after using "each" with my functioun and list.

So far I have

Made the function:

f:{[x]?[`table;();0b;(enlist x)!enlist(#:;(?:;x))]}

This was made by parsing "select distinct x from table".

I then want apply the function to each element in the symbol list using each, and take each answer and join them together using each right

table has columns a,b,c,d ... 
lst:`a`b`c
ans:(),/: f each lst

Currently this produces an output like this:

+(,`a)!,,983
+(,`b)!,,931
+(,`c)!,,5

The numbers are correct but I don't understand why the output looks like this.

If I run the function f with just one input the answer is:

f[`a]
a
----
983

I was expecting the function to produce many of these single column tables and then join them to one table, which isn't happening I don't understand why.


Solution

  • f can be tweaked to perform what you need on single or multiple columns:

    q)table:([] a:100?100;b:100?100;c:100?100)
    q)f:{[x]?[`table;();0b;((),x)!{(count;(distinct;x))} each (),x]}
    q)f[`a]
    a
    --
    65
    q)f[`a`b`c]
    a  b  c
    --------
    65 63 59
    

    Alternative format:

    q)f:{[x]?[`table;();0b;`cls`counts!(enlist (),x;(each;{count distinct x};enlist,x))]}
    q)f[`a]
    cls counts
    ----------
    a   65
    q)f[`a`b`c]
    cls counts
    ----------
    a   65
    b   63
    c   59