sql-server aggregate-functions sqlclr median user-defined-aggregate

Cannot Pass Null Value to Custom Aggregate


Afternoon,

I'm writing a custom median function (without looking at existing solutions; I like the challenge), and after lots of fiddling I'm most of the way there. However, I cannot pass in a column that contains a NULL value. I'm handling this in the C# code, but it seems to be stopped by SQL Server before it gets there.

This is the error I get:

Msg 6569, Level 16, State 1, Line 11 'Median' failed because parameter 1 is not allowed to be null.

C#:

 namespace SQLMedianAggregate
{
    [System.Serializable]
    [Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
   Microsoft.SqlServer.Server.Format.UserDefined,
   IsInvariantToDuplicates = false, // duplicates may change results
   IsInvariantToNulls = true,      // receiving a NULL is handled later in code 
   IsInvariantToOrder = true,       // is sorted later
   IsNullIfEmpty = true,            // if no values are given the result is null
        MaxByteSize = -1,
   Name = "Median"                 // name of the aggregate
)]

    public struct Median : IBinarySerialize
    {
        public double Result { get; private set; }

        public bool HasValue { get; private set; }

        public DataTable DT_Values { get; private set; } //only exists for merge essentially

        public static DataTable DT_Final { get; private set; } //Need a static version so its accesible within terminate

        public void Init()
        {
            Result = double.NaN;
            HasValue = false;
            DT_Values = new DataTable();
            DT_Values.Columns.Add("Values", typeof(double));
            DT_Final = new DataTable();
            DT_Final.Columns.Add("Values", typeof(double));
        }

        public void Accumulate(double number)
        {

            if (double.IsNaN(number))
            {
                //skip
            }
            else
            {
                //add to tables
                DataRow NR = DT_Values.NewRow();
                NR[0] = number;
                DT_Values.Rows.Add(NR);
                DataRow NR2 = DT_Final.NewRow();
                NR2[0] = number;
                DT_Final.Rows.Add(NR2);
                HasValue = true;
            }
        }

        public void Merge(Median group)
        {
            // Count the product only if the other group has values
            if (group.HasValue)
            {
                DT_Final.Merge(group.DT_Values);
                //DT_Final = DT_Values;
            }
        }

        public double Terminate()
        {
            if (DT_Final.Rows.Count == 0) //Just to handle roll up so it doesn't crash (doesnt actually work
            {
                DataRow DR = DT_Final.NewRow();
                DR[0] = 0;
                DT_Final.Rows.Add(DR);
            }
            //Sort Results
            DataView DV = DT_Final.DefaultView;
            DV.Sort = "Values asc";
            DataTable DTF = new DataTable();
            DTF = DV.ToTable();

            ////Calculate median and submit result
            double MiddleRow = (DT_Final.Rows.Count -1.0) / 2.0;
            if (MiddleRow % 2 != 0)
            {

                double upper =  (double)(DT_Final.Rows[Convert.ToInt32(Math.Ceiling(MiddleRow))]["Values"]);
                double lower =  (double)(DT_Final.Rows[Convert.ToInt32(Math.Floor(MiddleRow))]["Values"]);
                Result = lower + ((upper - lower) / 2);

            } else
            {
                Result = (double)(DT_Final.Rows[Convert.ToInt32(MiddleRow)]["Values"]);
            }
            return Result;
        }

        public void Read(BinaryReader SerializationReader)
        {
            //Needed to get this working for some reason
        }

        public void Write(BinaryWriter SerializationWriter)
        {
            //Needed to get this working for some reason
        }

    }
}

SQL:

DROP AGGREGATE dbo.Median
DROP ASSEMBLY MedianAggregate
CREATE ASSEMBLY MedianAggregate
AUTHORIZATION dbo
FROM 'C:\Users\#######\Documents\Visual Studio 2017\Projects\SQLMedianAggregate\SQLMedianAggregate\bin\Debug\SQLMedianAggregate.dll'
WITH PERMISSION_SET = UNSAFE;


CREATE AGGREGATE dbo.Median (@number FLOAT) RETURNS FLOAT
EXTERNAL NAME [MedianAggregate]."SQLMedianAggregate.Median";

Any ideas what setting or code I'm missing that would allow this? I pretty much just want it to ignore NULLs.

The SQL Server version is 2008 R2, by the way.


Solution

  • The problem is your datatype. You need to use the Sql* types for SQLCLR parameters, return values, and result set columns. In this case, you need to change:

    Accumulate(double number)
    

    into:

    Accumulate(SqlDouble number)
    

    Then, you access the double value using the Value property that all Sql* types have (i.e. number.Value in this case).

    And then, at the beginning of the Accumulate method, you need to check for NULL using the IsNull property:

    if (number.IsNull)
    {
      return;
    }
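
    Put together, the change might look like the minimal sketch below. The class name MedianAccumulatorSketch, the values field, and the Count property are hypothetical and exist only for illustration; the SqlUserDefinedAggregate attribute and the serialization methods are left out so the snippet compiles on its own.

    ```csharp
    using System.Collections.Generic;
    using System.Data.SqlTypes;

    // Hypothetical stand-in for the aggregate's state (illustration only).
    public class MedianAccumulatorSketch
    {
        // Instance-level storage for the non-NULL inputs seen so far.
        private readonly List<double> values = new List<double>();

        public int Count { get { return values.Count; } }

        // SqlDouble can represent SQL NULL, unlike the plain double parameter.
        public void Accumulate(SqlDouble number)
        {
            if (number.IsNull)
            {
                return; // skip NULL inputs entirely
            }

            // Value is only safe to read once IsNull has been ruled out.
            values.Add(number.Value);
        }
    }
    ```

    Passing SqlDouble.Null to this method leaves the list untouched, which is the "just ignore nulls" behavior asked for.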
    

    Also, for more information on using SQLCLR in general, please see the series I am writing on this topic on SQL Server Central: Stairway to SQLCLR (free registration is required to read content on that site, but it's worth it :-).

    And, since we are talking about median calculations here, please see the article I wrote (also on SQL Server Central) on the topic of UDAs and UDTs that uses Median as the example: Getting The Most Out of SQL Server 2005 UDTs and UDAs. Please keep in mind that the article was written for SQL Server 2005 which has a hard limit of 8000 bytes of memory for UDTs and UDAs. That limit was lifted in SQL Server 2008, so rather than using the compression technique shown in that article, you could simply set MaxByteSize in the SqlUserDefinedAggregate to -1 (as you are currently doing) or SqlMetaData.MaxSize (or something very close to that).

    Also, DataTable is a bit heavy-handed for this type of operation. All you need is a simple List<Double> :-).
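
    As a rough illustration of the List<Double> approach, the median step could be sketched as follows. MedianSketch and its Median method are hypothetical names, and returning null for an empty list is an assumption made here so that the caller could map it to SqlDouble.Null.

    ```csharp
    using System.Collections.Generic;

    // Hypothetical helper (illustration only): median of a list of doubles.
    public static class MedianSketch
    {
        // Returns null when there are no values.
        public static double? Median(List<double> values)
        {
            if (values.Count == 0)
            {
                return null;
            }

            // Sort a copy so the caller's list is left unchanged.
            List<double> sorted = new List<double>(values);
            sorted.Sort();

            int mid = sorted.Count / 2;
            if (sorted.Count % 2 == 1)
            {
                return sorted[mid]; // odd count: the single middle value
            }

            // Even count: midpoint of the two middle values.
            double lower = sorted[mid - 1];
            double upper = sorted[mid];
            return lower + (upper - lower) / 2.0;
        }
    }
    ```

    For example, the values 4, 1, 3, 2 sort to 1, 2, 3, 4, and the sketch returns 2.5.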


    Regarding the following line of code (broken into 2 lines here to prevent the need to scroll):

    public static DataTable DT_Final { get; private set; }
       //Need a static version so its accesible within terminate
    

    This is a huge misunderstanding of how UDAs and UDTs work. Please do NOT use static variables here. Static variables are shared across Sessions, hence your current approach is not thread-safe. So you would either get errors about it already being declared or various Sessions would alter the value unbeknownst to other Sessions, as they would all share the single instance of DT_Final. And the errors and/or odd behavior (i.e. erroneous results that you can't debug) might happen in a single session if a parallel plan is used.
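
    A minimal sketch of the alternative, assuming the values live in an instance-level list: each instance owns its data, and Merge copies the other instance's values into its own. The names MedianMergeSketch, Add, and Count are hypothetical.

    ```csharp
    using System.Collections.Generic;

    // Hypothetical illustration: per-instance state instead of a static table.
    public class MedianMergeSketch
    {
        // Instance field, so nothing is shared with other sessions or threads.
        private readonly List<double> values = new List<double>();

        public int Count { get { return values.Count; } }

        public void Add(double value)
        {
            values.Add(value);
        }

        // Combine a partial result into this instance.
        public void Merge(MedianMergeSketch other)
        {
            values.AddRange(other.values);
        }
    }
    ```

    With this shape the final calculation only needs to read the instance's own list, so no static copy is required.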

    UDTs and UDAs get serialized to a binary value stored in memory, and are then deserialized, which keeps their state intact. This is the reason for the Read and Write methods, and why you needed to have them in place. With empty method bodies, as in your code, no instance state survives that round trip.
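
    For a list of doubles, Read and Write could be sketched roughly as below: a count followed by each value. MedianStateSketch and its Values field are hypothetical; in the real aggregate these two methods would belong to the struct that implements IBinarySerialize.

    ```csharp
    using System.Collections.Generic;
    using System.IO;

    // Hypothetical illustration of the state round trip.
    public class MedianStateSketch
    {
        public List<double> Values = new List<double>();

        // Write the number of values, then each value in order.
        public void Write(BinaryWriter writer)
        {
            writer.Write(Values.Count);
            foreach (double value in Values)
            {
                writer.Write(value);
            }
        }

        // Rebuild the list in the same order it was written.
        public void Read(BinaryReader reader)
        {
            int count = reader.ReadInt32();
            Values = new List<double>(count);
            for (int i = 0; i < count; i++)
            {
                Values.Add(reader.ReadDouble());
            }
        }
    }
    ```

    Read must consume the fields in the same order and with the same types that Write produced them.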

    Again, you don't need (or want) DataTables here as they are over-complicating the operation and take up more memory than is ideal. Please see the article I linked above on UDAs and UDTs to see how the Median operation (and UDAs in general) should work.