I am using the following .NET code to iterate through all input columns to create a HashByte value of the columns to then determine if the row has been updated.
The script works fine until a column has a NULL value. How can I code around this without using a TRY CATCH block?
Imports System.Text
Imports System.Reflection
Imports System.Security.Cryptography
Private sha As SHA256 = New SHA256CryptoServiceProvider() 'used To create the SHA256 hash
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Try
' create hashbyte column to identify if row has changed
Dim concat As StringBuilder = New StringBuilder("")
' loop over all the input columns of the buffer row
For Each col As PropertyInfo In Row.GetType().GetProperties()
If Not col.Name.ToLower.EndsWith("isnull") AndAlso Not col.Name.ToLower.Equals("hash") Then
MsgBox(col.Name.ToLower)
If col.GetValue(Row, Nothing) Is Nothing Then
concat.Append("")
Else
concat.Append(Convert.ToString(col.GetValue(Row).ToString()))
End If
concat.Append("|")
End If
Next
Dim sHashBytes As String = concat.ToString().Remove(concat.ToString().Length - 1) ' remove the last character of the string
Dim bHashBytes As Byte() = sha.ComputeHash(System.Text.UnicodeEncoding.Unicode.GetBytes(sHashBytes))
Dim sb As StringBuilder = New StringBuilder("")
For i As Integer = 0 To bHashBytes.Length - 1
sb.Append(bHashBytes(i).ToString("X2"))
Next
Row.Hash = "0x" & sb.ToString()
Catch ex As Exception
MsgBox(ex.ToString)
End Try
End Sub
Thanks, Greg
Well, assume that all input columns are of type String. I post a fragment of code which concatenates all string columns into a string (in C#) for omitted calculations.
public override void ProcessInput(int InputID, PipelineBuffer Buffer)
{
bool fireAgain = true;
string columnData = null;
while (Buffer.NextRow())
{
for (int columnIndex = 0;
columnIndex < Buffer.ColumnCount;
columnIndex++)
{
if (!Buffer.IsNull(columnIndex))
{
BufferColumn columnInfo = Buffer.GetColumnInfo(columnIndex);
switch (columnInfo.DataType)
{
// check column data type
case DataType.DT_WSTR:
columnData += Buffer.GetString(columnIndex);
break;
// add code to support more data types here
default:
columnData = "";
break;
}
}
}
}
// Do some calculations...
base.ProcessInput(InputID, Buffer);
}
Reason why calling ProcessInput
and not Input0_ProcessInputRow
- in the latter you have all columns with column named properties, so cannot loop over them. This method is called per each buffer, so mode data (and buffers) you will have, the more calls you will receive.
Calling base.ProcessInput(InputID, Buffer)
is essential for correct buffer handling, marking it processed and being available to further data flow transformations.
Inspired by Todd McDermid's Blog post on this.