I am trying to, using SSIS, obtain a table to get the percentage of missing values of every field in every table of a SQL Server database.
Ideally I would like to create a new table in another database with 4 fields
Table / Field / Type / PctMissingValues
with one row for each field of a table. Type would be the field type
My idea was to use foreach loop containers to loop through tables and fields, with inside the container a Data flow task consisting of OLE DB Source > Aggregate > OLE DB Destination but I can't figure out how to do this, I am new to SSIS and ETLs in general
The following SQL query generates one query per column in a database that counts total rows and rows where the value is NULL.
You can load this in to a variable and loop through it in SSIS running the statement in each row one at a time and logging the results form that query out to another table.
SELECT
OBJECT_SCHEMA_NAME(C.object_id) AS TableSchema
,OBJECT_NAME(C.object_id) AS TableName
,C.name AS ColumnName
,'SELECT COUNT(*) AS TotalRows, COUNT(IIF([' +C.name+ '] IS NULL,1,NULL)) AS NullRows
FROM [' + OBJECT_SCHEMA_NAME(C.object_id) + '].[' + OBJECT_NAME(C.object_id) + ']' AS CountQuery
FROM sys.columns AS C
INNER JOIN sys.tables AS T
ON C.object_id = T.object_id