I have a query like this:
SELECT DISTINCT 'EntityName' AS [Column], EntityName AS [Value]
FROM dbo.Deals
WHERE EntityName IS NOT NULL
UNION
SELECT DISTINCT 'AssetClass' AS [Column], AssetClass AS [Value]
FROM dbo.Deals
WHERE AssetClass IS NOT NULL
UNION
SELECT DISTINCT 'VehicleType' AS [Column], VehicleType AS [Value]
FROM dbo.Deals
WHERE VehicleType IS NOT NULL
UNION
SELECT DISTINCT 'Strategy' AS [Column], Strategy AS [Value]
FROM dbo.Deals
WHERE Strategy IS NOT NULL
UNION
SELECT DISTINCT 'Bundled' AS [Column], Bundled AS [Value]
FROM dbo.Deals
WHERE Bundled IS NOT NULL
UNION
SELECT DISTINCT 'Geography' AS [Column], Geography AS [Value]
FROM dbo.Deals
WHERE Geography IS NOT NULL
UNION
SELECT DISTINCT 'Sector' AS [Column], Sector AS [Value]
FROM dbo.Deals
WHERE Sector IS NOT NULL
If I were to write something similar in a different programming language, I would create a list of columns and then iterate through that list.
Is there a way to refactor this SQL code to reduce the amount of duplication?
In SQL Server, here is one way to do it with cross apply
:
select distinct v.col, v.val
from dbo.Deals
cross apply ( values
( 'EntityName', EntityName),
('AssetClass', AssetClass),
('VehicleType', VehicleType),
('Strategy', Strategy),
('Bundled', Bundled),
('Geography', Geography),
('Sector', Sector)
) v(col, val)
where v.val is not null
This is more efficient that multiple union
s, as it scans the table only once, then unpivots - and also much easier to maintain (you just modify the values()
row constructor).
Note that, for this to work, all unpivoted columns must be of the same datatype (otherwise, some casting is required) ; this constraint equally applies to your original query.