In a table I have a column Building Height
which contains the heights of buildings. Annoyingly the values are entered like this e.g. '18m', '8m' or '8.1m'.
Ideally, it would have the 'm' in the column name and numeric values in the rows but that's not the case. I do not have admin rights to the table in SQL Server.
I need to perform a data quality rule where I need to perform greater than or less than operators on the Building Height
column.
How do I overcome this obstacle?
Example of the table:
Building Height | No Storeys |
---|---|
8.1m | 3 |
13m | 5 |
7m | 6 |
9.0m | 9 |
Data quality rule to perform:
I have to find rows that do not fit criteria. Obviously the obstacle is the Building Height
column containing m
next to the number.
I've looked into substring
and string_split
, but as I do not have admin rights to alter the table I can't really change much, can only query.
2 steps:
m
SELECT
REPLACE([Building Height],'m','') step1,
CAST(REPLACE([Building Height],'m','') as DECIMAL(8,2)) step2,
[No Storeys]
FROM mytable;
see: DBFIDDLE
Using STRING_SPLIT(), you can do:
SELECT
m.*,
CAST(value as DECIMAL(8,3)) as [Height in meters]
FROM mytable m
CROSS APPLY STRING_SPLIT([Building Height],'m')
WHERE value<>'';
see: DBFIDDLE