I have the following string:
'Siemens','Simatic','Microbox','PC','27','6ES7677AA200PA0','6ES7','677AA200PA0'
I want to remove any "terms" that are less than 5 characters. So in this case I'd like to remove 'PC', '27' and '6ES7'.
Which would result in:
'Siemens','Simatic','Microbox','6ES7677AA200PA0','677AA200PA0'
This is in SQL server and I have a function that accepts a regex command, so far it looks like this:
SELECT dbo.fn_StripCharacters(title, '/^''PC''$/')
I tried to hardcode to remove 'PC' but I think its removing all apostrophes, and 'P' and 'C' characters:
Siemens,Simati,Mirobox,,427B,6ES76477AA200A0,6ES7,6477AA200A0
This is the function I'm using:
CREATE FUNCTION [dbo].[fn_StripCharacters]
(
@String NVARCHAR(MAX),
@MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
SET @MatchExpression = '%['+@MatchExpression+']%'
WHILE PatIndex(@MatchExpression, @String) > 0
SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
RETURN @String
END
If you don't care about the particular order of the words which are retained after filtering off words 4 characters or less, you could use STRING_SPLIT
and STRING_AGG
:
WITH cte AS (
SELECT id, value
FROM yourTable
CROSS APPLY STRING_SPLIT(val, ',')
)
SELECT id, STRING_AGG(value, ',') AS val
FROM cte
WHERE LEN(value) > 6
GROUP BY id;