sqlsql-servert-sqlpatindex

SQL Patindex / Regex - Match where there are 4 or less characters between 2 apostrophes


I have the following string:

'Siemens','Simatic','Microbox','PC','27','6ES7677AA200PA0','6ES7','677AA200PA0'

I want to remove any "terms" that are less than 5 characters. So in this case I'd like to remove 'PC', '27' and '6ES7'.

Which would result in:

'Siemens','Simatic','Microbox','6ES7677AA200PA0','677AA200PA0'

This is in SQL server and I have a function that accepts a regex command, so far it looks like this:

SELECT dbo.fn_StripCharacters(title, '/^''PC''$/')

I tried to hardcode to remove 'PC' but I think its removing all apostrophes, and 'P' and 'C' characters:

Siemens,Simati,Mirobox,,427B,6ES76477AA200A0,6ES7,6477AA200A0

This is the function I'm using:

CREATE FUNCTION [dbo].[fn_StripCharacters]
(
    @String NVARCHAR(MAX), 
    @MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    SET @MatchExpression =  '%['+@MatchExpression+']%'
    
    WHILE PatIndex(@MatchExpression, @String) > 0
        SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
    
    RETURN @String
    
END

Solution

  • If you don't care about the particular order of the words which are retained after filtering off words 4 characters or less, you could use STRING_SPLIT and STRING_AGG:

    WITH cte AS (
        SELECT id, value
        FROM yourTable
        CROSS APPLY STRING_SPLIT(val, ',')
    )
    
    SELECT id, STRING_AGG(value, ',') AS val
    FROM cte
    WHERE LEN(value) > 6
    GROUP BY id;
    

    Demo