I have a column in a SQL Server table that has strings of varying lengths. I need to find the position of the first occurrence of the string , --
that's not enclosed in single quotes or square brackets.
For example, in the following two strings, I've bolded the portion I would like to get the position of. Notice in the first string, the first time , --
appears on its own (without being between single quote or square bracket delimiters) is at position 13 and in the second string, it's at position 16.
'a, --'[, --]**, --**[, --]
[a, --b]aaaaaaa_ **, --**', --'
Also I should mention that , --
itself could appear multiple times in the string.
Here's a simple query that shows the strings and my desired output.
SELECT
t.string, t.desired_pos
FROM
(VALUES (N'''a, --''[, --], --[, --]', 14),
(N'[a, —-b]aaaaaaa_ , --'', --''', 18)) t(string, desired_pos)
Is there any way to accomplish this using a SELECT query (or multiple) without using a function?
Thank you in advance!
I've tried variations of SUBSTRING, CHARINDEX, and even some CROSS APPLYs but I can't seem to get the result I'm looking for.
Before I write down my solution, I must warn you: DON'T USE IT. Use a function, or do this in some other language. This code is probably buggy. It doesn't handle stuff like escaped quotes etc etc.
The idea is to first remove the stuff inside brackets [] and quotes '' and then just do a "simple" charindex
.
To remove the brackets, I'm using a recursive CTE that loops ever part of matching quotes and replaces their content with placeholder strings.
One important point is that quotes might be embedded in each other, so you have to try both variants and chose the one that is earliest.
WITH CTE AS (
SELECT *
FROM
(VALUES (N'''a, --''[, --], --[, --]', 14),
(N'[a, —-b]aaaaaaa_ , --'', --''', 18)) t(string, desired_pos)
)
, cte2 AS (
select x.start
, x.finish
, case when x.start > 0 THEN STUFF(string, x.start, x.finish - x.start + 1, REPLICATE('a', x.finish - x.start + 1)) ELSE string END AS newString
, 1 as level
, string as orig
, desired_pos
from cte
CROSS APPLY (
SELECT *
, ROW_NUMBER() OVER(ORDER BY case when start > 0 THEN 0 ELSE 1 END, start) AS sortorder
FROM (
SELECT charindex('[', string) AS start
, charindex(']', string) AS finish
UNION ALL
SELECT charindex('''', string) AS startQ
, charindex('''', string, charindex('''', string) + 1) AS finishQ
) x
) x
WHERE x.sortorder = 1
UNION ALL
select x.start
, x.finish
, STUFF(newString, x.start, x.finish - x.start + 1, REPLICATE('a', x.finish - x.start + 1))
, 1 as level
, orig
, desired_pos
from cte2
CROSS APPLY (
SELECT *
, ROW_NUMBER() OVER(ORDER BY case when start > 0 THEN 0 ELSE 1 END, start) AS sortorder
FROM (
SELECT charindex('[', newString) AS start
, charindex(']', newString) AS finish
UNION ALL
SELECT charindex('''', newString) AS startQ
, charindex('''', newString, charindex('''', newString) + 1) AS finishQ
) x
) x
WHERE x.sortorder = 1
AND x.start > 0
AND cte2.start > 0 -- Must have been a match
)
SELECT PATINDEX('%, --%', newString), *
from (
select *, row_number() over(partition by orig order by level desc) AS sort
from cte2
) x
where x.sort = 1