I need to remove duplicate values from a string, but the remaining words must keep their original sequence, e.g.
@string = "I need to remove duplicate duplicate values in a a string"
must be converted to
@string = "I need to remove duplicate values in a string"
The question "How to remove duplicates from a string in SQL" was inspirational, but its solution returns the words sorted, and that is not what I wanted.
select Ordinal, Value from dbo.string_split(@string, ' ',1)
Gives me
Ordinal | Value |
---|---|
1 | I |
2 | need |
3 | to |
4 | remove |
5 | duplicate |
6 | duplicate |
7 | values |
8 | in |
9 | a |
10 | a |
11 | string |
Using a "group by" or a "distinct" messes with the sequence.
How can I fix this? Thanks for your time!
Now that you have split the string, you have a perfect entry point for a windowed comparison: specifically, the lag() window function gives each row access to its predecessor. You can then simply compare the current word with its predecessor to flag it as a duplicate. Finally, aggregate the string back, keeping only the words that are not flagged.
with split as
(
select
Ordinal, Value,
case when Value = lag(Value) over (order by Ordinal) then 1 end as dup
from string_split(@string, ' ', 1)
)
select string_agg(Value, ' ') within group (order by Ordinal) from split where dup is null;
(see it in a fiddle)
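Note that the lag() comparison only drops a word when it repeats the word immediately before it. If a word can also come back later in the string and should still be removed, one possible variant (a sketch, not part of the query above) is to number the occurrences of each word with row_number() and keep only the first one. The declare line, the occurrence and deduplicated aliases, and the sample string are assumptions added for illustration; the three-argument string_split is assumed to be available (SQL Server 2022 or Azure SQL), and whether "A" and "a" count as the same word depends on the collation in use.

```sql
-- Sample input, declared here so the script runs on its own.
declare @string nvarchar(max) = N'I need to remove duplicate duplicate values in a a string';

with split as
(
    select
        Ordinal, Value,
        -- number each word's occurrences in order of appearance
        row_number() over (partition by Value order by Ordinal) as occurrence
    from string_split(@string, ' ', 1)
)
-- keep only the first occurrence of every word, in the original order
select string_agg(Value, ' ') within group (order by Ordinal) as deduplicated
from split
where occurrence = 1;
```

For the sample string, both approaches give the same result because every repeat is adjacent. They differ only when a word reappears further along: lag() keeps the later occurrence, while row_number() removes it.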