t-sqluniqueidentifierbulk-load

Set repeating IDs till first record repeats (bulk load csv file)


I have a file that I imported via bulk-insert and I want to assign group IDs/sequences. I would like to assign the IDs till the first record with the first character is repeated. In this example its "A".

The challenge I have is how to achieve this example and set the IDs like this example:

ID data
1 A000abcefd
1 E00asoaskdaok
1 C000dasdasok
2 A100abcasds
2 E100aandas
2 C100adsokdas

Solution

  • Here is one way to do it, but given the limited info you provided I will make the following assumptions:

    **The data in your table has some order to it. This obviously will not work if that is not the case. I used an ID, you use what you have.

    **The first row in the table has the character you are looking for.

    CREATE TABLE #tmp(ID int,   [data] varchar(20))
    INSERT INTO #tmp
    VALUES
    (1, 'A000abcefd'),
    (2, 'E00asoaskdaok'),
    (3, 'C000dasdasok'),
    (4, 'A100abcasds'),
    (5, 'E100aandas'),
    (6, 'C100adsokdas')
    
    
    DECLARE @CHAR varchar(1)
    
    SELECT @CHAR = (SELECT TOP 1 SUBSTRING([data],1,1) FROM #tmp ORDER BY ID)
    
    SELECT SUM(CASE WHEN SUBSTRING([data],1,1) = @CHAR THEN 1 ELSE 0 END) 
                        OVER(ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) SeqNum
    ,[data]
    FROM #tmp