sqlsql-servercharindex

Query to find position in a string before keyword and extract data before that position


Our data is in a SQL Server 2022 (v16.0.1000.6) database. I'm using SSMS v20.2.

When a person makes a change to a record, that change is captured in a log table as a string column called LogText. Each time a change is made, the change is appended to this column. The date of the change is captured in a different column.

My problem is trying to parse out the most current change which would be at the end of the string in the LogText column. The following is an example of an inventory record for a part. The DemandQty and OnHandQty have been changed numerous times with the most current changes at the end of the string. I'm thinking that I'd need to start at the end of the string, find the first occurrence of DemandQty and extract the value after the -> characters. Same with OnHandQty.

My question is twofold:

  1. How do I find the position of -> that comes before DemandQty or OnHandQty in the reversed string?
  2. How do I extract the integer value before the arrow in the reversed string?

Example string:

12:01:07  OnHandQty: 10000.00000000 -> 9743.00000000  12:01:07  OnHandQty: 9743.00000000 -> 10000.00000000  15:24:25  DemandQty: 0 -> 257.00000000  15:31:09  OnHandQty: 10000.00000000 -> 9743.00000000  15:31:09  DemandQty: 257.00000000 -> 127.00000000

Expected result: DemandQty = 127, OnHandQty = 9743

Actual result: NULL

This is what I've come up with for the DemandQty, however the @arrowPos variable is returning the position of <- (in the reversed string) AFTER the first DemandQty rather than before.

DECLARE @inputString NVARCHAR(MAX) = '12:01:07  OnHandQty: 10000.00000000 -> 9743.00000000  12:01:07  OnHandQty: 9743.00000000 -> 10000.00000000  15:24:25  DemandQty: 0 -> 257.00000000  15:31:09  OnHandQty: 10000.00000000 -> 9743.00000000  15:31:09  DemandQty: 257.00000000 -> 127.00000000'
 
-- Reverse the input string to find the last occurrence of 'DemandQty' from the end.
DECLARE @reversedString NVARCHAR(MAX) = REVERSE(@inputString);
 
-- Find the position of the first 'DemandQty' in the reversed string.
DECLARE @demandQtyPos INT = CHARINDEX('ytqdnamed', @reversedString) - 1;  -- 'ytqdnamed' is 'DemandQty' reversed.
 
-- Find the position of the '>-' symbol that comes before the first 'DemandQty' in the reversed string.
DECLARE @arrowPos INT = CHARINDEX('>-', @reversedString, @demandQtyPos);
 
-- Extract the value after '>-' and the space.
DECLARE @result NVARCHAR(50) = 
    CASE 
        WHEN @arrowPos > 0 THEN
            -- Extract the substring after '>-' (ignoring the space), and reverse it back to get the correct order.
            REVERSE(SUBSTRING(@reversedString, @arrowPos + 2, CHARINDEX(' ', @reversedString + ' ', @arrowPos + 2) - @arrowPos - 2))
        ELSE 
            NULL
    END;

------Convert the extracted value to an integer.
SELECT CAST(@result AS INT) AS ExtractedIntegerValue; 

Solution

  • I would, honestly, make best attempts to normalise your data, and then work with that data. This makes best attempts to do so, and works for the data you have. Most likely you then want a Top 1 in each group in that result set.

    DECLARE @YourString varchar(8000) = '12:01:07  OnHandQty: 10000.00000000 -> 9743.00000000  12:01:07  OnHandQty: 9743.00000000 -> 10000.00000000  15:24:25  DemandQty: 0 -> 257.00000000  15:31:09  OnHandQty: 10000.00000000 -> 9743.00000000  15:31:09  DemandQty: 257.00000000 -> 127.00000000';
    
    WITH Split AS(
        SELECT TRIM(SS.value) AS value,
               SS.ordinal,
               (ss.ordinal+1) / 2 AS item
        FROM STRING_SPLIT(REPLACE(@YourString,'  ','|'),'|',1) SS),
    Pivoted AS(
        SELECT S.item,
               MAX(CASE WHEN S.value LIKE '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]' THEN S.value END) AS EventTime,
               MAX(CASE WHEN S.value NOT LIKE '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]' THEN S.value END) AS EventValue
        FROM Split S
        GROUP BY S.item) --Will also need to group on a unique value if this is coming from a table, not a variable
    SELECT P.EventTime,
           V.EventName,
           MAX(CASE SS.ordinal WHEN 1 THEN TRIM(SS.value) END) AS FromValue,
           MAX(CASE SS.ordinal WHEN 2 THEN TRIM(SS.value) END) AS ToValue
    FROM Pivoted P  
         CROSS APPLY (VALUES(NULLIF(CHARINDEX(':',P.EventValue),0)))CI(I)
         CROSS APPLY (VALUES(LEFT(P.EventValue,CI.I-1),TRIM(STUFF(P.EventValue,1,CI.I,''))))V(EventName,EventValue)
         CROSS APPLY STRING_SPLIT(REPLACE(V.EventValue,'->','|'),'|',1) SS
    GROUP BY P.EventTime,
             P.item,
             V.EventName;--Will also need to group on a unique value if this is coming from a table, not a variable
    

    And, as mentioned, if you needed to do a get the "last" values:

    WITH Split AS(
        SELECT TRIM(SS.value) AS value,
               SS.ordinal,
               (ss.ordinal+1) / 2 AS item
        FROM STRING_SPLIT(REPLACE(@YourString,'  ','|'),'|',1) SS),
    Pivoted AS(
        SELECT S.item,
               MAX(CASE WHEN S.value LIKE '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]' THEN S.value END) AS EventTime,
               MAX(CASE WHEN S.value NOT LIKE '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]' THEN S.value END) AS EventValue
        FROM Split S
        GROUP BY S.item), --Will also need to group on a unique value if this is coming from a table, not a variable
    RNs AS(
        SELECT P.EventTime,
               V.EventName,
               MAX(CASE SS.ordinal WHEN 1 THEN TRIM(SS.value) END) AS FromValue,
               MAX(CASE SS.ordinal WHEN 2 THEN TRIM(SS.value) END) AS ToValue,
               ROW_NUMBER() OVER (PARTITION BY V.EventName ORDER BY P.EventTime DESC, P.item DESC) AS RN --May need to aklter PARTITION BY is using a column
        FROM Pivoted P  
             CROSS APPLY (VALUES(NULLIF(CHARINDEX(':',P.EventValue),0)))CI(I)
             CROSS APPLY (VALUES(LEFT(P.EventValue,CI.I-1),TRIM(STUFF(P.EventValue,1,CI.I,''))))V(EventName,EventValue)
             CROSS APPLY STRING_SPLIT(REPLACE(V.EventValue,'->','|'),'|',1) SS
        GROUP BY P.EventTime,
                 P.item,
                 V.EventName) --Will also need to group on a unique value if this is coming from a table, not a variable
    SELECT R.EventName,
           R.ToValue
    FROM RNs R
    WHERE R.RN = 1;
    

    But, of course, as I stated in the comments, data like you have here should never be getting to the database. Fix the design; don't store your data like this. As you have found, working with it is a pain... As a temporary solution, you could use a VIEW, but the long term goal should be fixing things. Future you will thank you.