sqlpentaho-spoonpentaho-data-integration

Interact within range of values - Pentaho Kettle


I have a table where it contains minimum and maximum account values ​​to organize within a classification, I want to interact with this value to generate the accounts within the range.

ORDEM ABC MIN_CONTA ABC MAX_CONTA
102 0000132000 0000132012
102 0000132014 0000132021
102 0000132030 0000132031
102 0000132050 0000132051
102 0000132200 0000132202
103 0000132101 0000132112
103 0000132115 0000132115
103 0000132118 0000132119
103 0000132130 0000132130
104 0000134000 0000134009

It manages to make the change I sought through a script in SQL Server, but I cannot execute it in Pentaho. Follow the script:

DECLARE @COL_MIN INT;
DECLARE @COL_MAX INT;

SELECT @COL_MIN = MIN(CAST(MIN_CONTA AS INT)), @COL_MAX = MAX(CAST(MAX_CONTA AS INT))
FROM EDW.STG.STG_DRE_CONTA_RAZAO_FAGL_011ZC_SAP;

IF OBJECT_ID('TEMPDB..#DUMMY_NUMBER') IS NOT NULL
BEGIN
DROP TABLE #DUMMY_NUMBER
END

CREATE TABLE #DUMMY_NUMBER (NUM INT);

WHILE @COL_MIN <= @COL_MAX
BEGIN
INSERT INTO #DUMMY_NUMBER VALUES (@COL_MIN);
SET @COL_MIN = @COL_MIN+1;
END;

IF OBJECT_ID('EDW.STG.STG_DRE_CONTA_RAZAO_FINAL', 'U') IS NOT NULL
BEGIN
    DROP TABLE EDW.STG.STG_DRE_CONTA_RAZAO_FINAL;
END

SELECT * INTO EDW.STG.STG_DRE_CONTA_RAZAO_FINAL
FROM #DUMMY_NUMBER T2 
INNER JOIN EDW.STG.STG_DRE_CONTA_RAZAO_FAGL_011ZC_SAP T1 ON T2.NUM BETWEEN CAST(T1.MIN_CONTA AS INT) AND CAST(T1.MAX_CONTA AS INT);

DROP TABLE #DUMMY_NUMBER;
123 NUM ABC ORDEM ABC MIN_CONTA ABC MAX_CONTA
111000 90 0000111000 0000111100
111001 90 0000111000 0000111100
111002 90 0000111000 0000111100
111003 90 0000111000 0000111100
111004 90 0000111000 0000111100
111005 90 0000111000 0000111100
111006 90 0000111000 0000111100
111007 90 0000111000 0000111100
111008 90 0000111000 0000111100
111009 90 0000111000 0000111100
111010 90 0000111000 0000111100
111011 90 0000111000 0000111100
111012 90 0000111000 0000111100
111013 90 0000111000 0000111100
111014 90 0000111000 0000111100
111015 90 0000111000 0000111100
111016 90 0000111000 0000111100
111017 90 0000111000 0000111100
111018 90 0000111000 0000111100
111019 90 0000111000 0000111100
2023/03/06 08:53:49 - EDW - New database connection defined
2023/03/06 08:53:49 - Execute SQL script 2.0 - Connected to database... 2023/03/06 08:53:49 - EDW - launch DDL statement:  
2023/03/06 08:53:49 - EDW - DECLARE @COL_MIN INT 
2023/03/06 08:53:49 - EDW - launch DDL statement:  
2023/03/06 08:53:49 - EDW - DECLARE @COL_MAX INT 
2023/03/06 08:53:49 - EDW - launch SELECT statement:  
2023/03/06 08:53:49 - EDW - SELECT @COL_MIN = MIN(CAST(MIN_CONTA AS INT)), @COL_MAX = MAX(CAST(MAX_CONTA AS INT)) 
2023/03/06 08:53:49 - EDW - FROM EDW.STG.STG_DRE_CONTA_RAZAO_FAGL_011ZC_SAP 2023/03/06 08:53:49 - Execute SQL script 2.0 - ERROR (version 8.0.0.0-28, build 8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : An error occurred, processing will be stopped:  
2023/03/06 08:53:49 - Execute SQL script 2.0 - An error occurred executing SQL:  
2023/03/06 08:53:49 - Execute SQL script 2.0 - SELECT @COL_MIN = MIN(CAST(MIN_CONTA AS INT)), @COL_MAX = MAX(CAST(MAX_CONTA AS INT)) 
2023/03/06 08:53:49 - Execute SQL script
2.0 - FROM EDW.STG.STG_DRE_CONTA_RAZAO_FAGL_011ZC_SAP 
2023/03/06 08:53:49 - Execute SQL script 2.0 - Must declare the scalar variable "@COL_MIN". 
2023/03/06 08:53:49 - EDW - Statement canceled! 
2023/03/06 08:53:49 - EDW - Statement canceled! 
2023/03/06 08:53:49 - Execute SQL script 2.0 - ERROR (version 8.0.0.0-28, build 8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : Error initializing step [Execute SQL script 2] 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - Step [Script.0] initialized flawlessly. 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - ERROR (version 8.0.0.0-28, build
8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : Step [Execute SQL script 2.0] failed to initialize! 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - Step [Dummy (do nothing).0] initialized flawlessly. 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - Step [Join Rows (cartesian product).0] initialized flawlessly. 
2023/03/06 08:53:49 - Execute SQL script 2.0 - Finished reading query, closing connection. 
2023/03/06 08:53:49 - EDW - Connection to database closed! 
2023/03/06 08:53:49 - Spoon - ERROR (version 8.0.0.0-28, build
8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : TRF_DIM_DRE_CONTA_RAZAO_SAP: preparing transformation execution failed 
2023/03/06 08:53:49 - Spoon - ERROR (version 8.0.0.0-28, build
8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : org.pentaho.di.core.exception.KettleException:  
2023/03/06 08:53:49 - Spoon - We failed to initialize at least one step.  Execution can not begin! 
2023/03/06 08:53:49 - Spoon -  
2023/03/06 08:53:49 - Spoon -  
2023/03/06 08:53:49 - Spoon -   at org.pentaho.di.trans.Trans.prepareExecution(Trans.java:1163) 
2023/03/06 08:53:49 - Spoon -   at org.pentaho.di.ui.spoon.trans.TransGraph$29.run(TransGraph.java:4202)
2023/03/06 08:53:49 - Spoon -   at java.lang.Thread.run(Thread.java:748) 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - ERROR (version 8.0.0.0-28, build
8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : Errors detected! 
2023/03/06 08:53:49 - TRF_DIM_DRE_CONTA_RAZAO_SAP - ERROR (version
8.0.0.0-28, build 8.0.0.0-28 from 2017-11-05 07.27.50 by buildguy) : Errors detected!

I try to carry out this interaction of the range, through chatgpt he informed me of another method to carry out, but it does not manage to flow, follow the comment: 'Add a "Row generator" node to the transformation and configure it to generate sequential numbers according to the minimum and maximum values ​​of "MIN_CONTA" and "MAX_CONTA" obtained in the previous step. To do so, create two variables in the "Row generator" node: "start_num" and "end_num", and configure them to receive the minimum and maximum values ​​of "MIN_CONTA" and "MAX_CONTA", respectively. Then, select the "Series" option in the "Row generator" node and configure it to generate a sequence of numbers from "start_num" to "end_num". Finally, add a field in the output of the "Row generator" node with the name "num" and the type "Integer".'

I tried to perform an iteration on the table to get the account range numbers.


Solution

  • With the help of friends, he manages to come up with two solutions:

    1 - Using the recursive union of a common table to iterate the range values, follow the script:

    -- Este script utiliza uma CTE (Common Table Expression) para gerar uma lista de números inteiros.
    -- A CTE começa com os valores mínimos e máximos de uma determinada coluna de uma tabela de origem.
    -- Em seguida, ela se une recursivamente a si mesma, adicionando 1 à cada valor até chegar ao valor máximo.
    -- O resultado é uma tabela com uma única coluna de números inteiros.
    -- O OPTION (MAXRECURSION 0) é adicionado para evitar o erro de recursão máxima atingida. WITH cte AS (
      -- Seleciona os valores mínimos e máximos da coluna de origem e os converte em inteiros.
      SELECT ORDEM, CAST(MIN_CONTA AS INT) AS MIN_CONTA, CAST(MAX_CONTA AS INT) AS MAX_CONTA, CAST(MIN_CONTA AS INT) AS CONTA
      FROM EDW.stg.STG_DRE_CONTA_RAZAO_FAGL_011ZC_SAP   -- Une recursivamente a CTE a si mesma, adicionando 1 à cada valor até chegar ao valor máximo.
      UNION ALL
      SELECT cte.ORDEM, cte.MIN_CONTA, cte.MAX_CONTA, cte.CONTA + 1 AS CONTA
      FROM cte
      WHERE cte.CONTA < cte.MAX_CONTA) -- Seleciona os resultados finais da CTE e os retorna.
    SELECT DISTINCT ORDEM, MIN_CONTA, MAX_CONTA, CONTA
    FROM cte
    OPTION (MAXRECURSION 0);
    

    2 - We use the "Modified Java Script Value" step and then the "Split field to rows" to create the "CONTA" column with the data from the intersected row.

    var COL_MIN = MIN_CONTA;
    var COL_MAX = MAX_CONTA;
    
    
    var CONTA_INTERADORA = [];
      for (var i = COL_MIN; i <= COL_MAX; i++) {
        CONTA_INTERADORA.push(i);
      }
    
    

    Step Javascript

    Split Fields