sqlsnowflake-cloud-data-platform

Join or union for aggregated queries from 2 tables


I am looking to capture error rates in production for each employee. There is a table that collects their operator ID, what the error was and how many. For different errors on the same day, a new row is added to the table. Here is a sample of the error table and my query to sum total errors for the past 30 days. :

ERROR_ID, REDO_OPR, OPR, REDO_DESC, PROUDCT_ID, REDO_COUNT
1, 345, 123, Mistake 1, X123456, 1
2, 345, 456, Mistake 1, X123457, 1
3, 345, 123, Mistake 2, X123459, 1
4, 345, 456, Mechanical Er, X123500, 1
  SELECT OPR as id, COUNT(REDO_COUNT) AS errors
    FROM ERROR_TABLE
   WHERE REDO_DESC IN ('Mistake 1', 'Mistake 2', 'Mistake 3')
     --AND REDO_DATE >= DATEADD(day, -30, getdate())
GROUP BY OPR

Result

id, errors
123, 2
456, 1

To get the error rate I count the amount of times they are recorded performing the task in a table that tracks one units production. Here is a sample of the data and the query used. Each process records the operator id.

PRODUCT_ID, START_DATE, PROCESS_1_OPR, PROCESS_2_OPR, PROCESS_3_OPR
X123456, 2024-01-01, 987, 123, 345 
X123457, 2024-01-01, 987, 456, 345
X123458, 2024-01-02, 987, 456, 345
X123459, 2024-01-02, 987, 123, 345
X123500, 2024-01-02, 987, 123, 345
  SELECT PROCESS_2_OPR AS id, COUNT(PROCESS_1_OPR) as Total
    FROM PRODUCTION_TABLE 
   WHERE START_DATE >= DATEADD(day, -30, getdate())
GROUP BY PROCESS_2_OPR

Result

id, Total
123, 3
456, 2

My want is a table that would show

OPR, Errors, Total, Error_Rate
123, 3, 3, 100%
456, 1, 2, 50%

I know how to query error rate using round() but the data being returned is not accurate.

   SELECT t1.OPR, REDO_COUNT, REDO_DESC
     FROM PRODUCTION_TABLE AS t1
LEFT JOIN ERROR_TABLE AS t2
       ON t1.PRODUCT_ID = t2.PRODUCT_ID
    WHERE t1.OPR = '123'
      AND t1.START_DATE >= DATEADD(day, -30, getdate())

I am able to get every instance the OPR was recorded in a process and every time an Error was recorded. The issue is filtering and producing the aggregate.

If I filter to just return mistakes 1, 2, 3, I lose any NULL error values. I also seem to run into errors trying to count process and sum errors. Any advice would be helpful.


Solution

  • with opr_errors as(
        select 
            opr id, 
            count(redo_count) errors
        from error_table
        where (redo_desc in ('Mistake 1', 'Mistake 2', 'Mistake 3')
            or redo_desc is null)
            and redo_date >= dateadd(day, -30, getdate()))
        group by opr
    ),
    opr_total as(
        select 
            process_1_opr id,
            count(process_1_opr) total
        from production_table 
        where start_date >= dateadd(day, -30, getdate()))
        group by process_1_opr
    )
    select
        a.id,
        errors,
        total,
        to_char(round(errors/total * 100,2)) || '%' error_rate
    from opr_total a
    left join opr_errors b on a.id = b.id