sqlsql-servert-sql

How to SUM by parent table value while ignoring child table cardinality?


I am looking for a way to extract data with a single query and answering to two fundamental questions:

Suppose the following tables:

CREATE TABLE Invoice (
  Id        VARCHAR(3) PRIMARY KEY,
  Pax       INT NOT NULL
)

CREATE TABLE InvoiceLine (
  Id        VARCHAR(3) PRIMARY KEY,
  InvoiceId VARCHAR(3) NOT NULL,
  Activity  VARCHAR(80) NOT NULL,
  Amount    DECIMAL(19,4) NOT NULL,
  CONSTRAINT FK_InvoiceLine_InvoiceId FOREIGN KEY (InvoiceId) REFERENCES Invoice (Id)
)

And the following datas:

Invoice Table:

Id Pax
001 2
002 2
003 4

InvoiceLine Table:

Id InvoiceId Activity Amount
001 001 Ticketing 40.00
002 001 Reservation 10.00
003 002 Ticketing 30.00
004 002 Ticketing 30.00
005 002 Reservation 8.00
006 002 Insurance 6.60
007 003 Reservation 120.00
008 003 Reservation 40.00
    INSERT INTO Invoice VALUES ('001', 2);
    INSERT INTO Invoice VALUES ('002', 2);
    INSERT INTO Invoice VALUES ('003', 4);
    INSERT INTO InvoiceLine VALUES ('001', '001', 'Ticketing',   40);
    INSERT INTO InvoiceLine VALUES ('002', '001', 'Reservation', 10);
    INSERT INTO InvoiceLine VALUES ('003', '002', 'Ticketing',   30);
    INSERT INTO InvoiceLine VALUES ('004', '002', 'Ticketing',   30);
    INSERT INTO InvoiceLine VALUES ('005', '002', 'Reservation',  8);
    INSERT INTO InvoiceLine VALUES ('006', '002', 'Insurance',    6.6);
    INSERT INTO InvoiceLine VALUES ('007', '003', 'Ticketing',  120);
    INSERT INTO InvoiceLine VALUES ('008', '003', 'Ticketing',   40);

My goal is to have a query to retrieve (in the same query):

  1. Amount: sum of amount per activity
  2. Pax: number of pax per activity
  3. GlobalPax: number of pax served globally (sum = 8)

I have found a way to obtain the correct result, but it requires to group by i.Id, which increase the number of rows returned (in my case I have millions of rows).

SELECT
  i.Id,
  il.Activity,
  SUM(il.Amount) AS Amount,
  AVG(i.Pax) AS Pax,
  AVG(CAST(i.Pax AS float)) / COUNT(*) OVER (PARTITION BY i.Id) AS GlobalPax
FROM
  Invoice AS i
  JOIN InvoiceLine AS il ON i.Id = il.InvoiceId
GROUP BY
  i.Id,
  il.Activity

Which gives me:

Id Activity Amount Pax GlobalPax
001 Reservation 10.0000 2 1
001 Ticketing 40.0000 2 1
002 Insurance 6.6000 2 0.666666666666667
002 Reservation 8.0000 2 0.666666666666667
002 Ticketing 60.0000 2 0.666666666666667
003 Ticketing 160.0000 4 4

The results is almost perfect, but I don't need the Id column (required for the PARTITION BY clause).

A perfect result could be:

Activity Amount Pax GlobalPax
Insurance 6.6000 2 0.666666666666667
Reservation 18.0000 4 1.666666666666667
Ticketing 260.0000 8 5.666666666666667

Thanks


Solution

  • You should apply next grouping by Activity, if don't need the Id column.

    See example

    select Activity
      ,sum(Amount) as Amount
      ,sum(Pax) as Pax
      ,sum(GlobalPax) as GlobalPax
    from( -- your query
    SELECT
      i.Id,
      il.Activity,
      SUM(il.Amount) AS Amount,
      AVG(i.Pax) AS Pax,
      AVG(CAST(i.Pax AS float)) / COUNT(*) OVER (PARTITION BY i.Id) AS GlobalPax
    FROM
      Invoice AS i
      JOIN InvoiceLine AS il ON i.Id = il.InvoiceId
    GROUP BY i.Id, il.Activity
    )a
    group by Activity
    
    Activity Amount Pax GlobalPax
    Insurance 6.6000 2 0.666666666666667
    Reservation 18.0000 4 1.66666666666667
    Ticketing 260.0000 8 5.66666666666667

    fiddle