sql google-bigquery

SUM this column


I want to SUM the total amount of each user's payments, but the result differs depending on how I write the query.

I tried to JOIN three tables to list each user_id, the number of surveys the user took part in, and the total amount paid to that user. However, the total amount does not match what I get when I query the payments table on its own.

SELECT
users.user_id,
COUNT(DISTINCT participations.survey_id) AS total_survey,
SUM(amount) AS total_amount
FROM `angket.users` AS users
RIGHT JOIN `angket.participations` AS participations
USING (user_id)
RIGHT JOIN `angket.payments` AS payments
USING (user_id)
GROUP BY 1
ORDER BY 1

with JOIN

SELECT
user_id,
SUM(amount) AS total_amount
FROM `angket.payments` AS payments
LEFT JOIN `angket.users` AS users
USING (user_id)
GROUP BY 1
ORDER BY 1



Solution

  • The answer is relatively easy to see once you compare the sums from the two queries.

    Joining payments to participations on user_id alone produces an erroneous partial Cartesian product: every payment row of a user is repeated once for each participation row of that same user. As a result, the amounts were summed 8, 4 and 3 times in the query that joins with participations.

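    But since you also need a distinct count from participations, use two different groupings in two nested queries: sum the payments per user in a subquery first, then join that result and count the surveys in the outer query: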

    SELECT
      users.user_id
    , COUNT(DISTINCT participations.survey_id) as total_survey
    , total_amount
    FROM angket.users as users
    RIGHT JOIN angket.participations AS participations
    USING (user_id)
    RIGHT JOIN (
      SELECT
        user_id
      , SUM(amount) AS total_amount
      FROM angket.payments 
      GROUP BY user_id
    ) sumpayments
    USING (user_id)
    GROUP BY 1, 3
    ORDER BY 1
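
To see the multiplication effect in isolation, here is a minimal sketch with made-up inline data. The rows below are hypothetical and are not taken from the `angket` tables: one user with two participations and two payments. The CTE name `pay_totals` and the column names `joined_rows` and `inflated_amount` are only illustrative.

```sql
-- Hypothetical sample data: user 1 has 2 participations and 2 payments.
WITH participations AS (
  SELECT 1 AS user_id, 10 AS survey_id UNION ALL
  SELECT 1, 11
),
payments AS (
  SELECT 1 AS user_id, 100 AS amount UNION ALL
  SELECT 1, 50
),
-- Pre-aggregated payments: exactly one row per user, so joining it
-- cannot repeat any amount.
pay_totals AS (
  SELECT user_id, SUM(amount) AS total_amount
  FROM payments
  GROUP BY user_id
)
SELECT
  user_id,
  COUNT(*) AS joined_rows,                    -- 2 participations x 2 payments = 4
  COUNT(DISTINCT survey_id) AS total_survey,  -- 2 (DISTINCT hides the duplication)
  SUM(amount) AS inflated_amount,             -- (100 + 50) x 2 = 300
  total_amount                                -- 150, the real per-user total
FROM participations
JOIN payments USING (user_id)
JOIN pay_totals USING (user_id)
GROUP BY user_id, total_amount;
```

With this data, each payment appears once per participation, so `SUM(amount)` over the joined rows doubles the real total, while the pre-aggregated `total_amount` stays correct. `COUNT(DISTINCT ...)` is not affected by the duplication, which is why only the sum looked wrong in the original query.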