sqlpostgresqldistributionbuckets

Allocating quantities to different buckets using SQL


I am trying to put together some SQL code that is able to distribute a certain quantity among buckets with a limited capacity for this quantity in a specific order but in separate transactions. See below for example.

I have 2 buckets, each with a certain capacity (integer):

bucket       capacity
1            100
2            50

I have certain quantities to be bucketed in a number of transactions:

transaction   quantity
1             50
2             60
3             20
4             40

I want the following result after running the SQL code, which should keep the transaction number and tell me how much each bucket was able to hold of that quantity. The buckets have to be filled in order of the bucket number, and in order of transaction number:

transaction   quantity_bucketed   bucket_id   overage
1             50                  1           0
2             50                  1           0
2             10                  2           0
3             20                  2           0
4             20                  2           20

If there are no more buckets, and there is still a quantity to be bucketed, it should go to the "overage" column as in the example above.


Solution

  • DROP TABLE IF EXISTS buckets;
    CREATE TABLE buckets (
        bucket_id bigserial primary key,
        capacity integer);
    
    -- something to put in the buckets
    DROP TABLE IF EXISTS transactions;
    CREATE TABLE transactions (
        transaction_id bigserial primary key,
        quantity integer);
    
    -- create 2 buckets with different capacities
    INSERT INTO buckets (capacity)
    VALUES (100),(50);
    
    -- create some traffic to put in the buckets
    INSERT INTO transactions (quantity)
    VALUES (50),(60),(20),(40);
    
    
    WITH buckets AS (
        -- expand buckets (create a row per bucket capacity)
        SELECT row_number() OVER () bucket_row_id, *
        FROM (
            -- slot = a unit of capacity
            SELECT *, generate_series(1,b.capacity) slot 
            FROM buckets b
        ) useless_alias
    ), xact AS (
        -- expand transactions, creating an id per unit of quantity
        SELECT row_number() OVER () unit_row_id, *
        FROM (
            -- an item per transaction quantity
            SELECT *, generate_series(1,t.quantity) unit 
            FROM transactions t
        ) useless_alias
    ), filled AS (
        -- join buckets to transactions on slots=units
        --   slots with no units = wasted bucket capacity
        --   units with no slots = overage
        SELECT b.*, x.*
        FROM xact x
        FULL JOIN buckets b
        ON b.bucket_row_id = x.unit_row_id
    )
    -- finally, do the do
    SELECT transaction_id, CASE WHEN bucket_id IS NULL THEN 'overage' ELSE bucket_id::text END bucket_id , count(unit_row_id) quantity_bucketed
    FROM filled
    GROUP BY 1,2
    ORDER BY 1,2
    

    Caveat: I didn't try to make an extra column out of the "overage". When filling buckets, it's kinda irrelevant which bucket the overage didn't fit into. In the example case, only 1 transaction had overage. I'm assuming in your real use case that if there were more transactions, you'd really like to see quantities per transaction not bucketed.