I have 2 dataframes (purchase and sales) as follows:
PURCHASE:
Name | item | voucher | Amt | Qty |
---|---|---|---|---|
A | Item1 | Purchase | 10000 | 100 |
B | Item2 | Purchase | 500 | 50 |
B | Item1 | Purchase | 2000 | 20 |
C | Item3 | Purchase | 1000 | 100 |
D | Item4 | Purchase | 500 | 100 |
A | Item3 | Purchase | 5000 | 50 |
SALES:
Name | item | voucher | Amt | Qty |
---|---|---|---|---|
A | Item1 | Sales | 5300 | 50 |
B | Item2 | Sales | 450 | 40 |
B | Item1 | Sales | 1675 | 15 |
C | Item3 | Sales | 1800 | 100 |
I want an output dataframe where if the person (Name
) sells an item, the Amt
and Qty
should be deducted from the purchase dataframe and a new dataframe should be created with the remaining Amt
and Qty
as shown below:
OUTPUT DATAFRAME:
Name | item | voucher | Amt | Qty |
---|---|---|---|---|
A | Item1 | Remaining | 4700 | 50 |
A | Item3 | Remaining | 5000 | 50 |
B | Item2 | Remaining | 50 | 10 |
B | Item1 | Remaining | 325 | 5 |
C | Item3 | Remaining | -800 | 0 |
D | Item4 | Remaining | 500 | 100 |
Notice that whatever items have been sold by a person (Name
) has been deducted from the purchase dataframe and the remaining items (Amt
and Qty
) are stored in a new output dataframe. Also person D
never sold any items even then it should be included in the output dataframe.
Thanks in advance!
Dataframe
import pandas as pd
Purchases = {
"Name": ["A", "B", "B", "C", "D", "A"],
"item": ["Item1", "Item2", "Item1", "Item3", "Item4", "Item3"],
"voucher": ["Purchase", "Purchase", "Purchase", "Purchase", "Purchase", "Purchase"],
"Amt": [10000, 500, 2000, 1000, 500, 5000],
"Qty": [100, 50, 20, 100, 100, 50],
}
Purchases = pd.DataFrame(Purchases)
Sales = {
"Name": ["A", "B", "B", "C"],
"item": ["Item1", "Item2", "Item1", "Item3"],
"voucher": ["Sales", "Sales", "Sales", "Sales"],
"Amt": [5300, 450, 1675, 1800],
"Qty": [50, 40, 15, 100],
}
Sales = pd.DataFrame(Sales)
Using good old index alignment:
tmp = Purchases.set_index(['Name', 'item'])
out = (tmp
.sub(Sales.set_index(['Name', 'item'])[['Amt', 'Qty']])
.combine_first(tmp).assign(voucher='Remaining')
.reset_index()[Purchases.columns]
)
Output:
Name item voucher Amt Qty
0 A Item1 Remaining 4700.0 50.0
1 A Item3 Remaining 5000.0 50.0
2 B Item1 Remaining 325.0 5.0
3 B Item2 Remaining 50.0 10.0
4 C Item3 Remaining -800.0 0.0
5 D Item4 Remaining 500.0 100.0