I have the following table which holds market assets and the date they were purchased:
CREATE TABLE ##Holdings
(
HoldingID INT NOT NULL IDENTITY (1, 1) PRIMARY KEY,
Symbol VARCHAR(20) NOT NULL,
PurchaseDate DATE NOT NULL,
Units INT NOT NULL,
PurchasePrice DECIMAL(19, 2) NOT NULL,
SoldDate DATE,
SoldPrice DECIMAL(19,2)
)
I have the following records in this table:
INSERT INTO ##Holdings (Symbol, PurchaseDate, Units, PurchasePrice)
VALUES
('ALL','2014-06-30',100, 5.26),
('ALL','2014-07-01',100, 5.26),
('ALL','2014-07-02',100, 5.26),
('ALL','2014-07-03',100, 5.26),
('ALL','2014-07-04',713, 5.26)
SELECT * FROM ##Holdings
HoldingID | Symbol | PurchaseDate | Units | PurchasePrice | SoldDate | SoldPrice |
---|---|---|---|---|---|---|
1 | ALL | 2014-06-30 | 100 | 5.26 | NULL | NULL |
2 | ALL | 2014-07-01 | 100 | 5.26 | NULL | NULL |
3 | ALL | 2014-07-02 | 100 | 5.26 | NULL | NULL |
4 | ALL | 2014-07-03 | 100 | 5.26 | NULL | NULL |
5 | ALL | 2014-07-04 | 713 | 5.26 | NULL | NULL |
All of these assets were purchased on separate dates but all for the same price and security.
I then want to process a sell order for 320 units of the security ALL
at 7.65
, selling off the oldest holdings first.
After the sell order is processed, the ##Holdings
table should look like this:
SELECT * FROM ##Holdings
HoldingID | Symbol | PurchaseDate | Units | PurchasePrice | SoldDate | SoldPrice |
---|---|---|---|---|---|---|
1 | ALL | 2014-06-30 | 100 | 5.26 | 2015-06-30 | 7.65 |
2 | ALL | 2014-07-01 | 100 | 5.26 | 2015-06-30 | 7.65 |
3 | ALL | 2014-07-02 | 100 | 5.26 | 2015-06-30 | 7.65 |
4 | ALL | 2014-07-03 | 80 | 5.26 | NULL | NULL |
5 | ALL | 2014-07-04 | 713 | 5.26 | NULL | NULL |
6 | ALL | 2014-07-03 | 20 | 5.26 | 2015-06-30 | 7.65 |
Rows 1-3 have been used to fill the sell order. Only 20 units of row 4 were required to fill the remaining units of the order. A new row has been created (row 6) representing the 20 units, while row 4 has had 20 units removed.
Row 5 remains untouched.
I could solve this by using a while loop, altering the records as I go, updating the final row (4) and inserting a new row (6). I would like to know if there is a better and more efficient way to achive this (I'm sure there is). I'm sure this approach would be very ugly when the table has millions of rows.
This article sort of comes close, but not quite what I'm looking for.
SQL Server has had an OUTPUT into clause for a while, but the problem here is we only want to create one new row with rest of the sale. For this reason, one could use so called nested INSERT/MERGE trick:
;WITH sell (Symbol, Date, Units, Price) AS (
SELECT 'ALL', '20150630', 320, 7.65
)
INSERT INTO #Holdings (
Symbol, PurchaseDate, Units, PurchasePrice, SoldDate, SoldPrice
)
SELECT Symbol, PurchaseDate, newRow, PurchasePrice, SellDate, SellPrice
FROM (
MERGE #Holdings AS TARGET
USING (
SELECT t.*, sell.Date AS SellDate, sell.Units AS SellUnits, sell.Price AS SellPrice
, case when aggUnits > sell.Units THEN aggUnits - sell.Units END AS restOnHolding
, case when aggUnits > sell.Units THEN t.Units - (aggUnits - sell.Units) END AS newRow
FROM (
SELECT *
, SUM(Units) OVER(PARTITION BY Symbol ORDER BY PurchaseDate) AS aggUnits
FROM #Holdings t
WHERE solddate IS NULL
) t
INNER JOIN sell
ON sell.Symbol = t.Symbol
AND sell.Units > t.aggUnits - t.Units
) AS source
ON source.HoldingID = TARGET.HoldingID
WHEN MATCHED THEN UPDATE
SET solddate = IIF(restOnHolding IS NULL, source.SellDate, target.SoldDate)
, soldprice = IIF(restOnHolding IS NULL, source.SellPrice, target.SoldPrice)
, units = ISNULL(restOnHolding, target.Units)
output inserted.Symbol, inserted.PurchaseDate, inserted.PurchasePrice, source.SellDate, source.SellPrice
, source.newRow
) x
WHERE x.newRow > 0
Or this (a simpler version which skips the self-joining, but likely on the edge of sql server syntax):
;WITH sell (Symbol, Date, Units, Price) AS (
SELECT 'ALL', '20150630', 320, 7.65
)
, target as (
select *
, SUM(Units) OVER(PARTITION BY Symbol ORDER BY PurchaseDate) AS aggUnits
FROM #Holdings t
WHERE solddate IS NULL
)
INSERT INTO #Holdings (
Symbol, PurchaseDate, Units, PurchasePrice, SoldDate, SoldPrice
)
SELECT Symbol, PurchaseDate, newRow, PurchasePrice, SellDate, SellPrice
FROM (
update target
SET solddate = IIF(case when aggUnits > source.Units THEN aggUnits - source.Units END IS NULL, source.Date, target.SoldDate)
, soldprice = IIF(case when aggUnits > source.Units THEN aggUnits - source.Units END IS NULL, source.Price, target.SoldPrice)
, units = ISNULL(case when aggUnits > source.Units THEN aggUnits - source.Units END, target.Units)
output inserted.Symbol, inserted.PurchaseDate, inserted.PurchasePrice, source.Date as SellDate, source.Price as SellPrice
, case when deleted.aggUnits > source.Units THEN deleted.Units - (deleted.aggUnits - source.Units) END as NewRow
from target
inner join sell source
ON source.Symbol = target.Symbol
AND source.Units > target.aggUnits - target.Units
) x
WHERE x.newRow > 0
To find which rows to update, I create a running SUM aggregate, and get all rows until we pass the needed total value.
Then, the source subquery will be containing the rows which will be affected by the update (the one that sets sell price, date, and rest of the units).
Merge outputs some info like rows updated and mostly importantly, the newRow
column, containing how much we should transfer to the new row.
Finally, the nested insert has a newRow > 0
condition that filters out the other UPDATEd rows which we aren't interested in.
The final result becomes:
HoldingID | Symbol | PurchaseDate | Units | PurchasePrice | SoldDate | SoldPrice |
---|---|---|---|---|---|---|
1 | ALL | 2014-06-30 | 100 | 5.26 | 2015-06-30 | 7.65 |
2 | ALL | 2014-07-01 | 100 | 5.26 | 2015-06-30 | 7.65 |
3 | ALL | 2014-07-02 | 100 | 5.26 | 2015-06-30 | 7.65 |
4 | ALL | 2014-07-03 | 80 | 5.26 | NULL | NULL |
5 | ALL | 2014-07-04 | 713 | 5.26 | NULL | NULL |
6 | ALL | 2014-07-03 | 20 | 5.26 | 2015-06-30 | 7.65 |
Disclaimer: in real code i would probably make this into a simple temp table solution and skip the nested trick.