I want to create a calculated field to use with the rpivotTable
package, similar to the functionality seen in excel.
For instance, consider the following table:
+--------------+--------+---------+-------------+-----------------+
| Manufacturer | Vendor | Shipper | Total Units | Defective Units |
+--------------+--------+---------+-------------+-----------------+
| A | P | X | 173247 | 34649 |
| A | P | Y | 451598 | 225799 |
| A | P | Z | 759695 | 463414 |
| A | Q | X | 358040 | 225565 |
| A | Q | Y | 102068 | 36744 |
| A | Q | Z | 994961 | 228841 |
| A | R | X | 454672 | 231883 |
| A | R | Y | 275994 | 124197 |
| A | R | Z | 691100 | 165864 |
| B | P | X | 755594 | 302238 |
| . | . | . | . | . |
| . | . | . | . | . |
+--------------+--------+---------+-------------+-----------------+
(my actual table has many more columns, both dimensions and measures, time, etc. and I need to define multiple such "calculated columns")
If I want to calculate defect rate (which would be Defective Units/Total Units
) and I want to aggregate by either of the first three columns, I'm not able to.
I tried assignment by reference (:=
), but that still didn't seem to work and summed up defect rates (i.e., sum(Defective_Units/Total_Units)
), instead of sum(Defective_Units)/sum(Total_Units)
:
myData[, Defect.Rate := Defective_Units / Total_Units]
This ended up giving my defect rates greater than 1. Is there anywhere I can declare a calculated field, which is just a formula evaluated post aggregation?
You're lucky - the creator of pivottable.js foresaw cases like yours (and mine, earlier today) by implementing an aggregator called "Sum over Sum" and a few more, likewise, cf. https://github.com/nicolaskruchten/pivottable/blob/master/src/pivot.coffee#L111 and https://github.com/nicolaskruchten/pivottable/blob/master/src/pivot.coffee#L169.
So we'll use "Sum over Sum" as parameter "aggregatorName", and the columns whose quotient we want in the "vals" parameter.
Here's a meaningless usage example from the mtcars data for reproducibility:
require(rpivotTable)
data(mtcars)
rpivotTable(mtcars,rows="gear", cols=c("cyl","carb"),
aggregatorName = "Sum over Sum",
vals =c("mpg","disp"),
width="100%", height="400px")